
Kubernetes plugin crashes with docker plugin .inside command

    • kubernetes-1.22.4

      I have the following Jenkinsfile:

       

              stage('build') {
                  container('docker') {
                      def tag = "${env.JOB_NAME}:${env.BUILD_ID}".toLowerCase()
                      devTools = docker.build(tag, "--pull -f .docker/dev-tools/Dockerfile .")
                      devTools.inside() {
                          sh 'make -f Makefile.test -j$(nproc) build-test'
                      }
                  }
              }
      
      

      It works as expected with kubernetes plugin v1.22.0 (previous stable) and docker pipeline plugin v1.21 (current stable).

      Today I installed a newer kubernetes plugin v1.22.1 and my jobs started failing with:

      00:03:31.157  + docker inspect -f . <redacted>
      00:03:31.157  .
      00:03:31.185  [Pipeline] withDockerContainer
      00:03:31.198  [Pipeline] // withDockerContainer
      00:03:31.208  [Pipeline] }
      00:03:31.234  [Pipeline] // container
      00:03:31.251  [Pipeline] }
      00:03:31.282  [Pipeline] // stage
      00:03:31.303  [Pipeline] }
      00:03:31.347  [Pipeline] // node
      00:03:31.372  [Pipeline] }
      00:03:31.398  [Pipeline] // podTemplate
      00:03:31.430  [Pipeline] }
      00:03:31.457  [Pipeline] // timeout
      00:03:31.496  [Pipeline] End of Pipeline
      00:03:31.526  java.lang.NullPointerException
      00:03:31.526  	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.launch(ContainerExecDecorator.java:267)
      00:03:31.526  	at hudson.Launcher$ProcStarter.start(Launcher.java:455)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.launch(DockerClient.java:301)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.launch(DockerClient.java:282)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.launch(DockerClient.java:279)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.version(DockerClient.java:251)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.WithContainerStep$Execution.start(WithContainerStep.java:148)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:286)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:179)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
      00:03:31.526  	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
      00:03:31.526  	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      00:03:31.526  	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      00:03:31.526  	at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(Docker.groovy:126)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.Docker.node(Docker.groovy:66)
      00:03:31.526  	at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(Docker.groovy:114)
      00:03:31.526  	at WorkflowScript.run(WorkflowScript:26)
      00:03:31.526  	at ___cps.transform___(Native Method)
      00:03:31.526  	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
      00:03:31.526  	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
      00:03:31.526  	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
      00:03:31.526  	at sun.reflect.GeneratedMethodAccessor234.invoke(Unknown Source)
      00:03:31.526  	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      00:03:31.526  	at java.lang.reflect.Method.invoke(Method.java:498)
      00:03:31.526  	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      00:03:31.526  	at com.cloudbees.groovy.cps.impl.ClosureBlock.eval(ClosureBlock.java:46)
      00:03:31.526  	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
      00:03:31.526  	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
      00:03:31.526  	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
      00:03:31.526  	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
      00:03:31.526  	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
      00:03:31.526  	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:405)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:317)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:281)
      00:03:31.526  	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
      00:03:31.526  	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      00:03:31.526  	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
      00:03:31.526  	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      00:03:31.526  	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      00:03:31.526  	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      00:03:31.526  	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      00:03:31.526  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      00:03:31.526  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      00:03:31.526  	at java.lang.Thread.run(Thread.java:748)
      

          [JENKINS-60517] Kubernetes plugin crashes with docker plugin .inside command

          Ivan Kurnosov added a comment -

          I have created reproduction: https://github.com/zerkms/kubernetes-jenkins-issue-60517

          As you can see there, as long as `jnlp` is present in `containers` (see the `yaml`), it breaks with that exception.

          From a diff of the 2 generated manifests, the largest difference is that when it breaks, EVERY container has `workingDir: /home/jenkins/agent`.

          I'm not sure if every container needs it, but nevertheless, that is literally the only property that changes (it is being added).
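          For illustration, a pod template of the shape described above (an explicit `jnlp` entry alongside a docker-in-docker container) might look like the sketch below. This is an assumption about the setup rather than the contents of the reproduction repo; the image names and tags are placeholders.

            // Scripted pipeline sketch: a pod spec that lists `jnlp` explicitly in `containers`.
            podTemplate(yaml: '''
                apiVersion: v1
                kind: Pod
                spec:
                  containers:
                  - name: jnlp
                    image: jenkins/jnlp-slave:latest     # explicit jnlp container, as in the reproduction
                  - name: docker
                    image: docker:19.03-dind             # placeholder docker-in-docker image
                    securityContext:
                      privileged: true
            ''') {
                node(POD_LABEL) {
                    // the stage('build') / container('docker') block from the Jenkinsfile above goes here
                }
            }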


          Allan BURDAJEWICZ added a comment -

          > From a diff of the 2 generated manifests, the largest difference is that when it breaks, EVERY container has `workingDir: /home/jenkins/agent`

          Was the workingDir different with 1.22.0? The fact that /home/jenkins/agent is the same across all containers is not a problem as far as I understand. But I think you are right: this bug may be caused by JENKINS-58975. That containerWorkingDirFilePath (derived from starter.pwd()) may be null.
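          To make that concrete, here is a schematic sketch (an assumption about the mechanism, not the plugin's actual code) of how a null working directory would turn into the NPE seen in the trace above:

            import hudson.FilePath
            import hudson.Launcher

            // ProcStarter.pwd() returns the working directory configured for the process and can
            // legitimately be null when none was set, e.g. for a process aimed at a container
            // created outside the Kubernetes pod.
            String containerWorkingDir(Launcher.ProcStarter starter) {
                FilePath containerWorkingDirFilePath = starter.pwd()    // may be null here
                return containerWorkingDirFilePath.getRemote()          // NullPointerException when it is
            }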


          Allan BURDAJEWICZ added a comment - edited

          I am curious: as a workaround, does it make any difference if you explicitly set workingDir: "/home/jenkins/agent" on your container templates? I do think it would pass the first condition and not fail on the second one (NPE), but I might be wrong.
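          For reference, a minimal sketch of that workaround with containerTemplate (container names and images are placeholders; the relevant part is the explicit workingDir):

            podTemplate(containers: [
                containerTemplate(name: 'jnlp', image: 'jenkins/jnlp-slave:latest',
                                  workingDir: '/home/jenkins/agent'),
                containerTemplate(name: 'docker', image: 'docker:19.03-dind', privileged: true,
                                  workingDir: '/home/jenkins/agent')
            ]) {
                node(POD_LABEL) {
                    // docker.build(...) and .inside { ... } steps as before
                }
            }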


          Ivan Kurnosov added a comment -

          allan_burdajewicz

          > Was the workingDir different with 1.22.0?

          It's hard to tell; I was only investigating the broken version. At the moment I've already left the office for my Christmas break and won't be able to re-check it until mid-January.

          I also had the same idea to try setting `workingDir` manually, but the debugging session took a bit more time than I expected and I simply forgot about it along the way.


          Narayanan Singaram added a comment -

          zerkms allan_burdajewicz

          Here is my understanding / observation:

          The devTools = docker.build(tag, "--pull -f .docker/dev-tools/Dockerfile .") command builds a docker image; the image is built inside the docker-dind container. The subsequent sh command inside the devTools.inside() block gets executed inside a dynamically created container inside the docker-dind container. The NPE happens during this shell execution inside the dynamically created container. Because this shell command is executed within a container created outside of the Kubernetes pod, the container working directory ends up being set to null.

          Added a null check to avoid the NPE, added a unit test covering this usage, and raised a PR.

          PR: https://github.com/jenkinsci/kubernetes-plugin/pull/671
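          For illustration only (the actual change is in the PR above), the guard amounts to not dereferencing a null working directory:

            import hudson.FilePath
            import hudson.Launcher

            // Hypothetical helper: resolve the working directory only when one was actually set,
            // otherwise fall back to a caller-supplied default instead of throwing an NPE.
            String resolveContainerWorkingDir(Launcher.ProcStarter starter, String defaultWorkingDir) {
                FilePath pwd = starter.pwd()
                return pwd != null ? pwd.getRemote() : defaultWorkingDir
            }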


            Assignee: Narayanan Singaram (narayanan)
            Reporter: Ivan Kurnosov (zerkms)
            Votes: 0
            Watchers: 5
