Jenkins / JENKINS-63414

Global Docker agent breaks nested agent usage


    Details


      Description

      pipeline {
          agent none
          stages {
              stage('parent stage') {
                  agent {
                      docker {
                          image 'ubuntu:bionic'
                      }
                  }
                  stages {
                      stage('inherited agent') {
                          steps {
                              sh 'uname -a'
                          }
                      }
                      stage('explicit agent') {
                          agent {
                              node {
                                  label 'master'
                              }
                          }
                          steps {
                              sh 'uname -a'
                          }
                      }
                  }
              }
          }
      }
      

      The above pipeline results in the following output:

      Started by user unknown or anonymous
      Running in Durability level: MAX_SURVIVABILITY
      [Pipeline] Start of Pipeline
      [Pipeline] node
      Running on Jenkins in /var/lib/jenkins/workspace/docker-durable-bug
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (parent stage)
      [Pipeline] getContext
      [Pipeline] isUnix
      [Pipeline] sh
      + docker inspect -f . ubuntu:bionic
      .
      [Pipeline] withDockerContainer
      Jenkins does not seem to be running inside a container
      $ docker run -t -d -u 982:982 -w /var/lib/jenkins/workspace/docker-durable-bug -v /var/lib/jenkins/workspace/docker-durable-bug:/var/lib/jenkins/workspace/docker-durable-bug:rw,z -v /var/lib/jenkins/workspace/docker-durable-bug@tmp:/var/lib/jenkins/workspace/docker-durable-bug@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** ubuntu:bionic cat
      $ docker top c84a4a643ed58929a86d80300821f04249f3e882de21c190043ac475b43eb3f6 -eo pid,comm
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (inherited agent)
      [Pipeline] sh
      + uname -a
      Linux c84a4a643ed5 5.8.1-arch1-1 #1 SMP PREEMPT Wed, 12 Aug 2020 18:50:43 +0000 x86_64 x86_64 x86_64 GNU/Linux
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] stage
      [Pipeline] { (explicit agent)
      [Pipeline] node
      Running on Jenkins in /var/lib/jenkins/workspace/docker-durable-bug@2
      [Pipeline] {
      [Pipeline] sh
      process apparently never started in /var/lib/jenkins/workspace/docker-durable-bug@2@tmp/durable-4a02313c
      (running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] }
      $ docker stop --time=1 c84a4a643ed58929a86d80300821f04249f3e882de21c190043ac475b43eb3f6
      $ docker rm -f c84a4a643ed58929a86d80300821f04249f3e882de21c190043ac475b43eb3f6
      [Pipeline] // withDockerContainer
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] End of Pipeline
      ERROR: script returned exit code -2
      Finished: FAILURE
      

      If I use a regular agent rather than a docker one, there's no problem.
      The above example was run in a clean environment explicitly prepared to reproduce the issue.
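      One way to sidestep the breakage, sketched below and untested against this exact setup, is to avoid nesting a `node` agent inside a `docker` agent: declare each stage's agent at the top level so the explicit agent never runs inside the docker wrapper. This changes the structure of the original pipeline but keeps the same steps.

      ```groovy
      // Hypothetical workaround sketch: no nested stages, so the second
      // stage's node agent is never wrapped by the outer docker context.
      pipeline {
          agent none
          stages {
              stage('docker stage') {
                  agent {
                      docker {
                          image 'ubuntu:bionic'
                      }
                  }
                  steps {
                      sh 'uname -a' // runs inside the ubuntu:bionic container
                  }
              }
              stage('plain stage') {
                  agent {
                      node {
                          label 'master'
                      }
                  }
                  steps {
                      sh 'uname -a' // runs directly on the node; no docker involved
                  }
              }
          }
      }
      ```

      The trade-off is that the stages no longer share a workspace automatically; `stash`/`unstash` would be needed to pass files between them.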

        Attachments

          Activity

          Ruben Sancho Ramos added a comment -

          I'm having the same issue; this is my pipeline:

           

          pipeline {
            agent {
              kubernetes {
                yamlFile 'agent_definition.yml'
                idleMinutes 5
              }
            }
            stages {
              stage('List Git Repo'){
                steps {
                  sh 'echo hola'
                  container('awsclislave') {
                    sh '''
                      . ./aws_auth.sh
                      aws s3 ls
                      aws sts get-caller-identity
                    '''
                  }
                }
              }
            }
          }

          It works fine on the first sh 'echo hola', but not on the second block. If I enable
          -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true I get this:

          [Pipeline] // stage
          [Pipeline] withEnv
          [Pipeline] {
          [Pipeline] stage
          [Pipeline] { (List Git Repo)
          [Pipeline] sh
          + echo hola
          hola
          [Pipeline] container
          [Pipeline] {
          [Pipeline] sh
          sh: 1: cd: can't cd to /home/jenkins/agent/workspace/_jenkins_triggering_tests_master
          sh: 1: cannot create /home/jenkins/agent/workspace/_jenkins_triggering_tests_master@tmp/durable-9413a0fc/jenkins-log.txt: Directory nonexistent
          sh: 1: cannot create /home/jenkins/agent/workspace/_jenkins_triggering_tests_master@tmp/durable-9413a0fc/jenkins-result.txt.tmp: Directory nonexistent
          mv: cannot stat '/home/jenkins/agent/workspace/_jenkins_triggering_tests_master@tmp/durable-9413a0fc/jenkins-result.txt.tmp': No such file or directory
          process apparently never started in /home/jenkins/agent/workspace/_jenkins_triggering_tests_master@tmp/durable-9413a0fc
          [Pipeline] }
          [Pipeline] // container
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // withEnv
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          [Pipeline] // podTemplate
          [Pipeline] End of Pipeline
          

          I logged into the running container and saw that the files are there, but in a different location:

           

          $ cd /home/jenkins/workspace
          $ pwd
          /home/jenkins/workspace
          $ ls
          _jenkins_triggering_tests_master  _jenkins_triggering_tests_master@tmp workspaces.txt
          
          

          The path should be /home/jenkins/workspace/ instead of /home/jenkins/agent/workspace/.

           

          Ruben Sancho Ramos added a comment -

          Found the solution for my issue:
          https://superuser.com/questions/1459174/jenkins-pipeline-sh-step-hangs

          Changing the "workingDir" to /home/jenkins/agent fixed it!
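          A hedged sketch of what that fix can look like in a declarative pipeline. The container name, image, and YAML layout here are illustrative placeholders, not taken from the original report; the relevant line is `workingDir`, which aligns the container's working directory with the workspace root the Kubernetes plugin mounts.

          ```groovy
          // Sketch (placeholder names): set workingDir so the container's cwd
          // matches the /home/jenkins/agent workspace mount used by the plugin.
          pipeline {
            agent {
              kubernetes {
                yaml '''
          apiVersion: v1
          kind: Pod
          spec:
            containers:
            - name: awsclislave           # placeholder container name
              image: amazon/aws-cli       # placeholder image
              command:
              - sleep
              args:
              - infinity
              workingDir: /home/jenkins/agent   # the fix: match the agent workspace root
          '''
              }
            }
            stages {
              stage('List Git Repo') {
                steps {
                  container('awsclislave') {
                    sh 'pwd' // should now resolve under /home/jenkins/agent/workspace/...
                  }
                }
              }
            }
          }
          ```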

          Hitesh Kumar added a comment -

          Hi Bogomil Vasilev 

          I am using Jenkins version 2.249.1 with Durable Task plugin 1.35.

          We run builds in a Kubernetes farm and our builds are dockerised. After upgrading Jenkins and the durable-task plugin we saw multiple issues where "sh" initialisation inside the container breaks. We applied the solution suggested above and set workingDir: "/home/jenkins/agent", and builds succeed after making the change.

          However, some builds still fail randomly with the same error:

          [2021-05-10T15:16:33.046Z] [Pipeline] sh
          [2021-05-10T15:22:08.073Z] process apparently never started in /home/jenkins/agent/workspace/CORE-CommitStage@tmp/durable-f6a728e7
          [2021-05-10T15:22:08.087Z] [Pipeline] }

          We had also already enabled, as suggested:
          -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true

          The issue is intermittent, but jobs still fail randomly. We are looking for a permanent fix.

          Serdar added a comment -

          Hi There, 

          Jenkins version: Jenkins 2.277.2

          Durable task: 1.35

          Node: SSH agent

          Agent root directory: /home/<user>/jenkins

          -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true

          We run parallel builds in Docker to verify our codebase on multiple platforms, but we see random failures as listed below.

          I reviewed the durable-task source code, and it seems like it is failing because the durable-2b3ef7ef directory does not exist.

           

          [2021-05-24T18:14:59.321Z] process apparently never started in /home/<user>/jenkins/workspace/lumpdk-coverage_PR-4785@tmp/durable-2b3ef7ef
          [2021-05-24T18:15:01.657Z] sh: 1: cannot create /home/<user>/jenkins/workspace/lumpdk-coverage_PR-4785@tmp/durable-2b3ef7ef/jenkins-result.txt.tmp: Directory nonexistent
          [2021-05-24T18:15:01.659Z] mv: cannot stat '/home/<user>/jenkins/workspace/lumpdk-coverage_PR-4785@tmp/durable-2b3ef7ef/jenkins-result.txt.tmp': No such file or directory
          script returned exit code -2

           

          Mark Waite added a comment - edited

          I see the same behavior as is described by Bogomil Vasilev. Defining the outer agent as a docker agent seems to affect the inner agent, even if the inner agent is unrelated to docker.

          I'm not sure if Pipeline ever officially supported replacing an agent definition in a nested stage when an outer stage has defined an agent. In this case, it appears that the outer agent definition is affecting the replacement agent in the nested stage. I've never replaced an outer agent definition with an inner agent definition and have never seen it referenced in any of the Jenkins documentation. That doesn't mean it is not valid, just that I've never seen it.

          My environment only has docker available on agents with the label 'docker'. The below job definition incorrectly attempts to invoke the command docker on the nested agent definition that will use the label 'windows' with Durable Task plugin 1.39 and Docker Pipeline plugin 1.26. None of my 'windows' agents have the command 'docker' or the label 'docker'.

          pipeline {
              agent none
              stages {
                  stage('parent stage') {
                      agent {
                          docker {
                              image 'ubuntu:bionic'
                              label 'docker'
                          }
                      }
                      stages {
                          stage('inherited agent') {
                              steps {
                                  sh 'hostname;uname -a' // run inside a docker container
                              }
                          }
                          stage('explicit agent') {
                              agent {
                                  node {
                                      label 'windows'
                                  }
                              }
                              steps {
                                  bat 'echo %PATH%' // run without docker, yet docker command is called
                              }
                          }
                      }
                  }
              }
          }
          

          If I avoid the nested agent definition, the job behaves correctly.

          If the outer agent definition is a simple label (not docker), the job behaves correctly. I assume that is because there is less initialization code for a labeled agent, while there is special initialization code for a docker agent. That's just me making an assumption.
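          A minimal sketch of that label-only variant, assuming the same 'docker' and 'windows' labels as in the job definition above; with a plain label as the outer agent, the nested 'windows' agent is not wrapped in any docker initialization.

          ```groovy
          // Sketch: outer agent is a plain label, so no docker setup code runs
          // and the nested 'windows' agent behaves correctly.
          pipeline {
              agent none
              stages {
                  stage('parent stage') {
                      agent {
                          label 'docker'
                      }
                      stages {
                          stage('inherited agent') {
                              steps {
                                  sh 'hostname;uname -a' // runs directly on the 'docker'-labeled node
                              }
                          }
                          stage('explicit agent') {
                              agent {
                                  node {
                                      label 'windows'
                                  }
                              }
                              steps {
                                  bat 'echo %PATH%' // no docker command is invoked here
                              }
                          }
                      }
                  }
              }
          }
          ```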

          I don't have access to a kubernetes cluster, but I assume the same type of failure would happen if the outer agent were a kubernetes agent and the nested agent were a non-kubernetes agent.


            People

            Assignee:
            Unassigned
            Reporter:
            Bogomil Vasilev (smirky)
            Votes:
            1
            Watchers:
            6
