JENKINS-50179

Parallel step with failFast set to false: parallel branch 'foo' kills other parallel branch 'bar' when 'foo' times out.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major

      When I run parallel branches with failFast set to false, parallel branch 'foo' kills the other parallel branch 'bar' when 'foo' times out.
      When I use try/catch in 'foo', the exception is not handled and 'bar' still gets killed.
      In both cases, 'bar' is killed.

      Code snippet without try/catch:

      pipeline {
          agent any
          options {
              timestamps()
              timeout(time: 30, unit: 'MINUTES')
          }
          stages {
              stage('Setup'){
                  steps{
                      script{
                          parallel 'foo': {
                              timeout(time: 60, unit: 'SECONDS'){
                                  sh "sleep 90"
                              }
                              echo "Done with foo"
                          },
                          'bar': {
                              sh "sleep 200"
                              echo "Done with bar"
                          },
                          failFast: false
                      }
                  }
              }
          }
      }
      

      Snippet of the console log without try/catch block (you can see that 'bar' is getting killed):

      [Pipeline] node
      Running on Jenkins in /Users/Shared/Jenkins/Home/workspace/dummy
      [Pipeline] {
      [Pipeline] timestamps
      [Pipeline] {
      [Pipeline] timeout
      16:28:05 Timeout set to expire in 30 min
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (Setup)
      [Pipeline] script
      [Pipeline] {
      [Pipeline] parallel
      [Pipeline] [foo] { (Branch: foo)
      [Pipeline] [bar] { (Branch: bar)
      [Pipeline] [foo] timeout
      16:28:05 [foo] Timeout set to expire in 1 min 0 sec
      [Pipeline] [foo] {
      [Pipeline] [bar] sh
      16:28:05 [bar] [dummy] Running shell script
      [Pipeline] [foo] sh
      16:28:05 [bar] + sleep 200
      16:28:05 [foo] [dummy] Running shell script
      16:28:05 [foo] + sleep 90
      16:29:05 [foo] Cancelling nested steps due to timeout
      16:29:05 [bar] sh: line 1:  9520 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-920b4cdf/script.sh' > '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-920b4cdf/jenkins-log.txt' 2>&1
      16:29:05 [foo] Sending interrupt signal to process
      16:29:05 [foo] sh: line 1:  9526 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-9c5ebea8/script.sh' > '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-9c5ebea8/jenkins-log.txt' 2>&1
      [Pipeline] [bar] }
      16:29:14 [bar] Failed in branch bar
      [Pipeline] [foo] }
      [Pipeline] [foo] // timeout
      [Pipeline] [foo] }
      16:29:14 [foo] Failed in branch foo
      [Pipeline] // parallel
      [Pipeline] }
      [Pipeline] // script
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] }
      [Pipeline] // timeout
      [Pipeline] }
      [Pipeline] // timestamps
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] End of Pipeline
      ERROR: script returned exit code 143
      Finished: FAILURE
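
      A note on the final `ERROR: script returned exit code 143` line: by shell convention, a process terminated by a signal reports 128 plus the signal number, and `Terminated: 15` is SIGTERM, so 128 + 15 = 143. The sketch below reproduces that status outside Jenkins. The `sleep 30` process is only a hypothetical stand-in for a branch's `script.sh`; it is an assumption for illustration, not part of the original report.

      ```shell
      #!/bin/sh
      # Hypothetical stand-in for a branch's long-running script.sh.
      sleep 30 &
      pid=$!

      # Terminate it the way the log reports ("Terminated: 15" is SIGTERM).
      kill -TERM "$pid"

      # wait returns the child's status; capture it without aborting
      # if the caller happens to run with `set -e`.
      status=0
      wait "$pid" || status=$?

      echo "exit status: $status"   # prints "exit status: 143" (128 + 15)
      ```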
      

      Code snippet with try/catch inside 'foo' branch:

      pipeline {
          agent any
          options {
              timestamps()
              timeout(time: 30, unit: 'MINUTES')
          }
          stages {
              stage('Setup'){
                  steps{
                      script{
                          parallel 'foo': {
                              try{
                                  timeout(time: 60, unit: 'SECONDS'){
                                      sh "sleep 90"
                                  }
                                  echo "Done with foo"
                              } catch(err) {
                                  "Caught exception ignore: ${err}"
                              }
                              
                          },
                          'bar': {
                              sh "sleep 200"
                              echo "Done with bar"
                          },
                          failFast: false
                      }
                  }
              }
          }
      }
      

      Snippet of the console log with try/catch block inside 'foo' branch (you can see that 'bar' is getting killed):

      [Pipeline] node
      Running on Jenkins in /Users/Shared/Jenkins/Home/workspace/dummy
      [Pipeline] {
      [Pipeline] timestamps
      [Pipeline] {
      [Pipeline] timeout
      16:32:58 Timeout set to expire in 30 min
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (Setup)
      [Pipeline] script
      [Pipeline] {
      [Pipeline] parallel
      [Pipeline] [foo] { (Branch: foo)
      [Pipeline] [bar] { (Branch: bar)
      [Pipeline] [foo] timeout
      16:32:58 [foo] Timeout set to expire in 1 min 0 sec
      [Pipeline] [foo] {
      [Pipeline] [bar] sh
      16:32:58 [bar] [dummy] Running shell script
      [Pipeline] [foo] sh
      16:32:58 [bar] + sleep 200
      16:32:58 [foo] [dummy] Running shell script
      16:32:58 [foo] + sleep 90
      16:33:58 [foo] Cancelling nested steps due to timeout
      16:33:58 [bar] sh: line 1:  9756 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-4d63fae2/script.sh' > '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-4d63fae2/jenkins-log.txt' 2>&1
      16:33:58 [foo] Sending interrupt signal to process
      16:33:58 [foo] sh: line 1:  9762 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-76186e5f/script.sh' > '/Users/Shared/Jenkins/Home/workspace/dummy@tmp/durable-76186e5f/jenkins-log.txt' 2>&1
      [Pipeline] [bar] }
      16:34:07 [bar] Failed in branch bar
      [Pipeline] [foo] }
      [Pipeline] [foo] // timeout
      [Pipeline] [foo] }
      [Pipeline] // parallel
      [Pipeline] }
      [Pipeline] // script
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] }
      [Pipeline] // timeout
      [Pipeline] }
      [Pipeline] // timestamps
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] End of Pipeline
      ERROR: script returned exit code 143
      Finished: FAILURE
      

          I tried placing parallel directly in steps rather than inside a script block. Nonetheless, I see the same error.

          pipeline {
              agent any
              options {
                  timestamps()
                  timeout(time: 30, unit: 'MINUTES')
              }
              stages {
                  stage('Setup'){
                      steps{
                              parallel(
                                  'foo': {
                                      timeout(time: 60, unit: 'SECONDS'){
                                          sh "sleep 90"
                                      }
                                      echo "Done with foo"
                                  },
                                  'bar': {
                                      sh "sleep 200"
                                      echo "Done with bar"
                                  },
                                  failFast: false
                              )
                      }
                  }
              }
          }
          

          Snippet of the console output:

          [Pipeline] node
          Running on Jenkins in /Users/Shared/Jenkins/Home/workspace/test_scripts
          [Pipeline] {
          [Pipeline] timestamps
          [Pipeline] {
          [Pipeline] timeout
          08:53:06 Timeout set to expire in 30 min
          [Pipeline] {
          [Pipeline] stage
          [Pipeline] { (Setup)
          [Pipeline] parallel
          [Pipeline] [foo] { (Branch: foo)
          [Pipeline] [bar] { (Branch: bar)
          [Pipeline] [foo] timeout
          08:53:06 [foo] Timeout set to expire in 1 min 0 sec
          [Pipeline] [foo] {
          [Pipeline] [bar] sh
          08:53:06 [bar] [test_scripts] Running shell script
          [Pipeline] [foo] sh
          08:53:06 [bar] + sleep 200
          08:53:06 [foo] [test_scripts] Running shell script
          08:53:06 [foo] + sleep 90
          08:54:06 [foo] Cancelling nested steps due to timeout
          08:54:06 [bar] sh: line 1: 11698 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/test_scripts@tmp/durable-0a2c2946/script.sh' > '/Users/Shared/Jenkins/Home/workspace/test_scripts@tmp/durable-0a2c2946/jenkins-log.txt' 2>&1
          08:54:06 [foo] Sending interrupt signal to process
          08:54:06 [foo] sh: line 1: 11704 Terminated: 15          JENKINS_SERVER_COOKIE=$jsc '/Users/Shared/Jenkins/Home/workspace/test_scripts@tmp/durable-cd67dc01/script.sh' > '/Users/Shared/Jenkins/Home/workspace/test_scripts@tmp/durable-cd67dc01/jenkins-log.txt' 2>&1
          [Pipeline] [bar] }
          08:54:15 [bar] Failed in branch bar
          [Pipeline] [foo] }
          [Pipeline] [foo] // timeout
          [Pipeline] [foo] }
          08:54:15 [foo] Failed in branch foo
          [Pipeline] // parallel
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // timeout
          [Pipeline] }
          [Pipeline] // timestamps
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] End of Pipeline
          ERROR: script returned exit code 143
          Finished: FAILURE
          

          Comment above added by Aravinder Bandi.

          It fails the same way with the declarative pipeline's parallel stages.

          pipeline {
            agent any
            environment {
              TERM = "xterm-256color"
            }
            options {
              timestamps()
              timeout(time: 3, unit: 'MINUTES')
              ansiColor('xterm')
            }
            stages {
              stage('Setup'){
          
                failFast false
                parallel {
                  stage('foo') {
                    steps {
                      script {
                        try {
                          timeout(time: 3, unit: 'SECONDS') {
                            try {
                              sh "sleep 90"
                            } catch(ex) {
                              echo "NARF foo inside: ${ex}"
                              throw ex
                            } finally {
                              echo "NARF foo inside finally"
                            }
                          }
                        } catch(ex) {
                          echo "NARF foo outside: ${ex}"
                          throw ex
                        } finally {
                          echo "NARF foo outside finally"
                        }
                        echo "Done with foo"
                      }
                    }
                  }
                  stage('bar') {
                    steps {
                      script {
                        try {
                          sh "sleep 10"
                        } catch(ex) {
                          echo "NARF bar: ${ex}"
                          throw ex
                        } finally {
                          echo "NARF bar finally"
                        }
                        echo "Done with bar"
                      }
                    }
                  }
                }
              }
            }
          }
          

          Comment above added by Christian Höltje.

          I have a theory: timeouts are usually implemented by asking the OS to send a SIGALRM at some point in the future (e.g. setitimer() or alarm()).

          The problem with signals is that they affect all the (Java) threads in a process. Even if each parallel sub-executor has its own error and timeout handlers, a SIGALRM will kill them all.
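
          One way to probe this theory outside Jenkins is to check how a signal behaves across separate processes. The logs above show `Terminated: 15` (SIGTERM, not SIGALRM) for the `script.sh` of both branches, and each `sh` step runs as its own process. The sketch below is only an illustration under assumptions: the two `sleep` processes are hypothetical stand-ins for the 'foo' and 'bar' scripts. It sends SIGTERM to the first process only and then checks whether the second is still running.

          ```shell
          #!/bin/sh
          # Hypothetical stand-ins for the 'foo' and 'bar' branch scripts.
          sleep 30 &
          foo_pid=$!
          sleep 30 &
          bar_pid=$!

          # Signal only the 'foo' stand-in and reap it.
          kill -TERM "$foo_pid"
          wait "$foo_pid" || true

          # kill -0 delivers no signal; it only tests that the process exists.
          if kill -0 "$bar_pid" 2>/dev/null; then
              bar_alive=yes
          else
              bar_alive=no
          fi
          echo "bar still running: $bar_alive"

          # Clean up the remaining stand-in process.
          kill -TERM "$bar_pid"
          wait "$bar_pid" || true
          ```

          If the sibling process survives here, a signal aimed at one process does not by itself terminate the other. That would suggest the 'bar' script in the logs received its own SIGTERM, though this sketch cannot show which Jenkins component sent it.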

          Comment above added by Christian Höltje.

          Shoo Yoo Yoon added a comment -

          The same issue occurs when I try to abort one of the branches from the UI.


          I'm also suffering from this bug. Has anybody found a workaround to protect parallel sh executions from being killed?

          In my pipeline I have an additional parallel stage which runs dependencyCheck. dependencyCheck also runs a command-line tool (the OWASP dependency-check), which is not interrupted. So it seems to be possible to start subprocesses in some way other than sh.

           

          Comment above added by Boris Folgmann.

            Assignee: Unassigned
            Reporter: Aravinder Bandi (aravinder111)
            Votes: 9
            Watchers: 12