Jenkins / JENKINS-34637

pipeline DSL: timeout() does not work if withEnv() is enclosed


Details

    Description

      Pipeline/workflow DSL: timeout() does not work as expected when a withEnv() block is introduced inside the same timeout + node blocks, around the same shell command.
      Specifically, without the withEnv block, the timeout actually interrupts the shell command.
      With the withEnv block, the timeout does not interrupt the shell command; yet the build still ends reporting that it was interrupted/aborted.

          Activity

            hushp1pt Tony Wallace added a comment - edited

            This pipeline DSL works as expected: i.e., the "sleep" is interrupted at 20 seconds, the timeout exception is caught, and the build ends with "Timeout has been exceeded".

            def err = null
            try{
                timeout(time:20, unit:'SECONDS') {
                    node('chapcs00-ptmp') {
                        sh '''#!/bin/bash -ex
                            sleep 60
                        '''
                    }
                }
            } catch( caughtError ) {
                println 'catch'
                err = caughtError
            } finally {
                println 'finally'
                if(err) {
                    throw err
                }
            }
            
            console output:
            
            Started by user Tony Wallace
            [Pipeline] timeout
            [Pipeline] {
            [Pipeline] node
            Running on chapcs00-ptmp in /ptmp/jenkins/chapel-ci/chapcs00-ptmp/workspace/Z-timeout
            [Pipeline] {
            [Pipeline] sh
            [Z-timeout] Running shell script
            + sleep 60
            Sending interrupt signal to process
            /ptmp/jenkins/chapel-ci/chapcs00-ptmp/workspace/Z-timeout@tmp/durable-28979ffb/script.sh: line 2: 90895 Terminated              sleep 60
            [Pipeline] }
            [Pipeline] // node
            [Pipeline] }
            [Pipeline] // timeout
            [Pipeline] echo
            catch
            [Pipeline] echo
            finally
            [Pipeline] End of Pipeline
            Timeout has been exceeded
            Finished: ABORTED
            

            This DSL script does not work as expected: the sleep runs for the full 60 seconds. Only then does the build end the same way, with "Timeout has been exceeded", even though the timeout did not actually interrupt anything.

            def err = null
            try{
                timeout(time:20, unit:'SECONDS') {
                    node('chapcs00-ptmp') {
                        withEnv(['BLAH=foo',]) {
                            sh '''#!/bin/bash -ex
                                sleep 60
                                : should not be here
                            '''
                        }
                    }
                }
            } catch( caughtError ) {
                println 'catch'
                err = caughtError
            } finally {
                println 'finally'
                if(err) {
                    throw err
                }
            }
            
            console output:
            
            Started by user Tony Wallace
            [Pipeline] timeout
            [Pipeline] {
            [Pipeline] node
            Running on chapcs00-ptmp in /ptmp/jenkins/chapel-ci/chapcs00-ptmp/workspace/Z-timeout
            [Pipeline] {
            [Pipeline] withEnv
            [Pipeline] {
            [Pipeline] sh
            [Z-timeout] Running shell script
            + sleep 60
            + : should not be here
            [Pipeline] }
            [Pipeline] // withEnv
            [Pipeline] }
            [Pipeline] // node
            [Pipeline] }
            [Pipeline] // timeout
            [Pipeline] echo
            catch
            [Pipeline] echo
            finally
            [Pipeline] End of Pipeline
            Timeout has been exceeded
            Finished: ABORTED
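
The symptom in the two console logs above (the body runs to completion, yet the build is still reported as ABORTED) can be pictured with a small stand-alone simulation. This is only a sketch in plain Java threads, not Jenkins code: the class `TimeoutSymptomSketch`, its `run` method and the `interruptBody` switch are hypothetical names invented for illustration, and the durations are scaled down from seconds to milliseconds.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical simulation; none of these names come from Jenkins. */
class TimeoutSymptomSketch {

    /** What one simulated build looked like once it finished. */
    static final class Outcome {
        final boolean bodyRanToCompletion;
        final boolean reportedAborted;

        Outcome(boolean bodyRanToCompletion, boolean reportedAborted) {
            this.bodyRanToCompletion = bodyRanToCompletion;
            this.reportedAborted = reportedAborted;
        }
    }

    /**
     * Runs a body that sleeps for bodyMillis under a timeout of timeoutMillis
     * (must be greater than 0). When interruptBody is false, the timeout only
     * records the abort and never touches the body.
     */
    static Outcome run(long bodyMillis, long timeoutMillis, boolean interruptBody) {
        AtomicBoolean completed = new AtomicBoolean(false);
        Thread body = new Thread(() -> {
            try {
                Thread.sleep(bodyMillis);   // stands in for "sleep 60"
                completed.set(true);        // stands in for ": should not be here"
            } catch (InterruptedException e) {
                // Interrupted by the timeout: the work stops early.
            }
        });
        body.start();

        boolean aborted = false;
        try {
            body.join(timeoutMillis);       // wait at most the timeout
            if (body.isAlive()) {
                aborted = true;             // the timeout has been exceeded
                if (interruptBody) {
                    body.interrupt();       // actually stop the work
                }
            }
            body.join();                    // the build ends when the body ends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return new Outcome(completed.get(), aborted);
    }

    public static void main(String[] args) {
        Outcome silent = run(300, 50, false);
        System.out.println("without interrupt: completed=" + silent.bodyRanToCompletion
                + " aborted=" + silent.reportedAborted);
        Outcome killed = run(300, 50, true);
        System.out.println("with interrupt: completed=" + killed.bodyRanToCompletion
                + " aborted=" + killed.reportedAborted);
    }
}
```

With `interruptBody` set to false the simulated build matches the second log: the body finishes and the result is still an abort. With it set to true the body is cut short, as in the first log.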
            
            hushp1pt Tony Wallace added a comment -

            The only difference between the scripts with right and wrong behavior is that

            withEnv(['BLAH=foo',]) {  }

            was added around the previous shell script.

            hushp1pt Tony Wallace added a comment -

            configuration details attached.


            orrc Christopher Orr added a comment -

            I think this belongs rather to the workflow-basic-steps-plugin component.

            I am seeing the same problem with withEnv wrapping a long-running sh './gradlew ...', but I can also reproduce it with other blocks such as stage.

            timeout(time: 5, unit: 'SECONDS') {
                stage('weird') {
                    sleep 7
                    echo 'Huh, build should have timed out'
                }
            }

            Results in a successful build:

            [Pipeline] timeout
            [Pipeline] {
            [Pipeline] stage
            [Pipeline] { (weird)
            [Pipeline] sleep
            Sleeping for 7 sec
            [Pipeline] echo
            Huh, build should have timed out
            [Pipeline] }
            [Pipeline] // stage
            [Pipeline] }
            [Pipeline] // timeout
            [Pipeline] End of Pipeline
            Finished: SUCCESS

            Jenkins 2.23
            Pipeline: Basic Steps 2.1

            amuniz Antonio Muñiz added a comment -

            This looks pretty severe to me as anything "durable" inside a timeout will fail to be cancelled, right?
            jglick Jesse Glick added a comment -

            I suspect this is rather a bug in workflow-cps-plugin, specifically CpsBodyExecution.cancel.

            jglick Jesse Glick added a comment -

            Indeed the problem is in CpsBodyExecution, not timeout.

            covalence Domingo K added a comment - edited

            Confirmed with the code below, which is another example that does not use withEnv.

            def disableClusterRoutingAllocation(node){
                timeout(time: 10, unit: "SECONDS"){
                    waitUntil{
                        try {
                            sh "curl -XPUT -sS http://$node/_cluster/settings -d '{ \"transient\" : { \"cluster.routing.allocation.enable\" : \"none\" } }' | grep \"acknowledged\":true"
                            return true
                        } catch(exception){
                            return false
                        }
                    }
                }
            }
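
The function above is meant to retry a check until it succeeds or ten seconds have passed. For comparison, here is a rough sketch in plain Java of a retry loop that enforces its own deadline between attempts, so it does not depend on an outside canceller reaching the loop body. It is not Jenkins code: `DeadlinePollSketch`, `pollUntil` and its parameters are hypothetical names used only for illustration, and a check that is already in progress is not cut short by it.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of Jenkins or any plugin. */
class DeadlinePollSketch {

    /**
     * Calls check repeatedly until it returns true or timeoutMillis has elapsed.
     * A check that throws a RuntimeException counts as "not yet", like the
     * try/catch inside the waitUntil body above.
     */
    static boolean pollUntil(BooleanSupplier check, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            boolean ok;
            try {
                ok = check.getAsBoolean();
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                return true;
            }
            // Subtraction keeps the comparison correct even if nanoTime wraps.
            if (System.nanoTime() - deadline >= 0) {
                return false;               // deadline reached without success
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;               // treat interruption as giving up
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Succeeds on the third attempt, well inside the deadline.
        System.out.println(pollUntil(() -> ++attempts[0] >= 3, 1000, 10));
        // Never succeeds, so the loop gives up after roughly 50 ms.
        System.out.println(pollUntil(() -> false, 50, 10));
    }
}
```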
            
            jglick Jesse Glick added a comment -

            Yes the bug is not in timeout or withEnv, it is more general. I am working on it.


            steveprentice Steve Prentice added a comment -

            Not sure if this is a different issue or another manifestation of the issue with CpsBodyExecution, but passing a submitter into input is enough to trigger this issue with timeout for me.

            timeout(time: 3, unit: 'SECONDS') {
                // input will hang forever.
                input submitter: 'user'
            }

            timeout(time: 3, unit: 'SECONDS') {
                // will time out as expected
                input()
            }

            If this is a different issue, let me know and I'll submit a new issue.
            jglick Jesse Glick added a comment -

            That is an unrelated issue: JENKINS-38380


            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
            src/test/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecutionTest.java
            http://jenkins-ci.org/commit/workflow-cps-plugin/6933a4925a47b07206eaf059484b37c069aebe62
            Log:
            [FIXED JENKINS-34637] CpsBodyExecution.cancel was failing to interrupt the innermost execution, and block-scoped StepExecution.stop does not generally kill its body (JENKINS-26148).
            getCurrentExecutions was also in direct violation of its Javadoc, though it does not appear to have ever been called, much less tested.
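
One way to picture the commit message above is a toy model of nested steps. Everything in it is hypothetical: `CancelPropagationSketch`, `Step`, `cancelDirectChild` and `cancelInnermost` are illustrative names, not the workflow-cps-plugin API. The `stopReachesBody` flag encodes an assumption drawn from the console logs in this issue, namely that stopping node does reach its body while stopping withEnv (or stage) does not.

```java
/** Toy model only; these types do not exist in any Jenkins plugin. */
class CancelPropagationSketch {

    static final class Step {
        final String name;
        final boolean stopReachesBody; // assumption: true for node, false for withEnv
        final Step body;               // null for a leaf step such as sh
        boolean interrupted;

        Step(String name, boolean stopReachesBody, Step body) {
            this.name = name;
            this.stopReachesBody = stopReachesBody;
            this.body = body;
        }

        void stop() {
            if (body == null) {
                interrupted = true;    // leaf: the running work is interrupted
            } else if (stopReachesBody) {
                body.stop();           // this block passes the stop on
            }
            // Otherwise: a block-scoped step whose stop leaves its body running.
        }

        Step innermost() {
            Step s = this;
            while (s.body != null) {
                s = s.body;
            }
            return s;
        }
    }

    static Step leaf(String name) {
        return new Step(name, false, null);
    }

    static Step block(String name, boolean stopReachesBody, Step body) {
        return new Step(name, stopReachesBody, body);
    }

    /** Modeled pre-fix cancel: stop only the step directly inside the timeout. */
    static boolean cancelDirectChild(Step timeoutBody) {
        timeoutBody.stop();
        return timeoutBody.innermost().interrupted;
    }

    /** Modeled post-fix cancel: interrupt the innermost execution. */
    static boolean cancelInnermost(Step timeoutBody) {
        Step inner = timeoutBody.innermost();
        inner.stop();
        return inner.interrupted;
    }

    public static void main(String[] args) {
        // timeout { node { sh } }
        System.out.println(cancelDirectChild(block("node", true, leaf("sh"))));   // true
        // timeout { node { withEnv { sh } } }
        System.out.println(cancelDirectChild(
                block("node", true, block("withEnv", false, leaf("sh")))));       // false
        System.out.println(cancelInnermost(
                block("node", true, block("withEnv", false, leaf("sh")))));       // true
    }
}
```

In this model the first reproduction is interrupted even with the pre-fix cancel, the withEnv variant is not, and targeting the innermost execution interrupts both.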


            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowDefinition.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/steps/ParallelStep.java
            src/test/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecutionTest.java
            src/test/java/org/jenkinsci/plugins/workflow/cps/CpsThreadDumpTest.java
            http://jenkins-ci.org/commit/workflow-cps-plugin/bee2879e1e133bc05d3f127b7221a08529fdcb1e
            Log:
            Merge pull request #76 from jglick/timeout-block-JENKINS-34637

            JENKINS-34637 Failure to kill bodies from timeout

            Compare: https://github.com/jenkinsci/workflow-cps-plugin/compare/b5c8ca8c7118...bee2879e1e13


            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Christopher Orr
            Path:
            src/test/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepRunTest.java
            http://jenkins-ci.org/commit/workflow-basic-steps-plugin/55488ff554d7edfd966b73ee11b35c4ab2d8811b
            Log:
            Add test case to reproduce JENKINS-34637.


            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/test/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepRunTest.java
            http://jenkins-ci.org/commit/workflow-basic-steps-plugin/ae8b586e824294f687ebdf40cadddff599c597bc
            Log:
            JENKINS-34637 Verifying behavior of timeout step around other block-scoped steps.


            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepExecution.java
            src/test/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepRunTest.java
            http://jenkins-ci.org/commit/workflow-basic-steps-plugin/5d2b2327b7ae5daec98bf747520c2960c8685f0c
            Log:
            Merge pull request #24 from jglick/timeout-block-JENKINS-34637

            JENKINS-34637 Test for timeout bug

            Compare: https://github.com/jenkinsci/workflow-basic-steps-plugin/compare/f541cd2cda5f...5d2b2327b7ae


            People

              jglick Jesse Glick
              hushp1pt Tony Wallace
              Votes: 9
              Watchers: 17
