Type: Bug
Resolution: Fixed
Priority: Major
workflow-basic-steps 2.20
If the timeout occurs and Jenkins is restarted during the grace period in which it waits for the inner block to terminate, the build hangs forever with this exception in the Jenkins log:
2020-03-13 02:09:40.575+0000 [id=1502] WARNING o.j.p.w.f.FlowExecutionList$ItemListenerImpl$1#onFailure: Failed to load CpsFlowExecution[Owner[devops-gate/master/blackbox-self-service/25907:devops-gate/master/blackbox-self-service #25907]]
java.lang.NullPointerException
    at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.cancel(TimeoutStepExecution.java:151)
    at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.setupTimer(TimeoutStepExecution.java:139)
    at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.onResume(TimeoutStepExecution.java:90)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:185)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:180)
    ...
Reproducing this issue requires an enclosed block that does not exit immediately. For example:
    node {
        timeout(time: 10, unit: 'SECONDS') {
            build job: 'hang2', parameters: [ new StringParameterValue('A','B') ], quietPeriod: 0
        }
    }
with a second Pipeline job, hang2:
    retry(3) {
        sleep 300
    }
This produces the following console log:
    Started by user RK
    [Pipeline] node
    Running on host in /$JENKINS_HOME/workspace/hang
    [Pipeline] {
    [Pipeline] timeout
    Timeout set to expire in 10 seconds
    [Pipeline] {
    [Pipeline] build (Building hang2)
    Scheduling project: hang2
    Starting building: hang2 #1
    Cancelling nested steps due to timeout
    Resuming build at Fri Mar 10 15:49:00 CET 2017 after Jenkins restart
    Waiting to resume hang #1|: ???
    Waiting to resume hang #1|: host is offline
    Waiting to resume hang #1|: host is offline
    Ready to run at Fri Mar 10 15:49:10 CET 2017
    Timeout expired 3.7 seconds ago
... and then it hangs forever.
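Reason: when onResume() is called, the timer has already expired, so setupTimer() calls cancel() right away. Because a cancellation was already attempted before the restart, forcible is true, but killer is null at that point, so using it causes an NPE.
Fix: check killer for null in TimeoutStepExecution.cancel() (line 94 in the version originally reported against; line 151 in the stack trace above).
Rationale for Major rather than Minor: the bug breaks restart resilience.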
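The reason and fix described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the plugin's actual code: the class `TimeoutResumeSketch` and its methods `cancelUnguarded`, `cancelGuarded`, and `resumedAfterRestart` are hypothetical names, and the model simply assumes that `forcible` survives a restart while `killer` does not.

```java
import java.util.concurrent.ScheduledFuture;

/** Simplified, hypothetical model of the timeout step state; NOT the plugin's real code. */
public class TimeoutResumeSketch {
    // True once a first cancellation was attempted (assumed to survive a restart).
    boolean forcible;
    // Handle of the scheduled force-kill task (assumed to be null after a restart).
    ScheduledFuture<?> killer;

    /** Unguarded variant: uses killer unconditionally once forcible is set. */
    void cancelUnguarded() {
        if (forcible) {
            killer.cancel(true); // throws NullPointerException when killer is null
        }
        forcible = true;
    }

    /** Guarded variant: the null check proposed in this issue. */
    void cancelGuarded() {
        if (forcible && killer != null) {
            killer.cancel(true);
        }
        forcible = true;
    }

    /** State assumed right after a restart that happened during the grace period. */
    static TimeoutResumeSketch resumedAfterRestart() {
        TimeoutResumeSketch s = new TimeoutResumeSketch();
        s.forcible = true; // a cancellation was already attempted before the restart
        s.killer = null;   // the scheduled task handle is gone
        return s;
    }

    public static void main(String[] args) {
        boolean npe = false;
        try {
            resumedAfterRestart().cancelUnguarded();
        } catch (NullPointerException e) {
            npe = true;
        }
        System.out.println("unguarded cancel threw NPE: " + npe);

        resumedAfterRestart().cancelGuarded();
        System.out.println("guarded cancel completed");
    }
}
```

In this model the unguarded call fails exactly in the resumed state, while the guarded call returns normally. What the real step should do after skipping the null `killer` (for example, rescheduling a forced kill) is outside the scope of this sketch.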
- is duplicated by: JENKINS-61019 java.lang.NullPointerException at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.cancel (Closed)
- relates to: JENKINS-39072 timeout step should include more logging/diagnostics information (Resolved)
- links to: jenkinsci/workflow-basic-steps-plugin#112 (Web Link)
[JENKINS-42940] Timeout step hangs after restart if timeout occurred, but enclosed block did not exit yet
Summary | Original: hangs after restart if timeout occurred, but enclosing block did not exit yet | New: hangs after restart if timeout occurred, but enclosed block did not exit yet |
Summary | Original: hangs after restart if timeout occurred, but enclosed block did not exit yet | New: [fix included] hangs after restart if timeout occurred, but enclosed block did not exit yet |
Epic Link | New: JENKINS-35399 [ 171192 ] |
Labels | New: NPE NullPointerException restart |
Labels | Original: NPE NullPointerException restart | New: NPE NullPointerException restart triaged-2018-11 |
Link | New: This issue is duplicated by |
Summary | Original: [fix included] hangs after restart if timeout occurred, but enclosed block did not exit yet | New: Timeout step hangs after restart if timeout occurred, but enclosed block did not exit yet |
Assignee | New: Devin Nusbaum [ dnusbaum ] |
Description |
Original:
If the timeout occurs and Jenkins is restarted during the grace period in which it waits for the inner block to terminate, the build hangs forever with this exception in the Jenkins log:
    Mar 10, 2017 3:49:10 PM org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1 onFailure
    WARNING: Failed to load CpsFlowExecution[Owner[hang/1:hang #1]]
    java.lang.NullPointerException
        at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.cancel(TimeoutStepExecution.java:94)
        at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.setupTimer(TimeoutStepExecution.java:88)
        at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.onResume(TimeoutStepExecution.java:57)
        at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:185)
The reproduction steps, console log, reason, fix, and rationale are the same as in the current description. |
New:
The current description shown at the top of this issue: the 2017 log excerpt is replaced by the 2020 one, log excerpts are wrapped in {noformat}, and code samples are wrapped in {code}. |
Remote Link | New: This issue links to "jenkinsci/workflow-basic-steps-plugin#112 (Web Link)" [ 24855 ] |