Bug
Resolution: Fixed
Minor
None
workflow-api 2.47 and older
Pipeline builds that are missing from FlowExecutionList, but which are still in progress, may hang forever after a Jenkins restart.
Normally, FlowExecutionList is responsible for resuming running Pipeline builds after a restart, but anything that causes a build to be loaded will also make it resume. However, if the Pipeline is missing from FlowExecutionList and resumes because it is loaded directly, then the code in FlowExecutionList that resumes step executions (https://github.com/jenkinsci/workflow-api-plugin/blob/b922745a12d0a7816c74028cfed232b73b531767/src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecutionList.java#L177-L197) is skipped, and no step executions in that build are resumed. This can result in the Pipeline hanging forever.
I ran into this issue while backing up and restoring a large Jenkins controller using a file-based backup system while Jenkins was running. Because Jenkins was running, the serialized state of FlowExecutionList and the serialized state of the build itself did not match in the backup. I am not sure whether it is possible to run into this issue in non-backup scenarios.
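The failure mode described above can be illustrated with a deliberately simplified model. This is only a sketch, not the workflow-api code: `Build`, `ExecutionList`, `load()`, and `simulateRestart()` are hypothetical names invented for illustration, and the real plugin's classes and control flow are more involved.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MissingFromListSketch {

    /** Hypothetical stand-in for an in-progress Pipeline build. */
    static class Build {
        final String id;
        boolean executionLoaded = false; // the build's execution has been loaded
        boolean stepsResumed = false;    // its step executions were told to resume

        Build(String id) { this.id = id; }

        /** In this model, loading alone does NOT resume step executions. */
        void load() { executionLoaded = true; }

        /** Loaded but with steps never resumed: the build waits forever. */
        boolean isHung() { return executionLoaded && !stepsResumed; }
    }

    /** Hypothetical stand-in for the persisted list of running builds. */
    static class ExecutionList {
        private final Set<String> persistedIds = new HashSet<>();

        void register(Build b) { persistedIds.add(b.id); }

        /** On restart: load each listed build, then resume its steps. */
        void resumeAll(List<Build> allBuilds) {
            for (Build b : allBuilds) {
                if (persistedIds.contains(b.id)) {
                    b.load();
                    b.stepsResumed = true; // only listed builds reach this line
                }
            }
        }
    }

    /** Simulates a restart where one running build is absent from the list. */
    static List<Build> simulateRestart() {
        Build listed = new Build("listed");
        Build missing = new Build("missing");
        ExecutionList list = new ExecutionList();
        list.register(listed); // "missing" never made it into the persisted list

        List<Build> builds = List.of(listed, missing);
        list.resumeAll(builds);
        missing.load(); // something else loads the build directly
        return builds;
    }

    public static void main(String[] args) {
        for (Build b : simulateRestart()) {
            System.out.println(b.id + " hung=" + b.isHung());
        }
        // prints:
        // listed hung=false
        // missing hung=true
    }
}
```

In this model the build that is loaded directly ends up running with no resumed steps, which corresponds to the hang reported here.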
That said, we can harden against this issue by having Pipelines resume their step executions directly when they are loaded, rather than relying on FlowExecutionList to do so. This way it does not matter if the serialized state of FlowExecutionList is somehow incorrect and something else causes a Pipeline to resume. See jenkinsci/workflow-api-plugin#178.
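A minimal sketch of the hardening idea, under the same caveats: `Build`, `ExecutionList`, `load()`, and `resumeSteps()` are hypothetical names rather than the plugin's API, and the guard against resuming twice is an assumption of this sketch, not something taken from the pull request. The point is only that resumption is tied to loading, so the contents of the list no longer decide whether steps resume.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResumeOnLoadSketch {

    /** Hypothetical stand-in for an in-progress Pipeline build. */
    static class Build {
        final String id;
        boolean executionLoaded = false;
        int resumeCalls = 0; // how many times step executions were resumed

        Build(String id) { this.id = id; }

        /** Loading now resumes step executions itself, at most once. */
        void load() {
            if (executionLoaded) {
                return; // assumption of this sketch: repeated loads are no-ops
            }
            executionLoaded = true;
            resumeSteps();
        }

        private void resumeSteps() { resumeCalls++; }

        boolean isHung() { return executionLoaded && resumeCalls == 0; }
    }

    /** Hypothetical stand-in for the persisted list of running builds. */
    static class ExecutionList {
        private final Set<String> persistedIds = new HashSet<>();

        void register(Build b) { persistedIds.add(b.id); }

        /** On restart the list only needs to trigger loading. */
        void resumeAll(List<Build> allBuilds) {
            for (Build b : allBuilds) {
                if (persistedIds.contains(b.id)) {
                    b.load();
                }
            }
        }
    }

    /** Simulates a restart where one running build is absent from the list. */
    static List<Build> simulateRestart() {
        Build listed = new Build("listed");
        Build missing = new Build("missing");
        ExecutionList list = new ExecutionList();
        list.register(listed); // the persisted list is stale: "missing" is absent

        List<Build> builds = List.of(listed, missing);
        list.resumeAll(builds);
        for (Build b : builds) {
            b.load(); // something else loads every build directly
        }
        return builds;
    }

    public static void main(String[] args) {
        for (Build b : simulateRestart()) {
            System.out.println(b.id + " hung=" + b.isHung() + " resumeCalls=" + b.resumeCalls);
        }
        // prints:
        // listed hung=false resumeCalls=1
        // missing hung=false resumeCalls=1
    }
}
```

Here both builds resume exactly once regardless of whether the list knew about them, so a stale list no longer leaves a build hanging.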
- causes: JENKINS-67351 thread deadlock after update to 2.319.1 (Resolved)
- relates to: JENKINS-43587 Pipeline fails to resume after master restart/plugin upgrade (Resolved)
- links to
[JENKINS-67164] Pipelines missing from FlowExecutionList hang forever after resuming
Description | Original: the description text above, with Markdown-style links | New: the same text, with Jira-style links: [this code|https://github.com/jenkinsci/workflow-api-plugin/blob/b922745a12d0a7816c74028cfed232b73b531767/src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecutionList.java#L177-L197] and [jenkinsci/workflow-api-plugin#178|https://github.com/jenkinsci/workflow-api-plugin/pull/178] |
Link | New: This issue relates to |
Status | Original: Open [ 1 ] | New: In Progress [ 3 ] |
Status | Original: In Progress [ 3 ] | New: In Review [ 10005 ] |
Remote Link | New: This issue links to "jenkinsci/workflow-api-plugin#178 (Web Link)" [ 27229 ] |
Resolution | New: Fixed [ 1 ] | |
Status | Original: In Review [ 10005 ] | New: Fixed but Unreleased [ 10203 ] |
Released As | New: 1105.v3de5e2efac97 | |
Status | Original: Fixed but Unreleased [ 10203 ] | New: Resolved [ 5 ] |
Released As | Original: 1105.v3de5e2efac97 | New: workflow-api 1105.v3de5e2efac97 |
Link | New: This issue causes |
Resolution | Original: Fixed [ 1 ] | |
Status | Original: Resolved [ 5 ] | New: Reopened [ 4 ] |