Details
-
Type:
Bug
-
Status: Fixed but Unreleased
-
Priority:
Critical
-
Resolution: Incomplete
-
Component/s: workflow-cps-plugin
-
Labels: None
-
Environment: Will edit with additional details once authorized to disclose.
Jenkins: 2.121.1
Pipeline: 2.5
-
Description
We've encountered occasional hung threads that outlive their jobs by a wide margin, causing system instability. The root cause is that, after a build log is compressed, an additional line ('Creating placeholder flownodes because failed loading originals.') is appended to it, which corrupts the gz archive. If we remove the appended line, the log can be extracted.
The workaround is to move the build folder on the master and kill any remaining threads; often we must also reboot the master. This has happened multiple times so far, and we've set up thread duration monitoring jobs to detect threads and builds running longer than X ms. Advice on additional ways of capturing relevant log information would be appreciated.
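To illustrate the corruption described above, here is a minimal Python sketch that recovers a log from a gzip stream with plain text appended after it. It is not taken from Jenkins or the plugin: the helper name `split_gzip_and_trailer` and the sample log contents are hypothetical, and the corrupted file is simulated in memory rather than read from a real build folder.

```python
import gzip
import zlib
from typing import Tuple


def split_gzip_and_trailer(data: bytes) -> Tuple[bytes, bytes]:
    """Decompress the first gzip member and return (log_bytes, trailing_bytes)."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=31)
    log_bytes = decomp.decompress(data)
    # Anything after the end of the gzip stream is left untouched in unused_data.
    return log_bytes, decomp.unused_data


# Simulated corruption (hypothetical log text): a compressed log with the
# plain-text line from this report appended after the gzip stream.
original = b"Started by user\nFinished: SUCCESS\n"
appended = b"Creating placeholder flownodes because failed loading originals.\n"
corrupted = gzip.compress(original) + appended

log_bytes, trailer = split_gzip_and_trailer(corrupted)
print(log_bytes.decode())
print("Trailing bytes:", trailer)
```

Stopping at the end of the first gzip member avoids having to locate and strip the appended line by hand before extracting the log.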
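The build side of that monitoring could be sketched roughly as follows. This is an assumption-laden illustration, not the jobs we actually run: the function `find_long_running` is hypothetical, the records are hard-coded sample data, and the field names `number`, `building`, and `timestamp` (start time in ms since the epoch) are assumed to match what the Jenkins JSON API returns for a build. It does not cover thread inspection.

```python
from typing import Dict, List


def find_long_running(builds: List[Dict], now_ms: int, threshold_ms: int) -> List[Dict]:
    """Return builds still marked as building whose age exceeds threshold_ms."""
    return [
        b for b in builds
        if b.get("building") and now_ms - b["timestamp"] > threshold_ms
    ]


# Hypothetical sample records; real data would come from the Jenkins server.
sample_builds = [
    {"number": 1, "building": True, "timestamp": 1_000_000},   # old, still running
    {"number": 2, "building": True, "timestamp": 9_000_000},   # recent, still running
    {"number": 3, "building": False, "timestamp": 0},          # finished
]

# Threshold of one hour (the "X ms" above is not specified in this report).
stuck = find_long_running(sample_builds, now_ms=10_000_000, threshold_ms=3_600_000)
print([b["number"] for b in stuck])
```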
The only place I've found the offending line is:
https://github.com/jenkinsci/workflow-cps-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java#L640
Closest existing issue was: https://issues.jenkins-ci.org/browse/JENKINS-50199?jql=text%20~%20%22Creating%20placeholder%20flownodes%20because%20failed%20loading%20originals.%22
Activity
Field | Original Value | New Value |
---|---|---|
Description | "...after build logs are compressed, and additional line is appended..." | "...after build logs are compressed, an additional line is appended..." |
Description | "...If we remove the appended line, the log could be extracted..." | "...If we remove the appended line, the log can be extracted..." |
Resolution | Incomplete [ 4 ] | |
Status | Open [ 1 ] | Fixed but Unreleased [ 10203 ] |
This may have the same root cause as https://issues.jenkins-ci.org/browse/JENKINS-50199
We are running a version of the Pipeline Job plugin slightly older than the one containing the fix. We will test and soak on 2.25+.