Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Done
- Component/s: workflow-job-plugin
- Labels:
- Environment: workflow-job plugin 2.18-2.21
Description
Builds and Queue maintenance are slowed down because many operations must acquire the lock on the WorkflowRun to obtain its execution, which is lazy-loaded, and that lock can become heavily contended as a result. Normally the lock is held only briefly, but the save operation can hold it for a longer period.
On normal systems the impact should be fairly small, but with highly concurrent Pipelines many threads may end up blocked in call paths like the one below (and in other within-step operations):
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:844)
- waiting to lock <0x00000004ef316030> (a org.jenkinsci.plugins.workflow.job.WorkflowRun)
at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:1100)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getExecution(CpsStepContext.java:213)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getExecution(CpsStepContext.java:95)
at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:89)
at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.run(ExecutorStepExecution.java:409)
at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.runForDisplay(ExecutorStepExecution.java:418)
at hudson.util.XStream2.toXMLUTF8(XStream2.java:310)
at org.jenkinsci.plugins.workflow.support.PipelineIOUtils.writeByXStream(PipelineIOUtils.java:34)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.save(WorkflowRun.java:1256)
- locked <0x00000004ef316030> (a org.jenkinsci.plugins.workflow.job.WorkflowRun)
at hudson.BulkChange.commit(BulkChange.java:98)
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1447)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$3.run(CpsThreadGroup.java:417)
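The contention pattern in these traces can be illustrated with a minimal Java sketch. This is not the plugin's code and not necessarily how the issue was resolved: `LazyRunSketch`, `Execution`, `getExecutionLocked`, `getExecution`, and `save` are hypothetical stand-ins chosen for illustration. The sketch assumes that the execution, once loaded, never changes, so a `volatile` field lets readers skip the monitor after the first load.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a run with a lazily loaded execution (not the plugin's code). */
class LazyRunSketch {
    /** Hypothetical placeholder for the loaded execution object. */
    static final class Execution {}

    private final AtomicInteger loads = new AtomicInteger();
    // volatile: readers can observe the loaded value without taking the monitor
    private volatile Execution execution;

    /** Contended variant: every caller takes the monitor, even after the load. */
    synchronized Execution getExecutionLocked() {
        if (execution == null) {
            execution = load();
        }
        return execution;
    }

    /** Fast-path variant: only callers that see an unloaded field take the monitor. */
    Execution getExecution() {
        Execution e = execution;
        if (e != null) {
            return e; // already loaded, no lock needed
        }
        synchronized (this) {
            if (execution == null) {
                execution = load();
            }
            return execution;
        }
    }

    /** Simulates a save that holds the monitor for the whole (possibly slow) write. */
    synchronized void save(Runnable slowWrite) {
        slowWrite.run();
    }

    int loadCount() {
        return loads.get();
    }

    private Execution load() {
        loads.incrementAndGet();
        return new Execution();
    }
}
```

While one thread is inside `save`, any thread calling `getExecutionLocked` blocks on the monitor, as in the first trace above. A thread calling `getExecution` returns immediately once the field is set. Whether such a fast path is safe in the real class depends on invariants that this sketch does not model.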
We are experiencing this issue quite severely today. Here is a screenshot of the currently running web requests, all blocked on this issue, which is causing very high latency for our users. Our Jenkins also rebooted today, which could be related if requests kept backing up.