[JENKINS-42937] Massive slowdown of pipeline execution when making massive use of parallel - Jenkins Jira

Type: Bug
Resolution: Duplicate
Priority: Blocker
Component/s: pipeline
Labels:
None
Environment:
Jenkins 2.50 on Windows (master)

Similar Issues:
Powered by SuggestiMate

Show
URL:
https://github.com/jenkinsci/workflow-support-plugin/pull/34

I use pipeline parallel feature to schedule all regression tests we use in parallel, before the weekend we decided to change the job, to use the full set of tests (roughly 650 test suites). To ensure a good test system utilization we build up a dedicated node with four executors and we create a parallel branch per test suite. This was quite well proven with smaller sets of tests, now we observe an tremendous slowdown of pipeline execution, to a point were a simple informative echo takes several minutes to be processed according to the job log.

Looking at related systems utilization, it turns out that the master runs cpu bound on one core of four (Java all the time 25% or more). Using Jenkins monitor plugin we found all the time is consumed by WorkflowRun.copyLogs in LinearBlockHoppingScanner.java:120-123, see stack trace provided.

org.jenkinsci.plugins.workflow.graphanalysis.LinearBlockHoppingScanner.next(LinearBlockHoppingScanner.java:120)
org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.next(AbstractFlowScanner.java:212)
org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.next(AbstractFlowScanner.java:94)
org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.findFirstMatch(AbstractFlowScanner.java:255)
org.jenkinsci.plugins.workflow.graphanalysis.LinearScanner.findFirstMatch(LinearScanner.java:135)
org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.findFirstMatch(AbstractFlowScanner.java:274)
org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.isRunning(LogActionImpl.java:153)
org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.getLogText(LogActionImpl.java:128)
org.jenkinsci.plugins.workflow.job.WorkflowRun.copyLogs(WorkflowRun.java:441)
org.jenkinsci.plugins.workflow.job.WorkflowRun.access$600(WorkflowRun.java:125)
org.jenkinsci.plugins.workflow.job.WorkflowRun$3.run(WorkflowRun.java:313)
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
java.util.concurrent.FutureTask.runAndReset(Unknown Source)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)

Looking at source code it seems to loop over a list in a next method, calling on each iteration .contains(...) on another collection. Depending on underlying collections this could introduce a huge runtime complexity.

duplicates

JENKINS-40934 LogActionImpl listener inefficient; poor performance queuing large parallel workloads

Resolved

Assignee:: Sam Van Oort

Reporter:: Florian Manschwetus

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Created:: 2017-03-20 13:11

Updated:: 2017-03-21 13:24

Resolved:: 2017-03-21 13:24

Details

Description

Attachments

Issue Links

Activity

People

Dates