Type: Bug
Priority: Critical
Resolution: Fixed
Environment:
Jenkins 2.69
Pipeline 2.5
Ubuntu 16.04
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
JVM args: -XX:+UseG1GC -XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled -XX:+UseStringDeduplication -XX:+UnlockDiagnosticVMOptions -XX:G1SummarizeRSetStatsPeriod=1 -server -XX:+AlwaysPreTouch -Djenkins.install.runSetupWizard=false -Dgroovy.use.classvalue=true -Xmx8192m -Xms8192m
Execution of parallel blocks scales poorly for values of N > 100. With ~50 nodes (each with 4 executors, for a total of ~200 slots), the following pipeline job takes extraordinarily long to execute:
def stepsForParallel = [:]
for (int i = 0; i < Integer.valueOf(params.SUB_JOBS); i++) {
    def s = "subjob_${i}"
    stepsForParallel[s] = {
        node("darwin") {
            echo "hello"
        }
    }
}
parallel stepsForParallel
SUB_JOBS   Time (sec)
---------------------
100        10
200        40
300        96
400        214
500        392
600        660
700        960
800        1500
900        2220
1000       gave up...
At no point does the underlying system become taxed (CPU utilization is very low, as this is a very beefy system – 28 cores, 128 GB RAM, SSDs).
CPU and Thread CPU Time Sampling (via VisualVM) are attached for reference.
depends on:
JENKINS-38381 [JEP-210] Optimize log handling in Pipeline and Durable Task (Resolved)
JENKINS-36547 Queue.Task.getFullDisplayName is a poor choice of key for LoadBalancer.CONSISTENT_HASH (Resolved)

is duplicated by:
JENKINS-45876 Jenkins becomes extremely slow while running a lot of tests parallel (Resolved)

relates to:
JENKINS-34542 Hang in ExecutorStepExecution (Resolved)
JENKINS-42556 PlaceholderTask.runForDisplay vulnerable to AccessDeniedException (Resolved)
JENKINS-38223 FlowNode.isRunning is not very useful (Closed)
JENKINS-26132 Executor should show the current stage the flow run is in (Resolved)
JENKINS-40934 LogActionImpl listener inefficient; poor performance queuing large parallel workloads (Resolved)
[JENKINS-45553] Parallel pipeline execution scales poorly
Also, Queue.maintain should not call String taskDisplayName = p.task.getFullDisplayName(); unless logging is enabled!
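For illustration, a minimal sketch of the kind of guard meant here (the method name and logger category are hypothetical, not the actual core patch):

import java.util.logging.Level
import java.util.logging.Logger

// Compute the expensive display name only when the logger will actually
// record it; for Pipeline placeholder tasks, getFullDisplayName() walks
// the flow graph, which is what makes the unconditional call a hotspot.
void logQueueItem(Logger logger, def task) {
    if (logger.isLoggable(Level.FINEST)) {
        logger.log(Level.FINEST, 'Queue maintenance considering {0}',
                task.getFullDisplayName())
    }
}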
Just to note, this is a real PITA: our master currently runs full-time at 100% CPU on 4 cores and consumes close to 4.5 GB RAM, as the scans keep the threads busy, blocking background GC.
It runs 4 builds with 4 active parallel branches each, so 16 active branches in total.
Code changed in jenkins
User: Oleg Nenashev
Path:
core/src/main/java/hudson/model/Queue.java
http://jenkins-ci.org/commit/jenkins/378199ff846b2b5b19b56a40b43289127f74f9cc
Log:
Merge pull request #2947 from jglick/Queue-opt
JENKINS-45553 Avoid calling Task.getFullDisplayName unless and until we need to
Compare: https://github.com/jenkinsci/jenkins/compare/4df5895b7eb2...378199ff846b
Code changed in jenkins
User: Jesse Glick
Path:
core/src/main/java/jenkins/security/HMACConfidentialKey.java
http://jenkins-ci.org/commit/jenkins/564ca0b6d7f50c3b1164f3f1de8dad5c8c89a8cb
Log:
JENKINS-45553 - Cache the Mac so we do not need to constantly recreate it (#2948)
JENKINS-45553 - Cache the Mac so we do not need to constantly recreate it
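As a rough illustration of the caching idea (a sketch only under common-practice assumptions, not the actual HMACConfidentialKey code): Mac.getInstance() plus init() is comparatively expensive, and Mac instances are not thread-safe, so one common approach is to cache an initialized instance per thread.

import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

class CachedHmac {
    private final SecretKeySpec key
    // One initialized Mac per thread, created lazily and then reused,
    // instead of a fresh Mac.getInstance()/init() on every call.
    private final ThreadLocal<Mac> threadMac = new ThreadLocal<Mac>() {
        @Override
        protected Mac initialValue() {
            Mac m = Mac.getInstance('HmacSHA256')
            m.init(key)
            return m
        }
    }

    CachedHmac(byte[] secret) {
        key = new SecretKeySpec(secret, 'HmacSHA256')
    }

    byte[] compute(byte[] message) {
        // doFinal() resets the Mac to its initialized state for reuse.
        return threadMac.get().doFinal(message)
    }
}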
It seems there are also connections to stages. Something at the end of a stage seems to "reset" this exponential duration; i.e. if you split SUB_JOBS=800 into four stages of 200 each, the duration would be far less.
See also JENKINS-45876
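For illustration, splitting the reproduction script from the description into stages along those lines might look like this (the batch size and stage names are illustrative):

int total = Integer.valueOf(params.SUB_JOBS)   // e.g. 800
int perStage = 200
for (int offset = 0; offset < total; offset += perStage) {
    stage("batch ${offset.intdiv(perStage)}") {
        def steps = [:]
        for (int i = offset; i < Math.min(offset + perStage, total); i++) {
            def s = "subjob_${i}"
            steps[s] = { node('darwin') { echo 'hello' } }
        }
        // Each stage runs only its own batch of branches before moving on.
        parallel steps
    }
}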
I can probably support this observation: we reuse a parallel-worker construct that spawns sub-parallels for each test stage, and until now I had assumed that leaving the enclosing parallel resets the behavior.
If it is interesting/relevant, I could try to elaborate a bit more on our scheduling construct.
Also, it looks like rolling back Supporting APIs to 2.13 reverts the false "improvement" from JENKINS-40934.
https://github.com/jenkinsci/workflow-support-plugin/pull/38 pretty much fixed the problem for me, so no need to waste time testing the current code.
I get these messages in the log; do they matter?
Aug 01, 2017 1:32:47 PM SEVERE io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
nestedBranches size: 520 not equal to parallelBranchEndNodes: 521
Aug 01, 2017 1:32:47 PM SEVERE io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
nestedBranches size: 1 not equal to parallelBranchEndNodes: 2
manschwetus, were they throwing that error before applying this change? If not, that would actually make me nervous, jglick – the APIs that the Blue Ocean class depends on are pretty tightly coupled to the FlowNode APIs, and one should never see more parallel branch starts detected than ends.
AFAIK yes, as I haven't installed anything besides Blue Ocean for roughly a week. I just tried to downgrade Pipeline Supporting APIs, but this seems not to have completed/worked as I expected; maybe I have to restart Jenkins for the downgrade to take effect.
jglick, which component and version will the fix be in? There seems to be an issue with 2.72, and you yourself said not to waste time testing the first changes.
Certain fixes are indeed in core, but the most critical is in workflow-api + workflow-support; versions TBD.
Still have a hotspot related to JENKINS-26132
"AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#731]" #1011 daemon prio=5 os_prio=0 tid=0x00007f4dd0010000 nid=0x1a36 runnable [0x00007f4d6bafa000] java.lang.Thread.State: RUNNABLE at java.util.AbstractCollection.addAll(AbstractCollection.java:343) at java.util.LinkedHashSet.<init>(LinkedHashSet.java:169) at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:132) at org.jenkinsci.plugins.workflow.graphanalysis.LinearScanner.setup(LinearScanner.java:187) at org.jenkinsci.plugins.workflow.graphanalysis.LinearBlockHoppingScanner.setup(LinearBlockHoppingScanner.java:68) at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:172) at org.jenkinsci.plugins.workflow.graphanalysis.FlowScanningUtils.fetchEnclosingBlocks(FlowScanningUtils.java:122) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.computeEnclosingLabel(ExecutorStepExecution.java:473) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getEnclosingLabel(ExecutorStepExecution.java:465) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getDisplayName(ExecutorStepExecution.java:422) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getFullDisplayName(ExecutorStepExecution.java:438) at hudson.model.LoadBalancer$1.assignGreedily(LoadBalancer.java:115) at hudson.model.LoadBalancer$1.map(LoadBalancer.java:105) at hudson.model.LoadBalancer$2.map(LoadBalancer.java:157) at hudson.model.Queue.maintain(Queue.java:1571) at …
which would best be solved by implementing JENKINS-36547.
The code slated to be deleted in JENKINS-38381 also remains a hotspot, as previously noted:
"Running CpsFlowExecution[Owner[…]]" … java.lang.Thread.State: RUNNABLE at java.util.HashMap.putVal(HashMap.java:628) at java.util.HashMap.put(HashMap.java:611) at java.util.HashSet.add(HashSet.java:219) at java.util.AbstractCollection.addAll(AbstractCollection.java:344) at java.util.LinkedHashSet.<init>(LinkedHashSet.java:169) at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:132) at org.jenkinsci.plugins.workflow.graphanalysis.LinearScanner.setup(LinearScanner.java:187) at org.jenkinsci.plugins.workflow.graphanalysis.LinearBlockHoppingScanner.setup(LinearBlockHoppingScanner.java:68) at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.findFirstMatch(AbstractFlowScanner.java:251) at org.jenkinsci.plugins.workflow.graphanalysis.LinearScanner.findFirstMatch(LinearScanner.java:135) at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.findFirstMatch(AbstractFlowScanner.java:274) at org.jenkinsci.plugins.workflow.graph.FlowNode.isActive(FlowNode.java:136) at org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.getLogText(LogActionImpl.java:123) at org.jenkinsci.plugins.workflow.job.WorkflowRun.copyLogs(WorkflowRun.java:485) at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$600(WorkflowRun.java:134) at org.jenkinsci.plugins.workflow.job.WorkflowRun$GraphL.onNewHead(WorkflowRun.java:959) - locked <0x00000006ca550158> (a java.util.concurrent.atomic.AtomicBoolean) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1221) at …
jglick RE the first hotspot, what about the suggestion I made last year to reduce the cost of this call? https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/2#r59458780
I'd anticipated nearly a year ago that this would potentially be a performance problem, and think now might be a reasonable time to implement the suggestions there.
RE the second hotspot – what about reusing a single instance of a LinearScanner for logging within a single flow, maybe caching in a transient field? This reduces some of the setup work, and may play nicer with the CPU cache since the same object is getting hit often from a single thread.
I'd designed the scanners to be reusable for that sort of use case, so would be nice to see if it pays off like expected.
what about the suggestion I made last year to reduce the cost of this call
getEnclosingLabel already does some caching. But queue maintenance calls are a different matter.
This reduces some of the setup work
Not sure how—you would still be calling setup with each head. Anyway feel free to play with this kind of micro-optimization later; the patches as they stand make a clear improvement.
Is there anything blocking progress in this matter?
This is a regression: prior to the latest Supporting APIs change, our pipeline finished in an acceptable time frame (~1 day); now it takes ~6 days!
I already tried to roll back Supporting APIs, but the dependencies are too tight; it would require rolling back everything around Pipeline one by one.
Same question. What is blocking the progress? This fix is crucial for us atm.
Can a statement be made about when the fix will be released (at least approximately)? The last release was at the end of March.
florian_meser The fix that was done earlier had a significant bug in it, and the attempt to work around that caused another performance regression by undoing a fix we did earlier to prevent abusive performance when you have many nodes followed by a lot of parallel branches.
I believe jglick has arrived at a hybrid solution that will satisfy the requirements in all cases but requires some additional work (not sure on ETA, maybe he can speak to it – it might end up falling to someone else). However, this solution should generate far more significant performance improvements for everyone.
As this seems to be taking much longer than expected, is it possible to respin Supporting APIs 2.14 without JENKINS-40934, as this change has made things seriously worse?
I'm currently checking the option of going back to Supporting APIs 2.13, but this means rolling back Pipeline Groovy to 2.29 and also everything that depends on Pipeline Groovy >2.29; just figuring out which versions to roll back to is a serious issue in itself.
Found another solution: as the change making things worse is suspected to be https://github.com/jenkinsci/workflow-support-plugin/commit/4b48f17fc8525a9d2cfb63c763d7ec5070192712, I cloned Supporting APIs 2.14, reverted that merge, built the result as 2.14.1-sth, and installed it. I can now verify that reverting this change resolves the regression. Of course really fixing this is the best thing to do, but if that needs more time, this change should probably be reverted in a patch release until the real fix is ready.
It looks generally good now, manschwetus – you will probably be able to run the SNAPSHOT build off Jesse's PR branch and see a big improvement, until we've finished dotting the last few i's and crossing t's to cut the ultimate release.
Code changed in jenkins
User: Jesse Glick
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/support/actions/LogActionImpl.java
http://jenkins-ci.org/commit/workflow-support-plugin/d78f301c4630ad3d9cbd408494ed1ad36e97b132
Log:
JENKINS-38223 Using FlowNode.isActive to eliminate the main overhead in JENKINS-45553.
Code changed in jenkins
User: Sam Van Oort
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/support/actions/LogActionImpl.java
src/main/java/org/jenkinsci/plugins/workflow/support/visualization/table/FlowGraphTable.java
src/test/java/org/jenkinsci/plugins/workflow/support/actions/LogActionImplTest.java
http://jenkins-ci.org/commit/workflow-support-plugin/236f06ca40fc019ca4da2cadb4d0804971faa9db
Log:
Merge pull request #38 from jglick/FlowNode.isActive-JENKINS-38223
JENKINS-38223 Using FlowNode.isActive to improve JENKINS-45553
Compare: https://github.com/jenkinsci/workflow-support-plugin/compare/01f9538af15c...236f06ca40fc
manschwetus I'd recommend consuming the SNAPSHOT build of this, since it should fully resolve your issues
I think all planned fixes have been merged, though some remain unreleased.
All core fixes went into 2.72, so they will naturally be part of the 2.73 LTS line.
I'm a little confused. A "clear improvement" was mentioned before. Which of the changes will reveal this "clear improvement", and are they released yet?
The core changes that have already been released since 2.72? I've tried the new LTS 2.73.1 and there was no improvement, so I guess there are changes in plugins that are not released yet.
It would be nice to somehow get an overview of the status: which components are affected, what's released so far, and what still needs to be released.
Code changed in jenkins
User: Sam Van Oort
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecution.java
src/main/java/org/jenkinsci/plugins/workflow/graph/BlockStartNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/FlowNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/GraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graph/StandardGraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScanningUtils.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/NodeStepNamePredicate.java
src/test/java/org/jenkinsci/plugins/workflow/graph/FlowNodeTest.java
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScannerTest.java
http://jenkins-ci.org/commit/workflow-api-plugin/4ba6b42b651d9a5853a32681e9582f9649ad8fa2
Log:
Merge pull request #50 from svanoort/jenkins-27395-block-structure-lookup
JENKINS-37573 / JENKINS-45553 Provide a fast view of block structures in the flow graph
Compare: https://github.com/jenkinsci/workflow-api-plugin/compare/63e8ad0c2715...4ba6b42b651d
florian_meser You can wait for the release coming over this weekend, or build and install the master branch of the plugins https://github.com/jenkinsci/workflow-api-plugin/commits/master and https://github.com/jenkinsci/workflow-support-plugin – we had an additional enhancement that was lumped in along with it (plus some further optimizations) and needed to be integrated. It's ready now, though, and should be live shortly.
Code changed in jenkins
User: Sam Van Oort
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecution.java
src/main/java/org/jenkinsci/plugins/workflow/graph/BlockStartNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/FlowNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/GraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graph/StandardGraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScanningUtils.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/NodeStepNamePredicate.java
src/test/java/org/jenkinsci/plugins/workflow/graph/FlowNodeTest.java
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScannerTest.java
http://jenkins-ci.org/commit/workflow-api-plugin/70a05dc4d4a330b797ce370fd587fccfe74355d9
Log:
Revert "Revert "JENKINS-37573 / JENKINS-45553 Provide a fast view of block structures in the flow graph""
This reverts commit 88ffdfc69c43bd4dde21a6578b5ac466999b4fd4.
Benchmarks show it at ~10x as fast, by the way – more likely quite a bit more (we were hitting some limits on the test system).
Resolved by core changes plus the changes released in workflow-api 2.22 and workflow-support 2.15.
manschwetus / florian_meser We appreciate your patience – releases are now cut, and this should reflect a fairly comprehensive improvement for your case plus several related ones. It requires Pipeline API Plugin v2.22 and Pipeline Supporting APIs v2.15 to be installed to get the full benefits.
Please give them a try and let us know how they work out for you – based on our testing you should see a tremendous performance improvement from these changes!
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 1 not equal to parallelBranchEndNodes: 0
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 16 not equal to parallelBranchEndNodes: 15
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 20 not equal to parallelBranchEndNodes: 19
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 23 not equal to parallelBranchEndNodes: 22
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 1 not equal to parallelBranchEndNodes: 0
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 16 not equal to parallelBranchEndNodes: 15
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 20 not equal to parallelBranchEndNodes: 19
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 23 not equal to parallelBranchEndNodes: 22
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 1 not equal to parallelBranchEndNodes: 0
Sep 27, 2017 8:09:24 PM io.jenkins.blueocean.rest.impl.pipeline.PipelineNodeGraphVisitor parallelStart
SEVERE: nestedBranches size: 16 not equal to parallelBranchEndNodes: 15
Jenkins is spamming the logs with this. Jenkins 2.73.1. Pipeline API Plugin v2.22 and Pipeline Supporting APIs version 2.15. How do I fix this?
puneeth_n You'll need to open a new bug and provide the pipeline that triggers this error. It could be an issue in one of two places and there's nothing in that log to permit me to reproduce this.
Code changed in jenkins
User: Sam Van Oort
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecution.java
src/main/java/org/jenkinsci/plugins/workflow/graph/BlockStartNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/FlowNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/GraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graph/StandardGraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScanningUtils.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/NodeStepNamePredicate.java
src/test/java/org/jenkinsci/plugins/workflow/graph/FlowNodeTest.java
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScannerTest.java
http://jenkins-ci.org/commit/workflow-cps-plugin/88ffdfc69c43bd4dde21a6578b5ac466999b4fd4
Log:
Revert "JENKINS-37573 / JENKINS-45553 Provide a fast view of block structures in the flow graph"
Code changed in jenkins
User: Sam Van Oort
Path:
pom.xml
src/main/java/org/jenkinsci/plugins/workflow/flow/FlowExecution.java
src/main/java/org/jenkinsci/plugins/workflow/graph/BlockStartNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/FlowNode.java
src/main/java/org/jenkinsci/plugins/workflow/graph/GraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graph/StandardGraphLookupView.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScanningUtils.java
src/main/java/org/jenkinsci/plugins/workflow/graphanalysis/NodeStepNamePredicate.java
src/test/java/org/jenkinsci/plugins/workflow/graph/FlowNodeTest.java
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/FlowScannerTest.java
http://jenkins-ci.org/commit/workflow-cps-plugin/c0daeb5ce9ba55e6f51cb6c8db903cc5fbba324b
Log:
Merge pull request #52 from jenkinsci/revert-50-jenkins-27395-block-structure-lookup
Revert "JENKINS-37573 / JENKINS-45553 Provide a fast view of block structures in the flow graph"
Hello svanoort, as you mentioned above, I just tested the new versions and there definitely is an improvement. I updated shortly after you wrote that comment and I'm still using those versions. We rely heavily on this feature, since our whole test infrastructure depends on deploying data on nodes for many branches, so we essentially have a 24/7 running Jenkins (with up to 1-2k executors in the queue).
Nevertheless, the scaling cannot be considered stable. We have many tests that need ~2 min but wait ~10-15 min (worst case) to be processed by Jenkins. As mentioned in https://issues.jenkins-ci.org/browse/JENKINS-45876, there seems to be some kind of quadratic or exponential correlation. That means even with the big improvement, it hits its limits once this edge is crossed.
In my opinion there is still room for further improvements to ensure that large Jenkins environments also become more effective.
florian_meser I agree completely that there is some room for further optimization of massively-parallel pipeline execution – the best place to follow the current work and investigations is now https://issues.jenkins-ci.org/browse/JENKINS-47724. That ticket also includes some concrete advice that may help with your scenario.
If you'd like to add some quantitative scaling observations to help identify where the bottleneck is, that might be of some assistance – I also expect the work currently in beta release from JENKINS-47170 will help a bit (reduces the per-flownode overheads associated with pipelines quite significantly – that's a small component of parallel execution).
Very likely you'll see a big improvement from the next phase of that work, https://issues.jenkins-ci.org/browse/JENKINS-38381, which was the culprit here for a lot of the nonlinear behaviors – that's slated to be my next strategic push on performance, along with some tactical fixes that may help with your scenario.
One other comment: the bottleneck appears to be only with massive parallels in a single pipeline – if you break your job into smaller ones with fewer parallel branches in each, the per-branch overheads will matter less (see the sketch below).
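For instance, a hedged sketch of that split (here 'subjob-runner' is a hypothetical downstream pipeline that runs its own parallel block over SUB_JOBS branches):

def launchers = [:]
for (int chunk = 0; chunk < 4; chunk++) {
    def c = chunk   // capture a per-iteration copy for the closure
    launchers["chunk_${c}"] = {
        // Each branch delegates a slice of the work to a smaller pipeline,
        // so no single build carries hundreds of parallel branches.
        build job: 'subjob-runner',
              parameters: [string(name: 'SUB_JOBS', value: '200')]
    }
}
parallel launchers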
Pipeline is also never going to achieve fully linear scale-out with large numbers of executors, because only some parts of the execution can take full advantage of parallel execution – primarily the shell/batch/powershell steps that should be doing the bulk of work. Our work is primarily focused on reducing the other overheads so it can spend more time executing those steps.
Amdahl's Law in spades, basically.
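(For reference, the standard form of the law: with a parallelizable fraction p of the work and n executors, the best possible speedup is 1 / ((1 - p) + p / n); even at p = 0.95, n = 200 executors cap out at roughly 18x.)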
svanoort I'm currently trying to implement some time measurement to get quantitative scaling observations, though I don't have much time to spend on that at the moment. As soon as I have something, I'll let you know.
I don't know if this is off-topic, but it seems another neck-breaker just came in. Hence the question: are there any observations regarding the Meltdown/Spectre Windows 7 updates, which, again, seem to dramatically reduce the performance of our so-called "massive parallels in a single pipeline"?
I'm observing a dramatic loss of performance, although no changes were made to our Jenkins pipeline that would explain these symptoms. KB4056894 definitely contained Meltdown/Spectre patches. I'm quite curious whether I'm the only one having this kind of trouble.
florian_meser I'm not sure what the performance impact of the Meltdown/Spectre updates is on Windows - not really set up for scaling tests on Windows, but it might be related to changes in IO performance.
Please try out the advice I just added in the latest comment on https://issues.jenkins-ci.org/browse/JENKINS-47724 – this should help considerably. The last few months have been heavily focused on performance improvements to Pipeline and it should show in a big way.
Reproducible at n=1000, with one mock-slave agent with 200 executors. Seems to be a grab-bag of issues: the well-known copyLogs overhead (JENKINS-38381, off the top of my head); the related LogActionImpl.isRunning; some step or another still using Guice; core Queue management overhead; HMACConfidentialKey.createMac being too slow for repeated use from ConsoleNote.encodeTo; etc.