  JENKINS-40934

LogActionImpl listener inefficient; poor performance queuing large parallel workloads



    Description

      When the number of parallel branches in a pipeline grows large, the performance of the pipeline degrades significantly. There appears to be an O(n^2) algorithm somewhere.

      ~200 branches - okay
      4k branches - glacial

      Attached is a stack trace taken while the behavior was exhibited. I'm pretty sure this ends up O(n^2):

      "Running CpsFlowExecutionOwner[corex_full/54:corex_full #54]" Id=421 Group=main RUNNABLE
          at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getCurrentHeads(CpsFlowExecution.java:703)
          - locked org.jenkinsci.plugins.workflow.cps.CpsFlowExecution@6ebdc0a3
          at org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.isRunning(LogActionImpl.java:152)
          at org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.access$000(LogActionImpl.java:66)
          at org.jenkinsci.plugins.workflow.support.actions.LogActionImpl$1.onNewHead(LogActionImpl.java:93)
          at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1081)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.notifyNewHead(CpsThreadGroup.java:381)
          at org.jenkinsci.plugins.workflow.cps.FlowHead.setNewHead(FlowHead.java:119)
          at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:171)
          at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:126)
          at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:108)

      The code walks the heads every time a new head is added.
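
      For illustration only, here is a minimal stand-alone Java sketch (not the plugin's actual code; the names are hypothetical) of why this pattern is quadratic: each of the N new-head notifications triggers a scan of all current heads, so the total work is roughly N^2/2 head visits.

      // HeadScanSimulation.java - hypothetical simulation of the listener pattern above
      import java.util.ArrayList;
      import java.util.List;

      public class HeadScanSimulation {
          public static void main(String[] args) {
              List<String> heads = new ArrayList<>(); // stands in for FlowExecution.getCurrentHeads()
              long visits = 0;
              int branches = 4_000;                   // the report mentions ~4k parallel branches
              for (int i = 0; i < branches; i++) {
                  heads.add("branch-" + i);           // a new head appears
                  // onNewHead -> isRunning: walk every current head
                  for (String head : heads) {
                      visits++;                       // one head visit per notification, per head
                  }
              }
              System.out.println("total head visits: " + visits); // ~8,002,000 for 4,000 branches
          }
      }

      With ~200 branches this is only about 20,000 visits, which matches the observation that small parallel workloads stay responsive while 4k branches become glacial.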



          Activity

            jglick Jesse Glick added a comment -

            Similar code was copied to JENKINS-38381.

            jglick Jesse Glick added a comment -

            Would be better to have a tested and efficient implementation in FlowNode to begin with: JENKINS-38223.

            svanoort Sam Van Oort added a comment - - edited

            jglick I agree that it would be better, but let us not permit "perfect" to become the enemy of "good" – if we were going to add a reusable API to FlowNode then the quality bar is much higher and it needs to address the fact that an O(n) search is required at all. There's no harm in making the fast fix while discussing.

            jglick Jesse Glick added a comment -

            I think it is fine to take the implementation here and just move it to FlowNode, deprecating isRunning and noting in Javadoc that the current implementation runs in linear time. Better than nothing.

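            As a rough, hypothetical sketch of what that proposal could look like (names, types, and the simplified head-membership check are illustrative only, not the actual workflow-api code): a node-level helper whose Javadoc notes the linear-time behavior, with callers such as LogActionImpl.isRunning deprecated and delegating to it.

            // Hypothetical sketch only; not the actual workflow-api classes.
            import java.util.List;

            interface Execution {
                List<Node> getCurrentHeads(); // stands in for FlowExecution.getCurrentHeads()
            }

            abstract class Node {
                protected abstract Execution getExecution();

                /**
                 * Whether this node is still executing.
                 * Note: the current implementation walks every head of the owning execution,
                 * so it runs in linear time in the number of parallel branches.
                 */
                public boolean isActive() {
                    for (Node head : getExecution().getCurrentHeads()) {
                        if (head == this) {
                            return true;
                        }
                    }
                    return false;
                }
            }

            // Callers such as LogActionImpl.isRunning(node) could then be
            // @Deprecated and simply delegate to node.isActive().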

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Sam Van Oort
            Path:
            src/main/java/org/jenkinsci/plugins/workflow/support/actions/LogActionImpl.java
            http://jenkins-ci.org/commit/workflow-support-plugin/4b48f17fc8525a9d2cfb63c763d7ec5070192712
            Log:
            Merge pull request #34 from svanoort/fix-bad-big-o-for-copylogs-with-parallels

            JENKINS-40934 Avoid redundant scans of parallels

            Compare: https://github.com/jenkinsci/workflow-support-plugin/compare/494446bf5962...4b48f17fc852

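            The commit message only says that redundant scans of parallels are avoided. As a general illustration of that kind of fix (not the actual patch), one option is to memoize nodes that have already been seen as finished: a finished node never becomes running again, so later checks can skip the linear walk of the current heads entirely.

            // MemoizedRunningCheck.java - general illustration, not the actual patch
            import java.util.List;
            import java.util.Set;
            import java.util.concurrent.ConcurrentHashMap;

            public class MemoizedRunningCheck {
                private final Set<String> knownFinished = ConcurrentHashMap.newKeySet();
                private final List<String> currentHeads; // stands in for FlowExecution.getCurrentHeads()

                public MemoizedRunningCheck(List<String> currentHeads) {
                    this.currentHeads = currentHeads;
                }

                public boolean isRunning(String nodeId) {
                    if (knownFinished.contains(nodeId)) {
                        return false;                                // answered without scanning the heads
                    }
                    boolean running = currentHeads.contains(nodeId); // the expensive linear scan
                    if (!running) {
                        knownFinished.add(nodeId);                   // a finished node stays finished
                    }
                    return running;
                }

                public static void main(String[] args) {
                    MemoizedRunningCheck check = new MemoizedRunningCheck(List.of("branch-2"));
                    System.out.println(check.isRunning("branch-1")); // false, and now memoized
                    System.out.println(check.isRunning("branch-2")); // true, still a current head
                }
            }

            Whether the merged change does exactly this is not stated in the issue; the point is only that remembering already-finished nodes removes the repeated getCurrentHeads() walks that showed up in the stack trace.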
            svanoort Sam Van Oort added a comment -

            Updating to note that this is merged now, pending the next release.

            manschwetus Florian Manschwetus added a comment - - edited

            Sorry, but it looks like this has made things worse. My previous workaround of implementing workers that start the closures from the original map one after another now also suffers an extreme slowdown, increasing over test execution time, after I upgraded Jenkins following several weeks of continuous test execution (we had a release and therefore an infrastructure freeze).
            The construction I used is:

            // pseudocode: N worker branches, each repeatedly pulling the next closure
            // from the original map and running it via a single-entry dummy map
            parallel {
              worker0...n {
                for (dummyMap = { nextOriginalClosure }; !dummyMap.isEmpty(); dummyMap = { nextOriginalClosure }) {
                  parallel dummyMap
                }
              }
            }

            This was working quite OK until I updated last week from ~2.55 (not sure; the backup is gone due to a double update) to 2.70 and then to 2.71.

            Jenkins Monitoring is installed, so ask for further details if needed...

            jglick Jesse Glick added a comment -

            JENKINS-45553 tracks a slew of optimizations.


            manschwetus Florian Manschwetus added a comment -

            Is it possible to roll back only Pipeline: Supporting APIs to 2.13, to get my pipeline working in reasonable time again?
            (I currently suspect the related change in there of rendering the construction sketched above useless.)
            Our daily test job, which normally takes around 24 hours, now runs for 5 to 6 days!

            Thx

            jglick Jesse Glick added a comment -

            We roll forward. manschwetus please see JENKINS-45553.


            People

              svanoort Sam Van Oort
              mmitche Matthew Mitchell
              Votes: 0
              Watchers: 7
