Multi Job Plugin causes CPU to max out due to updateSubBuild executing millions of times in a few seconds

      A bug was introduced in version 1.11 of this plugin and still reproduces in version 1.12. This prevents us from updating to the latest version of the plugin.

      When we upgraded to the new version of the plugin, the CPU on our Jenkins master was pegged and would not come down. We tried adding more CPUs, but that did not help. Profiling showed that the updateSubBuild function in the plugin is being called millions of times in a few seconds.

      We are extensively using this plugin for parallel testing of our product, and have many multi job plugin jobs with many parallel steps.

      Please let us know if we need to collect any other data, or if you would like to discuss the design and how we use the plugin in more detail.

          [JENKINS-21798] Multi Job Plugin causes CPU to max out due to updateSubBuild executing millions of times in a few seconds

          Dan Dumont added a comment -

          Note: a manual downgrade to 1.10 fixed the issue for us.

          Dan Dumont added a comment -

          Related: JENKINS-21649

          Nicolas Morey added a comment -

          Seems like a simple sleep in the thread should do the trick. I'll quickly check it out. I'm supposed to reboot the integration server tomorrow morning, so it'd be a good time to fix this.

          Dan Dumont added a comment -

          Perhaps. There are already sleeps in there, though.

          I did not get a chance to look through the commits in 1.11 that introduced the regression... And I'm not familiar with the code at all, so...
          Glad someone is looking at it.

          Nicolas Morey added a comment -

          I'm not very familiar with this code, or with Java for that matter, but it seems there weren't actually any sleeps in the loop.
          I'm not sure what the 5 seconds above are for, but it didn't seem to stall the loop...
          I added a simple sleep at the end of the loop; the CPU usage goes down and everything seems to work. I opened a pull request so that someone who knows what they are doing can give some feedback.
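
The effect of a sleep at the end of a polling loop can be sketched as follows. This is a minimal, self-contained illustration and not the plugin's code: the class `PollingSketch`, the methods `pollUntilDone` and `pollsDuring`, and the timings are all hypothetical. The counter stands in for whatever work the loop does on each pass, such as updating a sub-build.

```java
import java.util.function.BooleanSupplier;

public class PollingSketch {

    /**
     * Polls isDone until it returns true and returns the number of passes.
     * sleepMillis == 0 gives an unthrottled busy loop; a positive value
     * sleeps at the end of every pass. (Hypothetical helper, not plugin code.)
     */
    static long pollUntilDone(BooleanSupplier isDone, long sleepMillis)
            throws InterruptedException {
        long polls = 0;
        while (!isDone.getAsBoolean()) {
            polls++; // stands in for the per-pass update work
            if (sleepMillis > 0) {
                Thread.sleep(sleepMillis); // the throttle at the end of the loop
            }
        }
        return polls;
    }

    /** Simulates a task that finishes after runMillis and counts the polls made. */
    static long pollsDuring(long runMillis, long sleepMillis)
            throws InterruptedException {
        final long deadline = System.nanoTime() + runMillis * 1_000_000L;
        return pollUntilDone(() -> System.nanoTime() - deadline >= 0, sleepMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("busy loop polls:      " + pollsDuring(200, 0));
        System.out.println("throttled loop polls: " + pollsDuring(200, 50));
    }
}
```

With no sleep, the number of passes is bounded only by how fast the CPU can spin, which would be consistent with the call counts seen in profiling. With a sleep, it is bounded by roughly the task duration divided by the sleep interval.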

          Nicolas Morey added a comment -

          Actually, the getStartCondition should only be waiting for the subtask to start, while the overall loop waits for it to finish.
          So once the subtask has started, the getStartCondition has no further effect.

          At least that's what it looks like to me.
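
If the structure is as described in this comment, the wait happens in two phases, and a sleep in the first phase does nothing to slow the second. The sketch below illustrates that reading only. `TwoPhaseWaitSketch`, `FakeSubBuild`, and `waitForSubBuild` are hypothetical names, and nothing here is taken from MultiJobBuilder.java.

```java
public class TwoPhaseWaitSketch {

    /** Hypothetical stand-in for a sub-build whose state depends on elapsed time. */
    static final class FakeSubBuild {
        private final long startAt;
        private final long finishAt;

        FakeSubBuild(long startDelayMillis, long runMillis) {
            long now = System.nanoTime();
            this.startAt = now + startDelayMillis * 1_000_000L;
            this.finishAt = this.startAt + runMillis * 1_000_000L;
        }

        boolean hasStarted()  { return System.nanoTime() - startAt >= 0; }
        boolean hasFinished() { return System.nanoTime() - finishAt >= 0; }
    }

    /**
     * Phase 1 waits for the build to start, sleeping between checks.
     * Phase 2 waits for it to finish; it is throttled only if
     * completionSleepMillis is positive.
     * Returns {phase 1 poll count, phase 2 poll count}.
     */
    static long[] waitForSubBuild(FakeSubBuild build,
                                  long startSleepMillis,
                                  long completionSleepMillis)
            throws InterruptedException {
        long startPolls = 0;
        while (!build.hasStarted()) {
            startPolls++;
            Thread.sleep(startSleepMillis); // only affects the start phase
        }

        long completionPolls = 0;
        while (!build.hasFinished()) {
            completionPolls++; // stands in for the per-pass sub-build update
            if (completionSleepMillis > 0) {
                Thread.sleep(completionSleepMillis);
            }
        }
        return new long[] {startPolls, completionPolls};
    }

    public static void main(String[] args) throws InterruptedException {
        long[] busy = waitForSubBuild(new FakeSubBuild(50, 150), 25, 0);
        long[] throttled = waitForSubBuild(new FakeSubBuild(50, 150), 25, 25);
        System.out.println("unthrottled completion polls: " + busy[1]);
        System.out.println("throttled completion polls:   " + throttled[1]);
    }
}
```

In this sketch the start phase makes only a handful of checks either way, while the completion phase spins freely unless it has its own sleep.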

          Dan Dumont added a comment - edited

          Nicolas, is there a way to link the pull request to the JIRA issue? I'm not sure how this project operates.

          Nicolas Morey added a comment -

          I tagged the issue in my commit, but I don't know how JIRA links the pull request...
          I guess it only gets linked once my pull request is merged into master.

          But FYI, it's here:
          https://github.com/jenkinsci/tikal-multijob-plugin/pull/49

          Code changed in jenkins
          User: Nicolas Morey-Chaisemartin
          Path:
          src/main/java/com/tikal/jenkins/plugins/multijob/MultiJobBuilder.java
          http://jenkins-ci.org/commit/tikal-multijob-plugin/f9a88fd8a6b094b4e817bcc4a0fc19ab91be2ddb
          Log:
          MultiJobBuilder.java: Sleep in the subtask polling loop

          Without the sleep, the thread keeps polling and updating the subtask which causes a very high CPU usage.
          This fixes JENKINS-21649 and JENKINS-21798

          Signed-off-by: Nicolas Morey-Chaisemartin <nmorey@kalray.eu>

          hagzag added a comment -

          Merged PR #49 from nmorey/update-subbuild-poll

            Assignee: Unassigned
            Reporter: Lorelei McCollum
            Votes: 4
            Watchers: 8
