
Pipeline plugin can't handle large numbers of parallel build jobs

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Component/s: pipeline
    • Labels: None
    • Environment: Jenkins 1.656 on RHEL7 with Pipeline plugin 2.0

      Consider the following snippet:

      stage name: 'foo', concurrency: 10
      foo = [:]
      foo['failFast'] = true
      
      for (int i = 0; i < 25; i++) {
          for (int j = 0; j < 4; j++) {
              foo["branch${i}-${j}"] = {
                  node {
                      build job: 'job1', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
      
                  node {
                      build job: 'job2', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
              }
          }
      
          parallel foo
      }
      

      It starts to build the requested jobs just fine.

      INFO: job1 #125 main build action completed: SUCCESS
      INFO: job1 #127 main build action completed: SUCCESS
      INFO: job1 #126 main build action completed: SUCCESS
      INFO: job2 #124 main build action completed: SUCCESS
      ...

      However, when all of the jobs are done, Pipeline doesn't seem to merge all those results and is just stuck. After 24h it is still hanging, seemingly waiting for a parallel job to finish.

      Now when I remove the outermost for loop, things run just fine.

      for (int j = 0; j < 4; j++) {
              foo["branch${j}"] = {
                      ...
              }
      }
      

      Removing the inner for loop and increasing the outer loop to 100 iterations results in foo[] being too big for Jenkins to handle. The same happens without those for loops, obviously, which is what led me to start using them.

      for (int i = 0; i < 100; i++) {
              foo["branch${i}"] = {
                      ...
              }
      }
      

      There's probably a better way to handle this.
      Any pointers on how to get there?
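
      A minimal sketch of one possible restructuring, assuming the goal is a single parallel invocation over all 100 branches (note that in the snippet above, parallel foo appears to sit inside the outer for loop, so it runs once per iteration with a growing map):

      // Sketch only: populate the complete branch map first,
      // then invoke parallel exactly once after both loops.
      def branches = [failFast: true]
      for (int i = 0; i < 25; i++) {
          for (int j = 0; j < 4; j++) {
              branches["branch${i}-${j}"] = {
                  node {
                      build job: 'job1', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ]
                      ], quietPeriod: 0
                  }
                  // second node/build block omitted for brevity
              }
          }
      }
      parallel branches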

          [JENKINS-34201] Pipeline plugin can't handle large numbers of parallel build jobs

          Jesse Glick added a comment -

          Why are you running build inside node? That makes no sense—just wasting an executor slot for no reason.

          Part of the issue could be the insufficiently unique build parameters. If you schedule a build when a queue item for that job already exists with a given set of parameters, the attempt to reschedule will simply be ignored. As of JENKINS-28063 that should not cause a hang, though.

          Reproducible from scratch somehow?
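
          A minimal sketch of a branch body along those lines, replacing the part inside the two loops: build is called directly rather than inside node, and each branch passes a per-branch parameter so queued builds stay unique. The BRANCH_ID parameter name is hypothetical; the downstream jobs would have to define such a parameter for it to have any effect:

          // Sketch only: copy the loop variables into a local so the closure does
          // not see their final values, drop the node wrapper, and pass a unique
          // (hypothetical) BRANCH_ID parameter per branch.
          String branchId = "branch${i}-${j}"
          foo[branchId] = {
              build job: 'job1', parameters: [
                  [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                  [ $class: 'StringParameterValue', name: 'BRANCH_ID', value: branchId ]
              ], quietPeriod: 0
          }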


          Jesse Glick added a comment -

          Probably due to a known bug already fixed.


          Jonas Schneider added a comment -

          We're seeing the same behavior with a roughly similar pipeline, running Docker image `jenkinsci/blueocean:1.0.1` (Jenkins ver. 2.46.1) for master and slave. Abridged pipeline script, thread dump and process dump at:

          https://gist.github.com/jonasschneider/ed81faffdd96d3e541cb6f487871029a

          After the `docker run` finishes, Jenkins for some reason does not reap the `docker` processes, as can be seen in the ps output. This is a blocker, since it hangs our ~10-minute builds for multiple hours on end. We've only seen it appear under some amount of load, that is, when multiple builds of the same job are running.

          Is there any way to better debug what's going on here?

           


          Jonas Schneider added a comment -

          (see other comment)


          Jonas Schneider added a comment -

          Sorry, I meant to reopen JENKINS-37730. Closing this again.


            Assignee: Jesse Glick (jglick)
            Reporter: Tom De Vylder (tomdevylder)
            Votes: 0
            Watchers: 3