JENKINS-34201

Pipeline plugin can't handle large numbers of parallel build jobs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Component: pipeline
    • Labels: None
    • Environment: Jenkins 1.656 on RHEL7 with Pipeline plugin 2.0

    Description

      Consider the following snippet:

      stage name: 'foo', concurrency: 10
      foo = [:]
      foo['failFast'] = true
      
      for (int i = 0; i < 25; i++) {
          for (int j = 0; j < 4; j++) {
              foo["branch${i}-${j}"] = {
                  node {
                      build job: 'job1', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
      
                  node {
                      build job: 'job2', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
              }
          }
      
          parallel foo
      }
      

      It starts to build the requested jobs just fine.

      INFO: job1 #125 main build action completed: SUCCESS
      INFO: job1 #127 main build action completed: SUCCESS
      INFO: job1 #126 main build action completed: SUCCESS
      INFO: job2 #124 main build action completed: SUCCESS
      ...

      However, when all of the jobs are done, Pipeline doesn't seem able to merge all those results and is just stuck. After 24h it is still hanging, apparently waiting for one of the parallel branches to finish.

      When I remove the outermost for loop, things run just fine.

      for (int j = 0; j < 4; j++) {
              foo["branch${j}"] = {
                      ...
              }
      }
      

      Removing the inner for loop and increasing the outer loop to 100 iterations results in foo being too big for Jenkins to handle. The same happens without the for loops at all, which is what led me to start using them.

      for (int i = 0; i < 100; i++) {
              foo["branch${i}"] = {
                      ...
              }
      }
      

      There's probably a better way to handle this.
      Any pointers on how to get there?
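
      For reference, a minimal sketch of one possible alternative shape, assuming the branch map is built up completely first and parallel is called exactly once afterwards (the branch bodies are elided here):

      def branches = [failFast: true]

      for (int i = 0; i < 25; i++) {
          for (int j = 0; j < 4; j++) {
              String name = "branch${i}-${j}"   // per-iteration copy so each closure sees its own value
              branches[name] = {
                  // the same build job: ... steps as above would go here
                  echo "running ${name}"
              }
          }
      }

      // single invocation once every branch closure has been registered
      parallel branches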


          Activity

            jglick Jesse Glick added a comment -

            Why are you running build inside node? That makes no sense—just wasting an executor slot for no reason.

            Part of the issue could be the insufficiently unique build parameters. If you schedule a build when a queue item for that job already exists with a given set of parameters, the attempt to reschedule will simply be ignored. As of JENKINS-28063 that should not cause a hang, though.

            Reproducible from scratch somehow?
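
            To illustrate both points, a rough sketch of the same branches with the node wrappers dropped (build does not need an executor of its own) and an extra, purely illustrative parameter added so that each queued item is unique; the 'branch' parameter name is an example, not something job1 or job2 necessarily defines:

            def foo = [failFast: true]
            for (int i = 0; i < 25; i++) {
                for (int j = 0; j < 4; j++) {
                    String id = "branch${i}-${j}"  // per-iteration copy for the closure
                    foo[id] = {
                        // no node {} here: build just waits for the downstream run
                        // without tying up an executor slot of its own
                        build job: 'job1', parameters: [
                            [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                            // example-only parameter to keep each queue item distinct
                            [ $class: 'StringParameterValue', name: 'branch', value: id ]
                        ], quietPeriod: 0
                    }
                }
            }
            parallel foo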

            jglick Jesse Glick added a comment -

            Probably due to a known bug already fixed.


            jonasschneider Jonas Schneider added a comment -

            We're seeing the same behavior with a roughly similar pipeline, running the Docker image `jenkinsci/blueocean:1.0.1` (Jenkins ver. 2.46.1) for both master and slave. Abridged pipeline script, thread dump and process dump at:

            https://gist.github.com/jonasschneider/ed81faffdd96d3e541cb6f487871029a

            After the `docker run` finishes, Jenkins for some reason does not reap the `docker` processes, as can be seen in the ps output. This is a blocker for us, since it hangs our ~10-minute builds for hours on end. We've only seen it appear under some amount of load, that is, when multiple builds of the same job are running.

            Is there any way to better debug what's going on here?
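
            (For what it's worth, one way to see which Pipeline builds the master still thinks are running is a short Script Console query; this is only a sketch using standard Jenkins core and workflow-job APIs, adjust as needed.)

            import jenkins.model.Jenkins
            import org.jenkinsci.plugins.workflow.job.WorkflowJob
            import org.jenkinsci.plugins.workflow.job.WorkflowRun

            // list Pipeline builds that are still marked as building, with how long they have been running
            Jenkins.instance.getAllItems(WorkflowJob).each { job ->
                job.builds.findAll { it.isBuilding() }.each { WorkflowRun run ->
                    println "${run.fullDisplayName}: ${run.durationString}"
                }
            }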

             


            jonasschneider Jonas Schneider added a comment -

            (see other comment)


            jonasschneider Jonas Schneider added a comment -

            Sorry, I meant to reopen JENKINS-37730. Closing this again.


            People

              Assignee: jglick Jesse Glick
              Reporter: tomdevylder Tom De Vylder
              Votes: 0
              Watchers: 3
