JENKINS-9913

Not obvious why some post-build tasks enforce serial behavior even when builds are concurrent

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Environment: RedHat Enterprise Linux 4.8, Jenkins 1.414

      We're experiencing an issue with concurrent builds where Jenkins appears to be associating separate builds (run on different machines) such that they won't be marked as completed until all jobs are completed. For example, if we kick off 5 concurrent builds on 5 different nodes, builds 1-4 won't be marked as completed if build #5 is still running, even though builds 1-4 are finished. I've seen a report of someone experiencing this issue elsewhere:

      http://groups.google.com/group/jenkinsci-users/browse_thread/thread/e477e25910266d2a?fwc=1

      but a solution wasn't posted. We do not have the batch plugin or the locks and latches plugin installed. We've disabled all post-build processing and switched between different containers (Glassfish/Tomcat), but the problem persists. I couldn't find an issue logged for this other than the aforementioned posting.


          Philip Metting van Rijn created issue -

          tiainpa added a comment -

          We have a similar issue: The build which should finish first gets stuck due to a bug in our test framework (last line in build console is 'Recording test results'), and all the concurrent builds running at the same time get stuck on the same phase, finishing only when the first build is forcefully killed.

          This might have something to do with the JUnit test result report publishing?


          Adam Hawthorne added a comment -

          We see the same thing. We have a test job that can be triggered by a few different upstream jobs, each specifying a different set of tests to run with different test parameters. We use the parameterized trigger plugin to kick off multiple concurrent instances of the downstream job. One of our "trigger" jobs specifies a set of long-running performance tests (about 10 hours). Another is a regression test suite, which takes only a few minutes. At least twice we've seen a regression test run start after the long-running performance tests. The short-lived regression test job then remains running until the performance test completes, which can be ten hours later. The console log shows the regression test job as completed, having archived and recorded fingerprints, but it's still marked as running. Furthermore, an attempt to stop the regression test job caused the long-running performance test job to stop. Several regression tests would normally be queued during this time. This seems to have started recently, with an upgrade to 1.448. I'm not sure what we were running immediately before that.

          Danny Staple added a comment -

          We were hoping that disabling the JUnit post-build task would stop this, but it didn't. This has been killing us for some test jobs where we use a concurrent job to massively reduce duplication of configuration (and all the headaches that come with that).


          Laura Neff added a comment -

          I also saw this, this evening. I use the throttle concurrent builds plugin and had four instances of the same parameterized job running. The three that finished weren't released and hogged their executors until the fourth and last finished. Post-build actions for all were: archive artifacts, Groovy postbuild script, fingerprint a file, set build description, build other projects (extended) where the trigger was not satisfied, and send email. No JUnit tests.
          Jenkins 1.428, running on Windows 7, and the jobs in question were all running on the master.
          Throttle Concurrent Builds plugin v. 1.6


          Xavier Leprévost added a comment -

          I have the same problem.
          I am using the parameterized trigger plugin, where the parameter is an SVN branch.
          I start:

          • job build 1 from branch1
          • job build 2 from branch2
          • job build 3 from branch3
          • job build 4 from branch4

          They all run fine, but if build 2, 3, or 4 finishes before build 1, it is only marked as finished once build 1 has finished.
          The problem is that the node resource is then not available for starting a new job, and I am left waiting for the result of the build.

          Sergey Smirnov added a comment -

          Jenkins ver. 1.458
          Still have the problem: concurrent builds are not marked as finished until the last build has ended.

          Sergey Smirnov added a comment -

          Any progress on resolving this issue?

          Logan Mattox added a comment -

          We had a very similar issue, and found the problem to be in the email part of core somewhere.

          When we disabled all plugins that extended email, and turned off build notification emails, our builds no longer block other concurrent builds that share the same name.

          The workaround isn't ideal, as we then had to add email code to several build scripts (which we are trying to simplify and make easier to maintain).


          Sergey Smirnov added a comment -

          We use "Editable Email Notification".
          It is difficult for us to use shell scripts for notifications.
          It would be nice to have this bug fixed.

            jglick Jesse Glick
            pomvr Philip Metting van Rijn
            Votes: 19
            Watchers: 34