Projects building at the same time despite "Block build when upstream is building" option

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: Windows and hudson 1.337

      If you have multiple executors (or multiple nodes), the "Block build when upstream project is building" advanced option does not block the downstream project while the upstream project is building. Instead, the downstream project starts immediately when the upstream project starts to build!

      This is the opposite of the behaviour the help describes: "When this option is checked, Hudson will prevent the project from building when a dependency of this project is in the queue, or building."

      How to reproduce this issue:
      1. Set up 2 or more executors (the same happens with multiple slave nodes) and set quiet period to 0 (to speed up the test)
      2. Create job1 with these settings:
      -build periodically (or SCM poll), eg. */5 * * * *
      -add a lengthy build step (eg. ping 127.0.0.1 -w 1000 -n 600)
      3. Create job2 with these settings:
      -the same build period as job1 (or at least overlap the build step of job1)
      -set a lengthy build step (eg. ping 127.0.0.1 -w 1000 -n 600)
      -under "Advanced Project Options" check the "Block build when upstream project is building" option!
      -set the "Build after other projects are built" build trigger to job1
      4. Wait until job1 starts
      5. Check job2 build history! It will start building immediately!

      The very same happens when you have multiple slave nodes with one executor each.
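
      For reference, the blocking behaviour the help text describes amounts to a check like the sketch below, run before a queued build is allowed to start. The types and names here are illustrative assumptions rather than Hudson's actual API; the sketch only restates the documented rule in code form.

      // Illustrative sketch only - these interfaces and names are assumptions,
      // not Hudson's real classes.
      interface Project {
          java.util.List<Project> getUpstreamProjects();
          boolean isBuilding();
      }

      interface BuildQueue {
          boolean contains(Project p);
      }

      class UpstreamBlockCheck {
          // "Hudson will prevent the project from building when a dependency
          // of this project is in the queue, or building."
          static boolean isBlockedByUpstream(Project p, BuildQueue queue) {
              for (Project up : p.getUpstreamProjects()) {
                  if (queue.contains(up) || up.isBuilding())
                      return true;
              }
              return false;
          }
      }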

          [JENKINS-5125] Projects building at the same time despite "Block build when upstream is building" option

          balazsdan created issue -
          mdonohue made changes -
          Link New: This issue depends on JENKINS-1938

          Alan Harder added a comment -

          So in your steps are both job1 and job2 polling SCM and starting at the same time? That would explain your step 5, "start building immediately". Probably job2 is triggered by SCM polling before job1 has started up and started blocking job2.
          Try removing the SCM polling for job2... when you see job1 start up, click "Build Now" for job2... do you see the right behavior now?

          mdonohue added a comment -

          If mutual exclusion is needed, then the 'locks and latches' plugin would be more appropriate.

          This feature to block downstream builds is more about efficiency, to avoid triggering downstream jobs more often than necessary. Because it's about efficiency, rather than correctness, I don't think this is a blocker.

          mdonohue made changes -
          Priority Original: Blocker New: Major

          balazsdan added a comment -

          Ok, you are right. I tried to simplify the steps, but the simplification lost the meaning... my config is much more complex and I think that config is not working as it should.

          Here is a better reproduction of my config:
          1. Set up 2 or more executors (the same happens with multiple slave nodes) and set quiet period to 120 secs!
          2. Create job1 with these settings:
          -SCM config
          -SCM poll, eg. */5 * * * *
          -add a lengthy build step (eg. ping 127.0.0.1 -w 1000 -n 600)
          3. Create job2 with these settings:
          -under "Advanced Project Options" check the "Block build when upstream project is building" option!
          -set the "Build after other projects are built" build trigger to job1
          4. Wait until job1 starts, it will show: "pending - In the quiet period."
          5. When job1 started and it is in quiet period, start job2 manually. It will display "pending - Upstream project job1 is already building."
          6. When the quiet period has elapsed, both jobs will start building!

          So it's not necessary to start both jobs at the same time!
          I think the transition from the "pending - quiet period" state to the "building" state should not start (or let start) downstream jobs! Am I right, or am I misunderstanding something?

          balazsdan added a comment -

          Guys, I've checked the latest version (1.351). I see this issue has had no progress yet, but I was hoping it might have been fixed as a side effect of other fixes.

          So, if a downstream project is blocked by an upstream project during the quiet period, then at the end of the quiet period both jobs start to build. You can start the downstream project manually during the upstream quiet period; the effect is the same. I think it's a bug, am I right?

          drazi added a comment -

          I think I may have a solution to this issue. This is a showstopper for me so I've been spending some time trying to work out what's going on.

          Queue.contains(Task) checks if the task is in blockedProjects, buildables or waitingList. But tasks are moved from buildables to popped before they start to execute, so there is a short window where Queue.contains(Task) returns false and Job.getLastBuild().isBuilding() also returns false.

          Adding the following code to Queue.contains(Task) fixes this for me. All that's needed is to consider anything in popped still to be in the queue. Tasks don't get removed from popped until after they've created their new Build, so this closes the window.

          for (Item item : popped.keySet()) {
              if (item.task == t)
                  return true;
          }
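
          For context, here is a sketch of how the patched Queue.contains(Task) might read in full. Everything outside the loop over popped is assumed from the description above (blockedProjects, buildables and waitingList being consulted), not copied from Hudson's source.

          public synchronized boolean contains(Task t) {
              if (blockedProjects.containsKey(t) || buildables.containsKey(t))
                  return true;
              for (Item item : waitingList) {
                  if (item.task == t)
                      return true;
              }
              // Proposed addition: anything in 'popped' is still considered
              // queued until its new Build has been created.
              for (Item item : popped.keySet()) {
                  if (item.task == t)
                      return true;
              }
              return false;
          }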
          

          drazi added a comment -

          After more testing, I've found that there is a second bug at play here. In addition to the fix that I described in my previous comment, the following fix is also required:

          Job.isBuilding() needs to be changed to:

          public boolean isBuilding() {
              RunT b = getLastBuild();
              return b != null && b.isLogUpdated();
          }
          

          Say we have three projects, A, B and C.
          A is configured to trigger B and C.
          B is configured to trigger C.

          A completes, so B and C are added to the queue. B starts running immediately, and C is blocked because it depends on B.

          When B completes, it changes its state to POST_PRODUCTION. Run.isBuilding() then returns false for B which allows C to start executing. But this all happens before B triggers C. When the trigger occurs, C is already building so it gets put back into the queue and builds for a second time.

          The fix above causes C not to start building until after B has run its triggers. So when B's triggers run, C is still in the queue and does not get rescheduled.

          With the combination of this change and the previous change to Queue.contains(Task), I can now trigger a build of a complex tree of interdependent modules, without any of them executing more than once.
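
          For reference, the difference between the two checks presumably comes down to how far through the run's lifecycle each one looks. A rough sketch under that assumption follows; only POST_PRODUCTION is taken from the description above, and the other state name and the comparisons are guesses rather than Hudson's actual code.

          public boolean isBuilding() {
              // goes false as soon as the build steps finish and the run
              // enters post-production, i.e. before downstream triggers fire
              return state.compareTo(State.POST_PRODUCTION) < 0;
          }

          public boolean isLogUpdated() {
              // stays true through post-production (while triggers and other
              // completion work run), flipping only once the run is COMPLETED
              return state.compareTo(State.COMPLETED) < 0;
          }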

          Alan Harder added a comment -

          For the first item you mentioned, see this recent discussion:
          http://hudson.361315.n4.nabble.com/Patch-to-fix-concurrent-build-problem-td2229136.html
          So that "popped" structure added in 1.360 may be refactored soon.

            Assignee: Kohsuke Kawaguchi
            Reporter: balazsdan
            Votes: 15
            Watchers: 19
