[JENKINS-22800] Downstream build does not wait for upstream build to complete

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component: build-pipeline-plugin
    • Labels: None
    • Environment: Windows Server, Jenkins is installed as a Windows service

      Hi, we have found an issue in our build pipeline. We are using Jenkins ver. 1.554.1.

      The setup is as follows:

      • The master does not execute builds
      • There are multiple slave nodes connected to the master. All builds are executed on the slaves
      • Job A
        • is polling for SCM
        • is set to have a quiet period of 10 seconds
        • on build, is generating an artifact
      • Job B
        • is polling for SCM
        • is set to have a quiet period of 10 seconds
        • is set to be dependent on Job A ("Build after other projects are built")
        • is set to not build when upstream jobs are building ("Block build when upstream project is building")
        • on build, downloads $HUDSON_URL/job/A/lastSuccessfulBuild/[path-to-artifact]/artifact.zip

      Given that

      • Job A's last successful build is #9
      • Job B's last successful build is #19
      • User Eve checks in files to a location observed by both jobs
      • User Eve checks in files such that both jobs will detect an SCM change

      We can now observe the following sequence of events:

      1. Job A observes the changes and triggers build #10 that is put into the waiting queue
      2. Job B observes the changes and triggers build #20 that is put into the waiting queue
      3. The quiet period elapses and Job A build #10 starts to build
      4. Job A build #10 finishes generating the artifact, which is uploaded to the master
      5. Job A build #10 tries to trigger a new build of Job B. Job B build #20 is already in the waiting queue, so no new build is triggered
      6. Job A build #10 does not complete for the next 30 seconds. We believe this is due to uploading the build log to the master, but we are uncertain about this. (Correction: it was due to the disk usage plugin, which we have now removed. There is no noticeable delay anymore, but the problem described here persists.)
      7. Job B build #20 starts to build
      8. Job B build #20 downloads $HUDSON_URL/job/A/lastSuccessfulBuild/[path-to-artifact]/artifact.zip (lastSuccessfulBuild seems to point to Job A build #9 at this time)
      9. Job A completes with the message "Finished: SUCCESS"
      10. Job B completes with the message "Finished: SUCCESS"
      11. No more builds of Jobs A and B are pending/will be running

      We are uncertain whether the problem lies in step 7 (Job B starts too early) or step 8 (lastSuccessfulBuild still points to the old build). Our requirement is that Job B build #20 downloads the artifact of the last successful build of Job A, which should be build #10. This seems like a bug to us.
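
The sequence above can be replayed in a toy model. The sketch below is not Jenkins code: the class `ToyJob` and its methods are hypothetical, and it assumes that the lastSuccessfulBuild pointer moves only when a build fully completes, while the upstream block is lifted as soon as the main build steps are done.

```java
/** Toy model of the reported race; all names here are hypothetical, not Jenkins API. */
public class UpstreamRaceSketch {

    /** Minimal stand-in for Job A. */
    static class ToyJob {
        int lastSuccessfulBuild;   // what lastSuccessfulBuild resolves to
        int runningBuild = -1;     // build number in progress, -1 if none
        boolean mainStepsDone;     // artifact generated, post-build work still pending

        ToyJob(int lastSuccessfulBuild) { this.lastSuccessfulBuild = lastSuccessfulBuild; }

        void start(int number) { runningBuild = number; mainStepsDone = false; }

        void finishMainSteps() { mainStepsDone = true; }

        void complete() { lastSuccessfulBuild = runningBuild; runningBuild = -1; }

        /** Assumed lenient check: the job stops counting as "building" once main steps end. */
        boolean blocksDownstream() { return runningBuild != -1 && !mainStepsDone; }
    }

    /** Replays steps 3 to 9 and returns the Job A build number that Job B downloads from. */
    static int replay() {
        ToyJob a = new ToyJob(9);
        a.start(10);                            // step 3
        a.finishMainSteps();                    // step 4
        int downloaded = -1;
        if (!a.blocksDownstream()) {            // step 7: Job B #20 leaves the queue
            downloaded = a.lastSuccessfulBuild; // step 8
        }
        a.complete();                           // step 9
        return downloaded;
    }

    public static void main(String[] args) {
        System.out.println("Job B #20 downloaded the artifact of Job A #" + replay());
    }
}
```

Under these assumptions, `replay()` returns 9, which matches what we observe in step 8. Whether Jenkins actually behaves this way internally is exactly what we are unsure about.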

      This issue seems related to JENKINS-5125. Looking at the source code reveals that this code fragment proposed in JENKINS-5125 is not implemented:

      public boolean isBuilding() {
          RunT b = getLastBuild();
          return b!=null && b.isLogUpdated();
      }
      

      We think there might be a chance that implementing this code fragment would also fix our issue.
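
To illustrate why the proposed fragment might matter, the sketch below models a build lifecycle that has a post-production phase. It is a simplified stand-in, not the Jenkins source: `Phase`, `ToyRun`, and both job-level helper methods are hypothetical names, and the assumption is that a run's `isBuilding()` turns false once the main build steps end while `isLogUpdated()` stays true until the run is fully completed.

```java
/** Simplified lifecycle model; Phase and ToyRun are hypothetical names, not Jenkins classes. */
public class BuildingCheckSketch {

    enum Phase { NOT_STARTED, BUILDING, POST_PRODUCTION, COMPLETED }

    static class ToyRun {
        Phase phase = Phase.NOT_STARTED;

        /** Assumed behavior: false as soon as the main build steps are over. */
        boolean isBuilding() { return phase.compareTo(Phase.POST_PRODUCTION) < 0; }

        /** Assumed behavior: true until the run is fully completed. */
        boolean isLogUpdated() { return phase.compareTo(Phase.COMPLETED) < 0; }
    }

    /** Lenient job-level check: relies on the run's own isBuilding(). */
    static boolean jobIsBuildingLenient(ToyRun last) {
        return last != null && last.isBuilding();
    }

    /** Stricter job-level check in the style of the JENKINS-5125 fragment. */
    static boolean jobIsBuildingStrict(ToyRun last) {
        return last != null && last.isLogUpdated();
    }

    public static void main(String[] args) {
        ToyRun run = new ToyRun();
        run.phase = Phase.POST_PRODUCTION; // corresponds to steps 5 and 6 of the sequence
        System.out.println("lenient: " + jobIsBuildingLenient(run));
        System.out.println("strict:  " + jobIsBuildingStrict(run));
    }
}
```

In this model the lenient check reports the upstream job as idle during post-production, so a blocked downstream build would be released, whereas the stricter check keeps it blocked until completion.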

          Geoff Cummings added a comment -

          Might be related to JENKINS-20989.
          Sounds like Job B is being triggered before Job A's lastSuccessfulBuild is updated to point to Job A #10.

          Bruce Rust added a comment -

          Reading this logic, it does not seem like a bug. Job B is triggered off of a SCM poll and so gets the latest Job A build at the time (which is build 9). That information is cached. If anything, the bug is that Job B should have two jobs running. The first one kicked off from the poll pointing to build 9 of job A and the second from the upstream Job A using build 10. So instead of only one job running, both should run.

          I am not sure why you would have Job B doing an SCM poll (unless its SCM repos were completely different from Job A's) when it is dependent on Job A and both seem to trigger off the same SCM changes. Just trigger Job A with an SCM poll; Job B will then kick off as a downstream job and thus get the correct artifacts from the upstream job.

          Just my $0.02

          Benjamin Ernst added a comment -

          @bruce_rust: You said:

          Job B is triggered off of a SCM poll and so gets the latest Job A build at the time (which is build 9)

          This contradicts the configuration of Job B:

          • Job B
            • is set to be dependent on Job A ("Build after other projects are built")
            • is set to not build when upstream jobs are building ("Block build when upstream project is building")

          Job B may be triggered, as you say, off of an SCM poll while Job A is building, but then the build of Job B must remain in the wait queue. Job B has the two properties above, so it must not start while Job A is running, no matter how many SCM polls, hooks, or triggers there are.

          I corrected a few small things in the issue description to make this clearer. If you think I misread what you meant, I kindly ask you to clarify, referring to this comment.

          For your reference, the repositories of Job A and Job B are completely different. But from my point of view this does not influence the core issue here, which is that Build B starts before Build A is completely finished.

          Benjamin Ernst added a comment -

          I changed the "affected version" in my description to 1.554.1, as we've updated and can still observe the issue.

          Tim Wood added a comment - edited

          As of v. 1.567, I observe the same problem without involvement of SCM-polling jobs.

          In my case, I have a job A that uses a system Groovy script internally to queue runs of job B (with different parameters). Then job A has a post-build step to trigger job C, which depends on the results of the queued runs of B. Instead, I observe that job C runs immediately after A, "jumping the queue" of all the runs for job B.

          The build trigger functions should always add the new run in priority order (if specified) to the run queue, unless "Block until..." is checked.

          Paul Draper added a comment -

          I have also observed this behavior.

          This issue is a year old.

          Did waiting for an upstream build ever work?

          Swathi Venkatachala added a comment -

          We observed this behavior too.
          When we enabled the following on the parent job (Job A):

          Post-build Actions
          Build other projects : Job B
              Projects to build
                  (*) Trigger only if build is stable

          Job B did wait until the upstream job finished its build.
          Hope this helps!

          Justin Rodante added a comment - edited

          Three related issues have been open in this Jira for a long time, all major or critical, and the problem persists to this day: JENKINS-22800, JENKINS-5125, JENKINS-5150.

          This is a commonly used feature in heavy dependency chains. Short of hacking my own hard-to-manage "Build X after passing" for every one of my jobs, can we fix this long-standing problem?

          We are on Jenkins 2.68.

            Assignee: Unassigned
            Reporter: Benjamin Ernst (banjobenisma)
            Votes: 3
            Watchers: 12