JENKINS-69850

Queue maintain falls in an infinite recursive loop - preventing all jobs to be executed

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • core
    • 2.375, 2.361.4

      Issue

      After upgrading from 2.340 (jdk8 image) to 2.372 (jdk11 image), just after starting Jenkins, Queue.maintain gets into an infinite recursive loop and throws a StackOverflowError, rendering the Queue unusable (jobs can't run).

      The same scenario occurred twice in prod. Everything was fine during tests, but the test instance did not have the same jobs in the Queue.

      Hypothesis on the cause

      This looks very much like an edge case not caught by the tests and validations of the change for JENKINS-68780 (https://github.com/jenkinsci/jenkins/pull/6675), which was introduced in 2.361.

      We do have the priority-sorter-plugin installed, which could be interfering with the Queue, but I verified that the blocked items are all traversed in AbstractProject regardless.

      Technical explanation

      Prerequisites

      • Job has blockBuildWhenDownstreamBuilding or blockBuildWhenUpstreamBuilding enabled
      • Job is blocked without an assigned BlockedItem.causeOfBlockage (null)
        I cannot yet explain how BlockedItem.causeOfBlockage came to be null. I'm still investigating that.
      • However, a null BlockedItem.causeOfBlockage is clearly supported by the code; it has only caused the infinite loop since the modification mentioned above.
        • UPDATE 2022-10-17: This doesn't change the fact that a null causeOfBlockage is supported, but here is where it could originate:
          • Restored from the Queue.xml at startup
          • Instantiated indirectly
          • A plugin
          • Another mechanism?

      The issue comes from the fact that BlockedItem.causeOfBlockage can be null. This has been confirmed with a heap dump.

      Cleaned-up call chain leading to the issue (reconstructed)

      Precondition: there must be a BlockedItem with a null causeOfBlockage

      // Read from bottom to top like a stacktrace
      
      -- Again, and so on
      
      hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage matters]
      hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
      hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
      hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
      hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)

      -- Another recursion of the loop

      hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage matters]
      hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
      hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
      hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
      hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
       
      -- Start of infinite recursive loop
      
      hudson.model.Queue.maintain(Queue.java:1539)
      
      -- Starts here
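
      The cycle above can be reproduced in miniature. The following is only a simplified model, not the Jenkins source: the class QueueRecursionSketch, its nested stand-ins, and the helper maintainOverflows are hypothetical, and the body of getBuildingUpstream is an assumption inferred from the stack trace (it only needs to ask each blocked item for its cause). It shows that a stored cause ends the chain, while a null cause sends the call back through the queue indefinitely.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the call cycle; not the Jenkins source. */
final class QueueRecursionSketch {

    /** Stand-in for Queue.BlockedItem: the stored cause may be null. */
    static final class BlockedItem {
        final Project task;
        final String causeOfBlockage;

        BlockedItem(Project task, String causeOfBlockage) {
            this.task = task;
            this.causeOfBlockage = causeOfBlockage;
        }

        String getCauseOfBlockage(Queue queue) {
            if (causeOfBlockage != null) {
                return causeOfBlockage; // stored cause: the chain stops here
            }
            // Null cause: fall back to recomputing it through the queue.
            return queue.getCauseOfBlockageForItem(this);
        }
    }

    /** Stand-in for a project with blockBuildWhenUpstreamBuilding enabled. */
    static final class Project {
        String getCauseOfBlockage(Queue queue) {
            return getBuildingUpstream(queue) ? "upstream is building" : null;
        }

        // Assumed shape: consults the cause of every blocked item in the queue.
        boolean getBuildingUpstream(Queue queue) {
            for (BlockedItem item : queue.blockedItems) {
                if (item.getCauseOfBlockage(queue) == null) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Stand-in for hudson.model.Queue. */
    static final class Queue {
        final List<BlockedItem> blockedItems = new ArrayList<>();

        String getCauseOfBlockageForItem(BlockedItem item) {
            return getCauseOfBlockageForTask(item.task);
        }

        String getCauseOfBlockageForTask(Project task) {
            return task.getCauseOfBlockage(this);
        }

        void maintain() {
            for (BlockedItem item : blockedItems) {
                getCauseOfBlockageForItem(item);
            }
        }
    }

    /** Runs maintain() on a queue holding one blocked item; true if the stack overflows. */
    static boolean maintainOverflows(String storedCause) {
        Queue queue = new Queue();
        queue.blockedItems.add(new BlockedItem(new Project(), storedCause));
        try {
            queue.maintain();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stored cause -> overflow: " + maintainOverflows("blocked by upstream"));
        System.out.println("null cause   -> overflow: " + maintainOverflows(null));
    }
}
```

      In this model the only difference between the two runs is whether the blocked item carries a stored cause, which matches the role the null causeOfBlockage plays in the chain above.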

      Here's the actual stack trace of the StackOverflowError

       

      {"thread_name":"jenkins.util.Timer [#1]","message":"Timer task hudson.model.Queue$MaintainTask@73873351 failed","timestamp":"2022-10-12 23:26:54.557","level":"SEVERE","mdc":{},"container":"master","logger_name":"hudson.triggers.SafeTimerTask","source_host":"bdbf33cd8b7c","exception_class":"java.lang.StackOverflowError","stacktrace":"java.lang.StackOverflowError
       at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1077)
       at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
       at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
       at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
       at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
       at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
       at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
       at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
       at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
       at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
       at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
       at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
       at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
       at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
       at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
       at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
       at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
       at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
       at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
      
      And it goes on and on, repeating the same five frames, until the stack overflows
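
      To confirm that a trace like this is one cycle repeated rather than a long non-repeating chain, the repeating period of the frames can be computed. This is a minimal sketch: the class StackCycleSketch and its methods are hypothetical helpers, and the sample frames are copied from the trace above (the top frame at line 1077, followed by the 18 frames beneath it).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for spotting a repeating cycle in a list of stack frames. */
final class StackCycleSketch {

    /**
     * Returns the length of the shortest block of frames that repeats back to back
     * from the given start index to the end of the list, or -1 if there is none.
     */
    static int repeatingPeriod(List<String> frames, int start) {
        int n = frames.size() - start;
        for (int period = 1; period <= n / 2; period++) {
            boolean repeats = true;
            for (int i = start + period; i < frames.size(); i++) {
                if (!frames.get(i).equals(frames.get(i - period))) {
                    repeats = false;
                    break;
                }
            }
            if (repeats) {
                return period;
            }
        }
        return -1;
    }

    /** The frames shown in the trace above: one top frame, then the cycle repeated. */
    static List<String> sampleFrames() {
        String[] cycle = {
            "hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)",
            "hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)",
            "hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)",
            "hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)",
            "hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)"
        };
        List<String> frames = new ArrayList<>();
        frames.add("hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1077)");
        for (int i = 0; i < 18; i++) {
            frames.add(cycle[i % cycle.length]);
        }
        return frames;
    }

    public static void main(String[] args) {
        // Skip the top frame (index 0): it is where the overflow was raised.
        System.out.println(repeatingPeriod(sampleFrames(), 1)); // prints 5
    }
}
```

      The top frame is skipped because it sits at a different line (1077) than the recurring getCauseOfBlockage frame (1094), so it is not part of the cycle.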

       

          Louis-Rémi Paquet created issue -

          Rolf Offermanns added a comment - I just saw the same error on one of our instances (Jenkins 2.361.2).
          Louis-Rémi Paquet made changes -
          Description: corrected the typo "stargin" to "starting"; added the prerequisite "Job/Task has to be of a type inheriting AbstractProject. Ex: FreeStyleProject, MavenModuleSet, etc."; reworded the note on the null cause to "However, it's clearly supported as I could see in the code that a null BlockedItem.causeOfBlockage is supported." The remaining text is unchanged.
          Louis-Rémi Paquet made changes -
          Description: appended the closing sentence "Since this could come up at any time, this issue is preventing us from upgrading." The remaining text is unchanged.
          Louis-Rémi Paquet made changes -
          Description Original: h1. {color:#4c9aff}Issue{color}

          {color:#172b4d}After upgrading from 2.340(jdk8 image) to 2.372(jdk11 image), just after starting Jenkins, the Queue maintain gets into an infinite recursive loop and throws a stackoverflow, rendering the Queue unusable (jobs can't run).{color}

          {color:#172b4d}Same scenario occurred twice in prod. Everything was fine during tests but obviously without the same jobs in the Queue.{color}
          h2. {color:#4c9aff}Hypothesis on the cause{color}

          This looks like very much an edge-case not caught by the tests and validations of this change, JENKINS-68780 - [https://github.com/jenkinsci/jenkins/pull/6675], which was introduced in [2.361|https://www.jenkins.io/changelog/]

          We do have the [priority-sorter-plugin|https://github.com/jenkinsci/priority-sorter-plugin] that could be interfering with the Queue, but I verified and the [Blocking Items are all traversed anyway in AbstractProject|https://github.com/jenkinsci/jenkins/blob/jenkins-2.372/core/src/main/java/hudson/model/AbstractProject.java#L1114]
          h1. {color:#4c9aff}Technical explanation{color}
          h3. {color:#4c9aff}Prerequisites{color}
           * Job/Taks has to be of a type inheriting AbstractProject. Ex: FreeStyleProject, MavenModuleSet, etc.
           * Job has _blockBuildWhenDownstreamBuilding_ or _blockBuildWhenUpstreamBuilding_ enabled
           * Job is blocked without an assigned {color:#172b4d}BlockedItem.causeOfBlockage{color}{color:#172b4d} (null).
          I cannot yet explain how the BlockedItem.causeOfBlockage was null. I'm still investigation on that.
          However, it's clearly supported as I could see in the code that a null {color}{color:#172b4d}BlockedItem.causeOfBlockage is supported.{color}

          {color:#172b4d}Issue comes from the fact that BlockedItem.causeOfBlockage can be null. This has been validated with a heap dump
          !image-2022-10-13-18-28-08-221.png|width=498,height=130!{color}
          h3. {color:#4c9aff}Cleaned up Call chain leading to the issue (reconstituted){color}

          {color:#172b4d}There must be a null BlockedItem.causeOfBlockage{color}
          {noformat}
          -- Again, and so on

          hudson.model.Qeueue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbtractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbtractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)

          -- Another recursion of the loop

          hudson.model.Qeueue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbtractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbtractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           
          -- Start of infinite recursive loop

          hudson.model.Queue.maintain(Queue.java:1539)

          -- Starts here{noformat}
          {color:#172b4d}Here's the real stack trace of the stackoverflow{color}

           
          {noformat}
          {"thread_name":"jenkins.util.Timer [#1]","message":"Timer task hudson.model.Queue$MaintainTask@73873351 failed","timestamp":"2022-10-12 23:26:54.557","level":"SEVERE","mdc":{},"container":"master","logger_name":"hudson.triggers.SafeTimerTask","source_host":"bdbf33cd8b7c","exception_class":"java.lang.StackOverflowError","stacktrace":"java.lang.StackOverflowError
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1077)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)

          And it goes on and on and on... until stackoverflow{noformat}
           

          Since this could come up at any time, this issue is preventing us from upgrading.
          New: h1. {color:#4c9aff}Issue{color}

          {color:#172b4d}After upgrading from 2.340 (jdk8 image) to 2.372 (jdk11 image), just after starting Jenkins, Queue.maintain() gets into an infinite recursive loop and throws a StackOverflowError, rendering the Queue unusable (jobs can't run).{color}

          {color:#172b4d}Same scenario occurred twice in prod. Everything was fine during tests but obviously without the same jobs in the Queue.{color}
          h2. {color:#4c9aff}Hypothesis on the cause{color}

          This looks very much like an edge case not caught by the tests and validations of this change, JENKINS-68780 - [https://github.com/jenkinsci/jenkins/pull/6675], which was introduced in [2.361|https://www.jenkins.io/changelog/]

          We do have the [priority-sorter-plugin|https://github.com/jenkinsci/priority-sorter-plugin] that could be interfering with the Queue, but I verified and the [Blocking Items are all traversed anyway in AbstractProject|https://github.com/jenkinsci/jenkins/blob/jenkins-2.372/core/src/main/java/hudson/model/AbstractProject.java#L1114]
          h1. {color:#4c9aff}Technical explanation{color}
          h3. {color:#4c9aff}Prerequisites{color}
           * Job has _blockBuildWhenDownstreamBuilding_ or _blockBuildWhenUpstreamBuilding_ enabled
           * Job is blocked without an assigned {color:#172b4d}BlockedItem.causeOfBlockage{color}{color:#172b4d} (null)
          I cannot yet explain how the BlockedItem.causeOfBlockage was null. I'm still investigating that.
          However, the code clearly supports it.{color}{color:#172b4d} A null BlockedItem.causeOfBlockage is supported in the code but has caused the infinite loop since the mentioned modification.{color}

          {color:#172b4d}Issue comes from the fact that BlockedItem.causeOfBlockage can be null. This has been validated with a heap dump
          !image-2022-10-13-18-28-08-221.png|width=498,height=130!{color}
          h3. {color:#4c9aff}Cleaned up Call chain leading to the issue (reconstituted){color}

          {color:#172b4d}There must be a null BlockedItem.causeOfBlockage{color}
          {noformat}
          // Read from bottom to top like a stacktrace

          -- Again, and so on

          hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)

          -- Another recursion of the loop

          hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           
          -- Start of infinite recursive loop

          hudson.model.Queue.maintain(Queue.java:1539)

          -- Starts here{noformat}
          {color:#172b4d}Here's the real stack trace of the stackoverflow{color}

           
          {noformat}
          {"thread_name":"jenkins.util.Timer [#1]","message":"Timer task hudson.model.Queue$MaintainTask@73873351 failed","timestamp":"2022-10-12 23:26:54.557","level":"SEVERE","mdc":{},"container":"master","logger_name":"hudson.triggers.SafeTimerTask","source_host":"bdbf33cd8b7c","exception_class":"java.lang.StackOverflowError","stacktrace":"java.lang.StackOverflowError
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1077)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)

          And it goes on and on and on... until stackoverflow{noformat}
           

          Louis-Rémi Paquet added a comment - - edited

          simon_sohrt, teilo, basil, raihaan

          As you were involved in this PR, I'd appreciate your input on this.

          The functionality introduced by PR-6675 relies on item.getCauseOfBlockage() never being null, which isn't the case,
          so this cannot be trivially fixed while keeping both behaviors (null causeOfBlockage and PR-6675) as they are.

          Thanks!

          Louis-Rémi Paquet made changes -
          New: h1. {color:#4c9aff}Issue{color}

          {color:#172b4d}After upgrading from 2.340 (jdk8 image) to 2.372 (jdk11 image), just after starting Jenkins, Queue.maintain() gets into an infinite recursive loop and throws a StackOverflowError, rendering the Queue unusable (jobs can't run).{color}

          {color:#172b4d}Same scenario occurred twice in prod. Everything was fine during tests but obviously without the same jobs in the Queue.{color}
          h2. {color:#4c9aff}Hypothesis on the cause{color}

          This looks very much like an edge case not caught by the tests and validations of this change, JENKINS-68780 - [https://github.com/jenkinsci/jenkins/pull/6675], which was introduced in [2.361|https://www.jenkins.io/changelog/]

          We do have the [priority-sorter-plugin|https://github.com/jenkinsci/priority-sorter-plugin] that could be interfering with the Queue, but I verified and the [Blocking Items are all traversed anyway in AbstractProject|https://github.com/jenkinsci/jenkins/blob/jenkins-2.372/core/src/main/java/hudson/model/AbstractProject.java#L1114]
          h1. {color:#4c9aff}Technical explanation{color}
          h3. {color:#4c9aff}Prerequisites{color}
           * Job has _blockBuildWhenDownstreamBuilding_ or _blockBuildWhenUpstreamBuilding_ enabled
           * Job is blocked without an assigned {color:#172b4d}BlockedItem.causeOfBlockage{color}{color:#172b4d} (null)
          I cannot yet explain how the BlockedItem.causeOfBlockage was null. I'm still investigating that.{color}
           * {color:#172b4d}However, the code clearly supports a *null BlockedItem.causeOfBlockage*; it has only caused the infinite loop since the mentioned modification.{color}
           ** {color:#172b4d}UPDATE 2022-10-17: This doesn't change the fact that a null *causeOfBlockage* is supported, but here is where it could originate:{color}
           *** {color:#172b4d}Restored from the Queue.xml at startup{color}
           *** {color:#172b4d}Instantiated indirectly{color}
           *** {color:#172b4d}A plugin {color}
           *** {color:#172b4d}Another mechanism?{color}

          {color:#172b4d}Issue comes from the fact that BlockedItem.causeOfBlockage can be null. This has been validated with a heap dump
          !image-2022-10-13-18-28-08-221.png|width=498,height=130!{color}
          h3. {color:#4c9aff}Cleaned up Call chain leading to the issue (reconstituted){color}

          {color:#172b4d}There must be a null BlockedItem.causeOfBlockage{color}
          {noformat}
          // Read from bottom to top like a stacktrace

          -- Again, and so on

          hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)

          -- Another recursion of the loop

          hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630) [This is where the null causeOfBlockage is important]
          hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
          hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
          hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
          hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           
          -- Start of infinite recursive loop

          hudson.model.Queue.maintain(Queue.java:1539)

          -- Starts here{noformat}
          {color:#172b4d}Here's the real stack trace of the stackoverflow{color}

           
          {noformat}
          {"thread_name":"jenkins.util.Timer [#1]","message":"Timer task hudson.model.Queue$MaintainTask@73873351 failed","timestamp":"2022-10-12 23:26:54.557","level":"SEVERE","mdc":{},"container":"master","logger_name":"hudson.triggers.SafeTimerTask","source_host":"bdbf33cd8b7c","exception_class":"java.lang.StackOverflowError","stacktrace":"java.lang.StackOverflowError
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1077)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)
           at hudson.model.AbstractProject.getBuildingUpstream(AbstractProject.java:1143)
           at hudson.model.AbstractProject.getCauseOfBlockage(AbstractProject.java:1094)
           at hudson.model.Queue.getCauseOfBlockageForTask(Queue.java:1240)
           at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
           at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2630)

          And it goes on and on and on... until stackoverflow{noformat}
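
The repeating frames above can be modelled in a few lines. The following is only a simplified sketch, not Jenkins code: the class and method names are hypothetical stand-ins, and it assumes that the item-level getter falls back to recomputing the cause whenever no cause is cached, while the recomputation in turn inspects the blocked items in the queue.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the cycle; every name below is illustrative, not a Jenkins class.
class BlockageCycleSketch {

    // Stand-in for a blocked queue item with an optional cached cause.
    static final class BlockedItem {
        final String cachedCause; // null models an unassigned BlockedItem.causeOfBlockage

        BlockedItem(String cachedCause) {
            this.cachedCause = cachedCause;
        }
    }

    private final List<BlockedItem> blockedItems = new ArrayList<>();

    // Models the item-level getter: return the cached cause, else recompute it.
    String causeOf(BlockedItem item) {
        if (item.cachedCause != null) {
            return item.cachedCause;
        }
        return recomputeCause();
    }

    // Models the project-level check that inspects every blocked item in the queue.
    String recomputeCause() {
        for (BlockedItem other : blockedItems) {
            String cause = causeOf(other); // re-enters causeOf for items without a cached cause
            if (cause != null) {
                return cause;
            }
        }
        return null;
    }

    // Returns true if asking for the cause of a single queued item overflows the stack.
    static boolean overflows(String cachedCause) {
        BlockageCycleSketch queue = new BlockageCycleSketch();
        BlockedItem item = new BlockedItem(cachedCause);
        queue.blockedItems.add(item);
        try {
            queue.causeOf(item);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null cause overflows: " + overflows(null));
        System.out.println("cached cause overflows: " + overflows("upstream build in progress"));
    }
}
```

In this model an item with a cached cause returns immediately, while a single item without one is enough to make the two methods call each other until the stack is exhausted, which matches the shape of the trace.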
           

          Simon Sohrt added a comment -

          l_r: Thank you for the detailed bug report.

          One possible solution would be to add the following method to Queue.BlockedItem:

          public boolean isCauseOfBlockageNull() {
              return causeOfBlockage == null;
          }
          

          Now we can react better to a possible null-value in AbstractProject.getBuildingDownstream().

          If a downstream project has a CauseOfBlockage that is null, we only have two options:

          1. The downstream project blocks the currently considered project. This may lead to deadlocks in the Queue.
          2. The downstream project does not block the currently considered project. This may breach the contract of the "Block build when downstream project is building"-flag

          I believe option 2. is by far the better option. Considering that the current version of AbstractProject.getBuildingDownstream() is much stricter than the old version (before PR-6675 was merged), I think it is okay to relax the strictness somewhat.

          Thus I would modify AbstractProject.getBuildingDownstream() as follows:

          if (item.isCauseOfBlockageNull() ||
              item.getCauseOfBlockage() instanceof AbstractProject.BecauseOfUpstreamBuildInProgress ||
              item.getCauseOfBlockage() instanceof AbstractProject.BecauseOfDownstreamBuildInProgress) {
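
To make option 2 concrete, here is a minimal, self-contained sketch of the same condition. It is not the actual Jenkins change: the types are hypothetical stand-ins for Queue.BlockedItem and the two AbstractProject causes, and it assumes the branch guarded by the condition means "this item does not block the current project".

```java
// Hypothetical stand-ins for the real Jenkins types; only the shape of the check matters.
class NullCauseGuardSketch {

    interface Cause {}

    static final class UpstreamInProgress implements Cause {}

    static final class DownstreamInProgress implements Cause {}

    static final class OtherCause implements Cause {}

    static final class BlockedItem {
        private final Cause causeOfBlockage;

        BlockedItem(Cause causeOfBlockage) {
            this.causeOfBlockage = causeOfBlockage;
        }

        boolean isCauseOfBlockageNull() {
            return causeOfBlockage == null;
        }

        // Unlike the getter seen in the stack trace, this stand-in never recomputes anything.
        Cause getCauseOfBlockage() {
            return causeOfBlockage;
        }
    }

    // Option 2: an item with no known cause, or one that is only waiting on an
    // upstream/downstream build, is assumed not to block the project being considered.
    static boolean blocksCurrentProject(BlockedItem item) {
        if (item.isCauseOfBlockageNull()
                || item.getCauseOfBlockage() instanceof UpstreamInProgress
                || item.getCauseOfBlockage() instanceof DownstreamInProgress) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(blocksCurrentProject(new BlockedItem(null)));
        System.out.println(blocksCurrentProject(new BlockedItem(new OtherCause())));
    }
}
```

Checking for null first matters: because of short-circuit evaluation, the getter is never called on an item whose cause is missing.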
          


          James Nord added a comment - - edited

          Thanks for the detailed report.

          It looks like this occurs with the Maven project type.

          roffermanns do you also use this project type?

          Also, do both of you use the "build modules in parallel" feature of Jenkins, where it builds unrelated modules simultaneously? (This is not the same as the Maven -T option.)


          Simon Sohrt added a comment -

          I just submitted a PR with a test case for this scenario. It can be found here: https://github.com/jenkinsci/jenkins/pull/7273

          I was able to reproduce the issue by saving and then reloading a queue with blockedItems in it. 
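
As an illustration of how a save and reload could leave the cause empty, the sketch below assumes, purely hypothetically, that the cached cause is not written out with the item. This thread does not confirm that this is the mechanism, and the sketch uses plain Java serialization only as an analogy for the queue file; all names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Analogy only: shows that a field excluded from persistence comes back as null.
class QueueReloadSketch {

    static final class BlockedItem implements Serializable {
        private static final long serialVersionUID = 1L;

        final String taskName;
        transient String causeOfBlockage; // assumption: the cause is not persisted

        BlockedItem(String taskName, String causeOfBlockage) {
            this.taskName = taskName;
            this.causeOfBlockage = causeOfBlockage;
        }
    }

    // Writes the item to a byte array and reads it back, like a save followed by a reload.
    static BlockedItem saveAndReload(BlockedItem item) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(item);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (BlockedItem) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        BlockedItem reloaded = saveAndReload(new BlockedItem("job-a", "upstream build in progress"));
        System.out.println(reloaded.taskName + " cause after reload: " + reloaded.causeOfBlockage);
    }
}
```

Under that assumption, every blocked item restored at startup would begin with a null cause, which is the precondition described in the report.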


            Unassigned Unassigned
            l_r Louis-Rémi Paquet