CauseOfBlockage from QueueTaskDispatcher.canTake discarded

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core

      If you have a QueueTaskDispatcher which returns a CauseOfBlockage from canRun, that becomes BlockedItem.getCauseOfBlockage, which is displayed in the queue widget.

      But if it returns a CauseOfBlockage from canTake (AFAICT the same for Node.canTake), JobOffer.canTake sees that it is non-null, throws out the actual object with all of its diagnostics, and you wind up with a BuildableItem with CauseOfBlockage.BecauseNodeIsBusy which tells you nothing and may be totally misleading.

      By asking an implementation to return a @CheckForNull CauseOfBlockage rather than a simple boolean, the implication is that a non-null return value will be displayed to the user. Currently this is not the case.
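The difference between discarding and retaining the returned cause can be shown with a small standalone model. This is only a sketch: `Cause`, `Dispatcher`, `whyDiscarding` and `whyRetaining` are hypothetical names invented for illustration, not Jenkins classes or methods, and the message strings are placeholders.

```java
import java.util.List;

final class DiscardedCauseDemo {

    /** Simplified stand-in for a cause of blockage; NOT the Jenkins class. */
    static final class Cause {
        final String message;
        Cause(String message) { this.message = message; }
    }

    /** Hypothetical veto hook: returns null to allow the task, or a Cause explaining the refusal. */
    interface Dispatcher {
        Cause canTake(String node, String task);
    }

    /** Generic cause; the message text is only a placeholder for this sketch. */
    static final Cause NODE_BUSY = new Cause("Waiting for next available executor");

    /** Models the reported behavior: any non-null veto collapses into the generic cause. */
    static Cause whyDiscarding(List<Dispatcher> dispatchers, String node, String task) {
        for (Dispatcher d : dispatchers) {
            if (d.canTake(node, task) != null) {
                return NODE_BUSY; // the specific cause and its diagnostics are lost here
            }
        }
        return null; // nobody objected
    }

    /** Alternative: hand the first specific cause back to the caller unchanged. */
    static Cause whyRetaining(List<Dispatcher> dispatchers, String node, String task) {
        for (Dispatcher d : dispatchers) {
            Cause cause = d.canTake(node, task);
            if (cause != null) {
                return cause;
            }
        }
        return null; // nobody objected
    }

    public static void main(String[] args) {
        // A dispatcher that always refuses, with a reason worth showing to the user.
        Dispatcher permissionCheck =
                (node, task) -> new Cause("anonymous may not run " + task + " on " + node);
        List<Dispatcher> dispatchers = List.of(permissionCheck);

        System.out.println("discarding: " + whyDiscarding(dispatchers, "agent-1", "my-job").message);
        System.out.println("retaining:  " + whyRetaining(dispatchers, "agent-1", "my-job").message);
    }
}
```

In this model the first variant always reports the generic message no matter what the dispatcher said, which is the symptom described above; the second passes the dispatcher's own explanation through.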

      To add insult to injury, Support Core does not report the result of canTake.

          Jesse Glick added a comment -

          Not clear that support-core can do anything, since canTake requires a specific Node.


          Jesse Glick added a comment -

          Unfortunately it is not obvious how the relevant CauseOfBlockage can be identified: there can be numerous JobOffers which are considered, yet we would expect most of them to refuse canTake, for example because of Node.LabelMissing. The “buildable” item stays in queue when all of the offers are rejected, but how do we identify the one which we expected to be accepted?

          Can certainly improve detail-level logging to allow the issue to be tracked down, but it is less clear that BuildableItem.getWhy can be improved to display the ultimate problem in the UI (or in support bundles without a custom logger).
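One conceivable way to report something useful when every offer is rejected is to gather the distinct rejection reasons and present them together. The sketch below assumes that approach purely for illustration: `OfferResult` and `summarize` are hypothetical names, nothing here is Jenkins code, and it does not answer the question of which rejection was the "expected" one.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RejectionSummarySketch {

    /** Outcome of one hypothetical offer; a null reason means the offer accepted the task. */
    static final class OfferResult {
        final String node;
        final String reason;
        OfferResult(String node, String reason) {
            this.node = node;
            this.reason = reason;
        }
    }

    /**
     * Returns null if any offer accepted the task; otherwise the distinct
     * rejection reasons in first-seen order (empty if there were no offers).
     */
    static List<String> summarize(List<OfferResult> results) {
        Set<String> reasons = new LinkedHashSet<>(); // keeps insertion order, drops repeats
        for (OfferResult r : results) {
            if (r.reason == null) {
                return null; // the task can run somewhere, so nothing to explain
            }
            reasons.add(r.reason);
        }
        return new ArrayList<>(reasons);
    }

    public static void main(String[] args) {
        List<OfferResult> results = List.of(
                new OfferResult("agent-1", "label missing"),
                new OfferResult("agent-2", "label missing"),
                new OfferResult("agent-3", "permission denied"));
        // Repeated reasons are reported once, so an unusual one is easier to spot.
        System.out.println(summarize(results));
    }
}
```

Collapsing repeats keeps the common, expected refusals from drowning out a rarer one, though the reader still has to judge which reason is the real problem.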


          Jesse Glick added a comment -

          For those running current core builds who wish to diagnose such issues, try running in /script:

          for (i in Jenkins.instance.queue.buildableItems) {
            println "considering ${i}"
            for (c in Jenkins.instance.computers) {
              println "found computer ${c}"
              EXEC: for (e in c.executors) {
                if (e.interrupted || !e.parking) continue
                println "with executor ${e}"
                def o = new Queue.JobOffer(Jenkins.instance.queue, e, null)
                if (!o.canTake(i)) {
                  println "${o} refused ${i}"
                  def node = o.node
                  if (node == null) {
                    println "no node associated with ${c}"
                    continue
                  }
                  def cob = node.canTake(i)
                  if (cob != null) {
                    println "because of ${cob}"
                    continue
                  }
                  for (d in hudson.model.queue.QueueTaskDispatcher.all()) {
                    cob = d.canTake(node, i)
                    if (cob != null) {
                      println "because of ${cob} from ${d}"
                      continue EXEC
                    }
                  }
                  if (!o.available) {
                    println "${o} not available"
                    if (o.workUnit != null) println "has a workUnit ${o.workUnit}"
                    if (c.offline) println "${c} is offline"
                    if (!c.acceptingTasks) println "${c} is not accepting tasks"
                  }
                }
              }
            }
          }
          

          In one reported case, the root issue was that the Authorize Project plugin was configured, so Node.canTake was returning “anonymous doesn’t have a permission to run on” [sic]; yet the build queue (and support bundle) displayed only “Waiting for next available executor”.


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          core/src/main/java/hudson/model/Node.java
          core/src/main/java/hudson/model/Queue.java
          core/src/main/java/hudson/model/queue/CauseOfBlockage.java
          core/src/main/java/jenkins/model/queue/CompositeCauseOfBlockage.java
          core/src/main/resources/hudson/model/Messages.properties
          core/src/main/resources/jenkins/model/queue/CompositeCauseOfBlockage/summary.jelly
          test/src/test/java/hudson/model/queue/QueueTaskDispatcherTest.java
          test/src/test/java/hudson/slaves/NodeCanTakeTaskTest.java
          http://jenkins-ci.org/commit/jenkins/8d23041d4b785947dee1bc02f54a41d86b59bdda
          Log:
          JENKINS-38514 Retain CauseOfBlockage from JobOffer (#2651)

          • Converted to JenkinsRule.
          • Improved messages from Node.canTake.
          • [FIXED JENKINS-38514] BuildableItem needs to retain information from JobOffer about why it is neither blocked nor building.
          • Converted to JenkinsRule.
          • Found an existing usage of BecauseNodeIsNotAcceptingTasks.
          • Original JENKINS-6598 test was checking behavior we want amended by JENKINS-38514.
          • Ensure that a BuildableItem which is simply waiting for a free executor reports that as its CauseOfBlockage.
          • Review comments from @oleg-nenashev.

            Jesse Glick (jglick)
            Votes: 1
            Watchers: 4
