
FlyWeightTasks tied to a label will not cause node provisioning and will be blocked forever.

      When a flyweight task is restricted to a specific label (e.g. a matrix project with "Restrict where this project can be run" set) and no nodes with that label are available when it enters the queue, it immediately moves to blocked.

      Because it is blocked, the NodeProvisioner will not attempt to create any slaves, so the project sits in the queue forever (or until some other project causes a slave with the correct label to be allocated).

      Seems to be a regression introduced by JENKINS-24519

          [JENKINS-30084] FlyWeightTasks tied to a label will not cause node provisioning and will be blocked forever.

          James Nord created issue -

          James Nord added a comment -

          looks like it was caused by JENKINS-24519

          James Nord made changes -
          Link New: This issue is related to JENKINS-24519 [ JENKINS-24519 ]

          James Nord added a comment -

          https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1439-L1446

          https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1539-L1549

          The queue appears to just shunt flyweight tasks to blocked, and the NodeProvisioner will not create new nodes for them unless a slave with the right label is already online.

          Aug 20, 2015 6:08:00 AM hudson.model.Queue scheduleInternal	FINE: hudson.matrix.MatrixProject@a3c7785[myMatrixProject] added to queue	
          Aug 20, 2015 6:08:00 AM hudson.model.Queue maintain	FINE: Queue maintenance started hudson.model.Queue@111ab215	
          Aug 20, 2015 6:08:00 AM hudson.model.Queue$BlockedItem enter	FINE: hudson.model.Queue$BlockedItem:hudson.matrix.MatrixProject@a3c7785[myMatrixProject]:71 is blocked	
          Aug 20, 2015 6:08:00 AM hudson.model.Queue maintain	FINE: Queue maintenance started hudson.model.Queue@111ab215	
          Aug 20, 2015 6:08:00 AM com.cloudbees.jenkins.QuickProvision$1 run	FINE: standard.nodeProvisioner.suggestReviewNow() -> queue.length()=0	
          Aug 20, 2015 6:08:03 AM hudson.model.Queue maintain	FINE: Queue maintenance started hudson.model.Queue@111ab215	
          Aug 20, 2015 6:08:04 AM jenkins.metrics.api.Metrics$HealthChecker$3 run	FINE: Started jenkins.metrics.api.Metrics$HealthChecker	
          Aug 20, 2015 6:08:04 AM jenkins.metrics.api.Metrics$HealthChecker$3 run	FINE: Finished jenkins.metrics.api.Metrics$HealthChecker. 2 ms	
          Aug 20, 2015 6:08:08 AM hudson.model.Queue maintain	FINE: Queue maintenance started hudson.model.Queue@111ab215	
          Aug 20, 2015 6:08:10 AM hudson.slaves.NodeProvisioner$2 run	FINE: Queue length 0 is less than the available capacity 0. No provisioning strategy required	
          Aug 20, 2015 6:08:10 AM hudson.slaves.NodeProvisioner$2 run	FINE: Queue length 0 is less than the available capacity 0. No provisioning strategy required	
          Aug 20, 2015 6:08:10 AM hudson.slaves.NodeProvisioner$2 run	FINE: Queue length 0 is less than the available capacity 0. No provisioning strategy required	
          ...
             a node is provisioned sometime later for a different project...
          ...
          Aug 20, 2015 8:50:54 AM hudson.model.Queue$BlockedItem leave    FINE: hudson.model.Queue$BlockedItem:hudson.matrix.MatrixProject@a3c7785[myMatrixProject]:71 no longer blocked 
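
The "Queue length 0" lines in the log can be illustrated with a toy model. This is only a sketch and not Jenkins code: the class, the `State` enum, the `classify` and `demand` methods, and the `my-label` label are all hypothetical names, and the rule that only buildable items count as provisioning demand is an assumption taken from the behaviour described in this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Jenkins classes; all names are hypothetical.
public class FlyweightQueueSketch {

    enum State { BUILDABLE, BLOCKED }

    static final class Item {
        final String label;
        final boolean flyweight;
        State state;

        Item(String label, boolean flyweight) {
            this.label = label;
            this.flyweight = flyweight;
        }
    }

    // Mimics the reported behaviour: a flyweight item whose label has no
    // online node goes straight to BLOCKED. Treating every other item as
    // BUILDABLE is an assumption of this sketch.
    static State classify(Item item, List<String> onlineLabels) {
        if (item.flyweight && !onlineLabels.contains(item.label)) {
            return State.BLOCKED;
        }
        return State.BUILDABLE;
    }

    // The provisioner in this model only counts BUILDABLE items as demand.
    static int demand(List<Item> queue, String label) {
        int count = 0;
        for (Item item : queue) {
            if (item.state == State.BUILDABLE && item.label.equals(label)) {
                count++;
            }
        }
        return count;
    }

    // Queues one item for a label with no online nodes and returns the demand seen.
    static int demandFor(boolean flyweight) {
        List<String> onlineLabels = new ArrayList<>(); // no nodes online
        List<Item> queue = new ArrayList<>();
        Item item = new Item("my-label", flyweight);
        item.state = classify(item, onlineLabels);
        queue.add(item);
        return demand(queue, "my-label");
    }

    public static void main(String[] args) {
        System.out.println("flyweight demand: " + demandFor(true));
        System.out.println("regular demand:   " + demandFor(false));
    }
}
```

In this model the flyweight item contributes no demand, so nothing would ever be provisioned for it, while the same item queued as a regular task is counted once.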
          


          Jesse Glick added a comment -

          Does seem like a bug in general, but: what is the actual use case for setting a label on the matrix parent project? Does it not suffice to allow it to run on master and use node/label axes for the actual build?


          Tom Sackett added a comment -

          I think I have a use case, but it's related to jobs based on the Build Flow plugin (which also appears to be affected by this bug), rather than matrix jobs. A build flow job that polls Perforce has to be assigned to a machine that can run Perforce commands. Large organizations often put additional security on Perforce that includes limiting which machines can access Perforce servers using automation accounts, so we can't always configure the master to access Perforce.

          James Nord made changes -
          Assignee New: valentina armenise [ varmenise ]
          James Nord made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

          Jesse Glick added a comment -

          Need a new test case in QueueTest reproducing this.

          James Nord made changes -
          Remote Link New: This issue links to "PR 1815 (Web Link)" [ 13136 ]

            Assignee: valentina armenise (varmenise)
            Reporter: James Nord (teilo)