Flyweight tasks only use one-off executor when they can be scheduled immediately

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • core
    • None
    • Linux/ubuntu14.04

      I have a freshly downloaded Jenkins without plugins running, with
      one dumb slave (slave1) configured and connected via ssh+user/password.
      I set up a matrix project (a minimal example is attached) to build on
      slave1 and tied the parent to that slave (in the advanced project settings).
      When
      1) the slave has only 1 executor, and
      2) the slave is configured to come online when needed ("take this slave on-line when in demand and off-line when idle"),
      then a deadlock happens if the slave is offline when the project is triggered. The parent build wakes up the slave, but then gets stuck, saying

      Configuration tst » slave1 is still in the queue: Waiting for next available executor on slave1

      Note that both 1) and 2) are necessary to reproduce the bug. If the slave
      happens to be online at build start, the project builds fine.

      This may be JENKINS-22502, but the minimal setup should make it
      easy to reproduce the bug.

          [JENKINS-24519] Flyweight tasks only use one-off executor when they can be scheduled immediately

          felix schwitzer created issue -

          Daniel Beck added a comment -

          Confirmed on Jenkins 1.577 out of the box. There may be an element of randomness to it, as I had to try twice. The first try (waiting one minute for the slave to come online) didn't work, but the second try (impatiently launching the slave manually via the web UI) worked.

          Daniel Beck added a comment -

          The source of the problem seems to be that flyweight tasks (those running on one-off executors) effectively lose their "flyweight" property when they cannot be scheduled the moment they enter the queue:

          https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1097
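To make the suspected mechanism concrete, here is a minimal toy model of the scenario in the report. It is a sketch only, not Jenkins code: the class `FlyweightQueueSketch`, the method `childCanRun`, and its parameters are hypothetical names invented for illustration. It also assumes that a flyweight parent which loses its one-off executor ends up holding a regular executor, which is an interpretation consistent with the "Waiting for next available executor" message rather than something the linked source line is quoted as doing.

```java
/** Toy model of the reported deadlock. Hypothetical names; not Jenkins code. */
final class FlyweightQueueSketch {

    /** A node with a fixed number of regular executors that may be offline. */
    static final class Node {
        final int executors;
        boolean online;
        int busy; // number of regular executors currently in use

        Node(int executors, boolean online) {
            this.executors = executors;
            this.online = online;
        }

        boolean hasIdleExecutor() {
            return online && busy < executors;
        }
    }

    /**
     * Returns true if the matrix configuration (child) can get an executor
     * while the flyweight parent build is still running.
     *
     * @param nodeOnlineAtSubmit whether the node is online when the parent is queued
     * @param keepFlyweightWhileQueued assumed fixed behaviour: the parent still gets
     *        a one-off executor even if it had to wait in the queue
     * @param executors number of regular executors on the node
     */
    static boolean childCanRun(boolean nodeOnlineAtSubmit,
                               boolean keepFlyweightWhileQueued,
                               int executors) {
        Node node = new Node(executors, nodeOnlineAtSubmit);

        // Step 1: the parent enters the queue. Only if it can be scheduled
        // right now does it run on a one-off executor.
        boolean parentOnOneOff = node.online;

        // Step 2: demand in the queue brings the node online.
        node.online = true;

        // Step 3: the waiting parent is scheduled. Without the assumed fix it
        // is treated like an ordinary task and occupies a regular executor.
        if (!parentOnOneOff) {
            if (keepFlyweightWhileQueued) {
                parentOnOneOff = true;
            } else {
                node.busy++;
            }
        }

        // Step 4: the parent triggers the child and waits for it; the child
        // needs a regular executor on the same node.
        return node.hasIdleExecutor();
    }
}
```

In this model, `childCanRun(false, false, 1)` is the case from the report: the parent holds the only executor and the child can never start. Starting with the node online, adding a second executor, or letting the parent keep its flyweight status while queued each lets the child run, which matches the observation that both conditions 1) and 2) are needed to reproduce the bug.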

          Daniel Beck made changes -
          Component/s Original: matrix [ 15501 ]
          Summary Original: matrix build deadlocks slave New: Flyweight tasks only use one-off executor when they can be scheduled immediately

          Daniel Beck added a comment -

          FWIW I think I've seen this before with Build Flow as well.

          Daniel Beck made changes -
          Link New: This issue is duplicated by JENKINS-22072 [ JENKINS-22072 ]
          Daniel Beck made changes -
          Link New: This issue is duplicated by JENKINS-24748 [ JENKINS-24748 ]

          Jesse Glick added a comment -

          What is the code you are referring to? Line numbers in blob URLs using a branch name rather than a hash are meaningless if the file is significantly changed. Do you mean

          https://github.com/jenkinsci/jenkins/blob/7156e4470f05a9c41015be78d827dd1be0c02fca/core/src/main/java/hudson/model/Queue.java#L1097

          ?

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-10944 [ JENKINS-10944 ]

          Daniel Beck added a comment -

          Jesse: Sorry about that. That commit was the last change to the file before Aug 31, when I posted the comment, so you're absolutely correct.

            Assignee: Jesse Glick (jglick)
            Reporter: felix schwitzer (felixschwitzer)
            Votes: 1
            Watchers: 5