  Jenkins / JENKINS-28936

Gearman plugin should not decide on which node a build should be executed

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Blocker
    • Component/s: gearman-plugin
    • Labels: None

      Today when Gearman schedules a build, Gearman decides on which node the build should be executed instead of relying on Jenkins to decide.
      The Jenkins node balancer does get executed, but the node has already been pre-selected by Gearman.

      This prevents the user from using the default Jenkins Load Balancer.
      It also breaks Jenkins plugins that affect scheduling, since Gearman has already decided on the node. I have verified that the following plugins do not work properly when Gearman triggers builds:
      Scoring Load Balancer Plugin
      Least Load Plugin
      Throttle Concurrent Builds Plugin

      Example logs when using Scoring Load Balancer Plugin

      Scoring for ... JobTriggeredByJenkinsOnGenericLabel:
      generic-JENKINS06: 62
      generic-JENKINS12: 62
      generic-JENKINS15: 27
      generic-JENKINS13: 5
      generic-JENKINS10: 5
      generic-JENKINS09: 5
      generic-JENKINS14: 5
      generic-JENKINS07: -50
      generic-JENKINS16: -83

      Scoring for ... JobTriggeredByGearmanOnGenericLabel:
      generic-JENKINS09: 5

      NOTE: These logs show that the Scoring Load Balancer gets executed, but instead of evaluating all the nodes in the label group it can only evaluate the single node that Gearman has chosen.
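      The effect in the note can be illustrated with a plain-Java model (this is a sketch, not the actual Scoring Load Balancer API; the node names and scores are taken from the log excerpt, and `pickBest` is a hypothetical helper): a scorer can only pick the best node among the candidates it is offered, so when the candidate set has been narrowed to one pre-selected node, the scores never influence the outcome.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ScoringDemo {
    // Hypothetical scores, loosely based on the log excerpt above.
    static final Map<String, Integer> SCORES = new LinkedHashMap<>();
    static {
        SCORES.put("generic-JENKINS06", 62);
        SCORES.put("generic-JENKINS15", 27);
        SCORES.put("generic-JENKINS09", 5);
        SCORES.put("generic-JENKINS07", -50);
    }

    // Pick the highest-scoring node among the candidates the balancer is offered.
    static String pickBest(List<String> candidates) {
        return candidates.stream()
                .max(Comparator.comparingInt(SCORES::get))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Jenkins-triggered build: the whole label group is offered, best node wins.
        System.out.println(pickBest(new ArrayList<>(SCORES.keySet()))); // generic-JENKINS06
        // Gearman-triggered build: only the pre-selected node is offered,
        // so scoring is moot.
        System.out.println(pickBest(List.of("generic-JENKINS09")));     // generic-JENKINS09
    }
}
```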


          Antoine Musso added a comment -

          I am facing this issue as well with the Throttle Concurrent Builds Plugin. My use case can be summarized as:

          • Permanent slaves with 2 executors each: Slave 1 and Slave 2.
          • A long job consuming a lot of disk space.

          The job is allowed to run concurrently, but since it almost starves a slave, I am using the Throttle Concurrent Builds Plugin to allow only one instance of the job per node.

          I have no idea how the Zuul Gearman server schedules the jobs; it seems to use round robin over all available workers, with the Gearman Plugin reassigning the worker to a Jenkins executor slot on the fly.

          Anyway I often end up with:

          • Slave 1 running the job
          • Slave 2 idling
          • The Jenkins queue showing the job waiting for next available executor on Slave 1.

          Looking at https://github.com/jenkinsci/throttle-concurrent-builds-plugin , it implements the same extension point / method, QueueTaskDispatcher.canTake(), which states:

          Vetos are additive. When multiple QueueTaskDispatchers are in the system, the task won't run on the given node if any one of them returns a non-null value. (This relationship is also the same with built-in check logic.)

          So it is blocked properly. But as noted by Christian, there is only one node to choose from anyway.
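          The "vetoes are additive" rule quoted above can be sketched in plain Java. This only models the semantics; the real QueueTaskDispatcher.canTake(Node, BuildableItem) returns a CauseOfBlockage, which is represented here by a plain String, and the dispatcher implementations are hypothetical.

```java
import java.util.List;
import java.util.function.BiFunction;

public class VetoDemo {
    // A dispatcher returns a non-null reason to veto the node, or null to allow it.
    static String canTake(List<BiFunction<String, String, String>> dispatchers,
                          String node, String task) {
        for (BiFunction<String, String, String> dispatcher : dispatchers) {
            String veto = dispatcher.apply(node, task);
            if (veto != null) {
                return veto; // vetoes are additive: any single veto blocks the node
            }
        }
        return null; // nobody objected: the task may run on this node
    }

    // Throttle-like dispatcher: one instance of the job per node, slave1 is busy.
    static final BiFunction<String, String, String> THROTTLE =
            (node, task) -> node.equals("slave1") ? "already running on " + node : null;
    // A dispatcher that never vetoes.
    static final BiFunction<String, String, String> PERMISSIVE = (node, task) -> null;

    public static void main(String[] args) {
        List<BiFunction<String, String, String>> dispatchers = List.of(THROTTLE, PERMISSIVE);
        System.out.println(canTake(dispatchers, "slave1", "long-job")); // already running on slave1
        System.out.println(canTake(dispatchers, "slave2", "long-job")); // null
    }
}
```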

          It seems GearmanProxy.canTake() iterates over all the node worker threads and simply relies on NodeAvailabilityMonitor.canTake(), which appears to have its own locking that does not respect the QueueTaskDispatcher.canTake() vetoes contributed by other plugins.

          Could GearmanProxy.canTake() ask the QueueTaskDispatcher instead?
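          The suggestion could look something like the following plain-Java sketch (not the actual GearmanProxy code; `selectNode` and the String-based dispatchers are hypothetical stand-ins): offer the task to each node in turn and take the first one no dispatcher vetoes, instead of committing to a single pre-selected worker up front.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

public class ProxySketch {
    // Throttle-like dispatcher from the scenario above: slave1 already runs the job.
    static final BiFunction<String, String, String> THROTTLE =
            (node, task) -> node.equals("slave1") ? "one instance per node" : null;

    // Pick the first node that no dispatcher vetoes, if any.
    static Optional<String> selectNode(List<String> nodes,
                                       List<BiFunction<String, String, String>> dispatchers,
                                       String task) {
        for (String node : nodes) {
            boolean vetoed = dispatchers.stream()
                    .anyMatch(d -> d.apply(node, task) != null);
            if (!vetoed) {
                return Optional.of(node);
            }
        }
        return Optional.empty(); // every node vetoed: leave the task in the queue
    }

    public static void main(String[] args) {
        // Slave 1 is busy with the long job, so the idle Slave 2 is chosen
        // instead of queuing behind Slave 1.
        System.out.println(selectNode(List.of("slave1", "slave2"),
                List.of(THROTTLE), "long-job")); // Optional[slave2]
    }
}
```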


            Assignee: Unassigned
            Reporter: Christian Bremer (ki82)
            Votes: 0
            Watchers: 2

              Created:
              Updated: