- New Feature
- Resolution: Unresolved
- Blocker
- None
Today, when Gearman schedules a build, it decides which node the build should execute on instead of relying on Jenkins to pick the node.
The Jenkins load balancer still appears to run, but the node has already been pre-selected by Gearman.
This prevents the user from using the default Jenkins Load Balancer.
It also means that Jenkins plugins that affect scheduling do not work properly, since Gearman has already decided on the node. I have verified that the following plugins do not work properly when Gearman triggers builds:
Scoring Load Balancer Plugin
Least Load Plugin
Throttle Concurrent Builds Plugin
Example logs when using the Scoring Load Balancer Plugin:
Scoring for ... JobTriggeredByJenkinsOnGenericLabel:
generic-JENKINS06: 62
generic-JENKINS12: 62
generic-JENKINS15: 27
generic-JENKINS13: 5
generic-JENKINS10: 5
generic-JENKINS09: 5
generic-JENKINS14: 5
generic-JENKINS07: -50
generic-JENKINS16: -83
Scoring for ... JobTriggeredByGearmanOnGenericLabel:
generic-JENKINS09: 5
NOTE: These logs show that the Scoring Load Balancer is executed, but instead of evaluating all the nodes in the label group, it can only evaluate the single node that Gearman has already chosen.
I am facing this issue as well with the Throttle Concurrent Builds Plugin. My use case can be summarized as:
The job is allowed to run concurrently, but since it almost starves a slave, I am using the Throttle Concurrent Builds Plugin to allow only one instance of the job per node.
I have no idea how the Zuul Gearman server schedules the jobs; it seems to use round robin over all available workers, with the Gearman Plugin reassigning the worker to a Jenkins executor slot on the fly.
Anyway, I often end up with:
Looking at https://github.com/jenkinsci/throttle-concurrent-builds-plugin, it implements the same extension point / method, QueueTaskDispatcher.canTake(), which states:
So it is blocked properly. But as noted by Christian, there is only one node to choose from anyway.
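For reference, this is roughly what a plugin hooking that extension point looks like. This is only a minimal sketch of the QueueTaskDispatcher.canTake() mechanism, not the Throttle plugin's actual code; the class name and the "one per node" rule here are made up for illustration:
{code:java}
import hudson.Extension;
import hudson.model.Computer;
import hudson.model.Executor;
import hudson.model.Node;
import hudson.model.Queue;
import hudson.model.queue.CauseOfBlockage;
import hudson.model.queue.QueueTaskDispatcher;

/**
 * Hypothetical dispatcher that blocks a node from taking a build when the
 * same job is already running on that node, similar in spirit to what the
 * Throttle Concurrent Builds Plugin does through this extension point.
 */
@Extension
public class OnePerNodeDispatcher extends QueueTaskDispatcher {

    @Override
    public CauseOfBlockage canTake(Node node, Queue.BuildableItem item) {
        Computer computer = node.toComputer();
        if (computer == null) {
            return null; // node has no computer (offline), nothing to check
        }
        for (Executor executor : computer.getExecutors()) {
            Queue.Executable executable = executor.getCurrentExecutable();
            // Veto the node if an executor there is already running the same task.
            if (executable != null
                    && executable.getParent().getOwnerTask() == item.task) {
                return new CauseOfBlockage() {
                    @Override
                    public String getShortDescription() {
                        return "Only one instance of this job is allowed per node";
                    }
                };
            }
        }
        return null; // null means this dispatcher has no objection
    }
}
{code}
The key point is that returning a non-null CauseOfBlockage is only useful if the scheduler actually offers more than one candidate node.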
It seems GearmanProxy.canTake() iterates over all the node worker threads and simply relies on NodeAvailabilityMonitor.canTake(), which appears to have its own locking and does not respect the QueueTaskDispatcher.canTake() checks contributed by other plugins.
Could GearmanProxy.canTake() ask the QueueTaskDispatcher instead?
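As a rough illustration of that suggestion, the proxy could loop over the registered dispatchers before offering a node to a Gearman worker, something along these lines. The helper class and method names are made up, and I have not checked how this would fit into the Gearman Plugin's actual data flow:
{code:java}
import hudson.model.Node;
import hudson.model.Queue;
import hudson.model.queue.CauseOfBlockage;
import hudson.model.queue.QueueTaskDispatcher;

/**
 * Hypothetical helper sketching how GearmanProxy.canTake() could defer to
 * the registered QueueTaskDispatcher implementations instead of relying
 * only on NodeAvailabilityMonitor's own locking.
 */
public final class DispatcherCheck {

    private DispatcherCheck() {
    }

    /** Returns true only if no registered dispatcher vetoes this node for this item. */
    public static boolean dispatchersAllow(Node node, Queue.BuildableItem item) {
        for (QueueTaskDispatcher dispatcher : QueueTaskDispatcher.all()) {
            CauseOfBlockage blockage = dispatcher.canTake(node, item);
            if (blockage != null) {
                // Some plugin (e.g. Throttle Concurrent Builds) objects to this node.
                return false;
            }
        }
        return true;
    }
}
{code}
That way the same veto logic that Jenkins core consults when assigning queue items would also be respected when Gearman picks a worker.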