I created a job that runs a Groovy script. The script puts a large number of builds into the Jenkins queue, where they wait until a machine with a specific label is available to run them. Only one of these builds may run on each machine at a time.
By default, machines carry the label "uninstalled". When this job runs on a machine, it changes that label from "uninstalled" to "installing" and, when it finishes, to "installed". The job is only allowed to run on nodes that have the "uninstalled" label.
Right now, if a single machine has the "uninstalled" label, all of the jobs in the queue assign themselves to that node. Once one of them starts on that machine, the label changes to "installing", which SHOULD take the node out of the running for the other jobs. When that job completes, however, the remaining jobs in the queue still run on that node, even though it no longer has the "uninstalled" label.
Jenkins' queue should be changed so that, before a job moves from the "Buildable" queue onto an executor, it checks that the node it plans to build on still has the labels the job requires. Occasionally Jenkins does catch the change before the build starts, but more often than not all of the jobs in the queue run on this one node instead of waiting for other nodes to come in.
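To make the behaviour I am asking for concrete, here is a minimal toy model in plain Java. It is not Jenkins code and does not use the Jenkins API: the names LabelRecheckSketch, Node, QueuedJob and dispatch are all hypothetical, and the single-label, single-executor setup is a simplifying assumption. It only illustrates the difference between trusting the label match made when the job was queued and re-checking the node's current label right before the job starts.

```java
import java.util.ArrayList;
import java.util.List;

class LabelRecheckSketch {
    /** Hypothetical stand-in for a build node with one mutable label. */
    static final class Node {
        final String name;
        String label;
        Node(String name, String label) { this.name = name; this.label = label; }
    }

    /** Hypothetical queued item that remembers the node it was matched to. */
    static final class QueuedJob {
        final int id;
        final String requiredLabel;
        Node assigned; // stays null if no node matched at queue time
        QueuedJob(int id, String requiredLabel) { this.id = id; this.requiredLabel = requiredLabel; }
    }

    /** Returns the ids of the jobs that actually start on the single node. */
    static List<Integer> dispatch(int jobCount, boolean recheck) {
        Node node = new Node("node-1", "uninstalled");
        List<QueuedJob> buildable = new ArrayList<>();

        // Matching phase: the node is still "uninstalled", so every job picks it.
        for (int i = 1; i <= jobCount; i++) {
            QueuedJob job = new QueuedJob(i, "uninstalled");
            if (node.label.equals(job.requiredLabel)) {
                job.assigned = node;
            }
            buildable.add(job);
        }

        // Dispatch phase: one executor, so the jobs start one after another.
        List<Integer> started = new ArrayList<>();
        for (QueuedJob job : buildable) {
            if (job.assigned == null) {
                continue; // never matched a node
            }
            // The proposed last-minute check against the node's current label.
            if (recheck && !job.assigned.label.equals(job.requiredLabel)) {
                continue; // label changed since matching, so the job stays queued
            }
            started.add(job.id);
            job.assigned.label = "installing"; // the job relabels the node when it starts
            job.assigned.label = "installed";  // and again when it finishes
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(3, false)); // prints [1, 2, 3]
        System.out.println(dispatch(3, true));  // prints [1]
    }
}
```

Without the re-check, all three jobs run on the one node, which is what I am seeing. With it, only the first job runs and the rest keep waiting for another "uninstalled" node.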
I have been looking at the Jenkins code for the queue, and the maintain() method seems to be what needs changing. It appears that the JobOffer is made before the last-minute check for blockers on line 62 of the following code. Can anyone confirm whether I am reading this correctly?
If anyone knows a way around this in the meantime, through Groovy or otherwise, your advice would be greatly appreciated.