Since the upgrade to LTS 2.319.2, our Docker Cloud jobs stay in the queue for several minutes before they start, claiming that no running node matches their label expression. During this time a matching node is nevertheless started every ten seconds, provided the template allows multiple nodes to be started from it.
Presumably this is related to the following change mentioned in the changelog for 2.319.2:
Only apply trimLabels operation to affected nodes when adding or removing them. (issue 67099)
I have narrowed it down to a difference between (qi is a Queue item as returned by Jenkins.instance.getQueue().getItems())
The latter is correct; the former is not (and it is the one used by Queue), apparently because
returns no nodes, even though, when its result is not already cached in this.nodes, it is computed from Jenkins.get().getNodes(), which does return nodes.
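To illustrate the kind of stale-cache behaviour I suspect here, below is a minimal toy model in plain Java. It is not Jenkins code: LabelCacheSketch, ToyLabel, the prefix-based matching and the node name "docker-0001" are all hypothetical, and the list passed to the constructor merely stands in for whatever Jenkins.get().getNodes() returns. It only shows how a lookup that caches its result in a field can keep reporting no nodes after a matching node has been added, until the cache is reset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these classes are hypothetical and are NOT Jenkins classes.
public class LabelCacheSketch {

    /** Stand-in for a label that caches its matching nodes in a field. */
    static class ToyLabel {
        private final String prefix;
        private final List<String> allNodes; // stand-in for the live node list
        private Set<String> nodes;           // null means "not cached yet"

        ToyLabel(String prefix, List<String> allNodes) {
            this.prefix = prefix;
            this.allNodes = allNodes;
        }

        Set<String> getNodes() {
            if (nodes != null) {
                return nodes; // cached result, possibly stale
            }
            Set<String> matching = new HashSet<>();
            for (String name : allNodes) {
                if (name.startsWith(prefix)) {
                    matching.add(name);
                }
            }
            nodes = Collections.unmodifiableSet(matching);
            return nodes;
        }

        /** Drops the cached set so the next lookup recomputes it. */
        void reset() {
            nodes = null;
        }
    }

    public static void main(String[] args) {
        List<String> allNodes = new ArrayList<>();
        ToyLabel label = new ToyLabel("docker", allNodes);

        System.out.println(label.getNodes().size()); // cache filled while no node exists
        allNodes.add("docker-0001");                 // node added, cache not reset
        System.out.println(label.getNodes().size()); // still served from the stale cache
        label.reset();                               // e.g. what a periodic refresh might do
        System.out.println(label.getNodes().size()); // recomputed from the live list
    }
}
```

In this toy model the second lookup still sees an empty set, and only the lookup after reset() sees the new node. That would match the symptom above if the label's cache is not reset when the cloud node is added, but that part is my assumption.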
I haven't quite figured out what exactly changes after a few minutes to allow the job to run after all, but presumably it has something to do with the periodic call to trimLabels every 5 minutes that is mentioned in a few comments in Jenkins.java.