Status: Resolved
I've been using matrix projects to perform multi-platform builds for some time. After upgrading to 1.570, the parent jobs for the matrix projects are consuming real executors on my build boxes (in other words, they are not running as flyweight jobs). This causes deadlocks: the child jobs wait for free executors while the parent waits for the children to complete. The behavior is very similar to JENKINS-10944. My build slaves are pretty much always online, so I don't think it's the same problem as described in JENKINS-22502. What additional information can I provide that might help track this down?
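To make the deadlock condition above concrete, here is a minimal sketch of the executor accounting. It is a simplified model, not Jenkins code: the `Node` class and the `children_can_run` function are hypothetical names invented for illustration, and the model assumes parents and children compete for one pool of regular executors.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Number of regular executor slots on the node (assumed model).
    executors: int


def children_can_run(node: Node, parents: int, parents_are_flyweight: bool) -> bool:
    """Return True if at least one regular executor is left for child builds."""
    # A flyweight parent does not occupy a regular executor slot.
    # A non-flyweight parent occupies one slot each, up to the node's capacity.
    used_by_parents = 0 if parents_are_flyweight else min(parents, node.executors)
    free = node.executors - used_by_parents
    # With no free slot, children wait for the parents while the parents
    # wait for the children, which is the deadlock described in the report.
    return free > 0


if __name__ == "__main__":
    node = Node(executors=2)
    print(children_can_run(node, parents=2, parents_are_flyweight=True))
    print(children_can_run(node, parents=2, parents_are_flyweight=False))
```

In this model, two flyweight parents on a two-executor node leave both slots free for children, while two non-flyweight parents fill both slots and nothing can make progress.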
JENKINS-23902 Matrix parent takes executor slot and blocks children
The issue could be related to https://github.com/jenkinsci/throttle-concurrent-builds-plugin/pull/22 (no JIRA issue)
Do you use throttle-concurrent-builds-plugin?
One possible pattern I've just noticed: the parent job seems to consume a real executor only if a flyweight job is already running on the node. I have all of my matrix parents running on a single node to ensure that the initial checkouts are as quick as possible, and in the past many flyweight jobs would run on this node happily. Since the upgrade, I've only ever observed one flyweight job running at a time; additional jobs that should be flyweight consume a regular executor on this node instead.
EDIT - Apparently only the second job that should be flyweight eats the executor. A third flyweight job correctly runs as a flyweight...