JENKINS-26380

Deadlock between Queue and Jenkins model


      Jenkins.removeNode() is synchronized, so a thread removing a node holds the lock on the Jenkins object. Further down that call chain, Computer.setNumExecutors() is called, which attempts to lock the Queue. If, at the same time, CloudRetentionStrategy.check() runs and determines that a node should be terminated, it locks the Queue first and then calls Jenkins.removeNode(), which attempts to lock the Jenkins object. Each thread then holds the lock the other is waiting for, and neither can proceed.
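The lock-order inversion described above can be reproduced outside Jenkins with a minimal sketch. It is not Jenkins code: the class name, `jenkinsLock`, `queueLock`, and the helper methods are hypothetical, and two `ReentrantLock` objects stand in for the Jenkins monitor and the Queue lock. A timed `tryLock` replaces the blocking acquisition so the program reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {

    /** Two threads take the locks in opposite orders; returns how many obtained both. */
    static int invertedOrder() throws InterruptedException {
        ReentrantLock jenkinsLock = new ReentrantLock(); // stand-in for the Jenkins monitor
        ReentrantLock queueLock = new ReentrantLock();   // stand-in for the Queue lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger succeeded = new AtomicInteger();

        // Models the removeNode() path: Jenkins first, then Queue.
        Thread remover = new Thread(() ->
                takeInOrder(jenkinsLock, queueLock, bothHoldFirst, bothTried, succeeded));
        // Models the retention check path: Queue first, then Jenkins.
        Thread retention = new Thread(() ->
                takeInOrder(queueLock, jenkinsLock, bothHoldFirst, bothTried, succeeded));
        remover.start();
        retention.start();
        remover.join();
        retention.join();
        return succeeded.get();
    }

    static void takeInOrder(ReentrantLock first, ReentrantLock second,
                            CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                            AtomicInteger succeeded) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await(); // from here on, each thread holds the lock the other wants
            // A plain lock() would block forever at this point; the timeout makes it observable.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                try {
                    succeeded.incrementAndGet();
                } finally {
                    second.unlock();
                }
            }
            bothTried.countDown();
            bothTried.await(); // keep the first lock held until the other attempt is over
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    /** Both threads take Jenkins first, then Queue; returns how many obtained both. */
    static int consistentOrder() throws InterruptedException {
        ReentrantLock jenkinsLock = new ReentrantLock();
        ReentrantLock queueLock = new ReentrantLock();
        AtomicInteger succeeded = new AtomicInteger();
        Runnable task = () -> {
            jenkinsLock.lock();
            try {
                queueLock.lock();
                try {
                    succeeded.incrementAndGet();
                } finally {
                    queueLock.unlock();
                }
            } finally {
                jenkinsLock.unlock();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inverted order:   " + invertedOrder() + " of 2 threads got both locks");
        System.out.println("consistent order: " + consistentOrder() + " of 2 threads got both locks");
    }
}
```

With opposite acquisition orders neither thread obtains its second lock, while a single agreed order lets both complete. The consistent-order variant only illustrates the general remedy for this class of deadlock and is not meant to describe how the issue was or should be resolved in Jenkins itself.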

      A thread dump (deadlock.tdump) is attached which shows the deadlock.

      We use DockerComputers with the OnceRetentionStrategy, so a node is removed every time a task completes. That makes this deadlock very likely: we hit it multiple times per day.

            stephenconnolly Stephen Connolly
            bernie Bernie Schelberg