JENKINS-46248

Deadlock in queue maintenance + node removal

      Description

      Got a deadlock while developing a cloud plugin. I was attempting to have the plugin delete nodes as soon as they finish a task, in taskCompleted. Specifically, in this case, the following happened:

      1. Cloud plugin started provisioning, added new nodes to Jenkins, and returned a callable PlannedNode.
      2. Node connected immediately after being added.
      3. Job was scheduled and started running.
      4. Job was extremely fast (just a pipeline node with a hello world echo).
      5. taskCompleted was called.
      6. Node was removed in taskCompleted.

      My hunch here is that this has little to do with the cloud plugin, and more to do with simply having a job that executes extremely quickly with the addition of node removal in taskCompleted.
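      If that hunch is right, one possible plugin-side mitigation is to avoid taking the queue lock synchronously inside taskCompleted and to hand the removal off to another thread instead. The following is only a minimal sketch of that idea using plain java.util.concurrent types. It does not use the Jenkins API; `queueLock`, `executorLock`, `cleanupPool`, and `runScenario` are hypothetical names standing in for the queue lock, the per-executor lock, and the plugin's callback, and the choice of lock types is an assumption based on the trace below.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DeferredRemovalSketch {
    // Hypothetical stand-ins for the queue lock and the per-executor lock.
    static final ReentrantLock queueLock = new ReentrantLock();
    static final ReentrantReadWriteLock executorLock = new ReentrantReadWriteLock();

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if queue maintenance and the deferred removal both finish. */
    static boolean runScenario() {
        ExecutorService cleanupPool = Executors.newSingleThreadExecutor();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        CountDownLatch maintenanceDone = new CountDownLatch(1);
        CountDownLatch nodeRemoved = new CountDownLatch(1);

        // Models queue maintenance: holds the queue lock, then reads executor state.
        Thread maintenance = new Thread(() -> {
            queueLock.lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                executorLock.readLock().lock(); // waits until the callback returns
                executorLock.readLock().unlock();
            } finally {
                queueLock.unlock();
            }
            maintenanceDone.countDown();
        });

        // Models the executor thread finishing a task while holding its own lock.
        Thread executor = new Thread(() -> {
            executorLock.writeLock().lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                // "taskCompleted": schedule the removal rather than locking the queue here.
                cleanupPool.submit(() -> {
                    queueLock.lock();
                    try {
                        nodeRemoved.countDown(); // the node removal would happen here
                    } finally {
                        queueLock.unlock();
                    }
                });
            } finally {
                executorLock.writeLock().unlock();
            }
        });

        maintenance.setDaemon(true);
        executor.setDaemon(true);
        maintenance.start();
        executor.start();
        try {
            return maintenanceDone.await(5, TimeUnit.SECONDS)
                    && nodeRemoved.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            cleanupPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runScenario());
    }
}
```

      The trade-off in this sketch is that the removal no longer happens before the callback returns, so a real plugin would probably also need some way to keep the node from accepting new work in the meantime.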

      WARNING: Some health checks are reporting as unhealthy: [thread-deadlock : [AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#64] locked on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@545ee67e (owned by Executor #0 for Windows.10.Jenkins.Amd64-0816090454259-0 : executing PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{runId=helix-agents-test#23,label=Windows.10.Jenkins.Amd64,context=CpsStepContext[11:node]:Owner[helix-agents-test/23:helix-agents-test #23],cookie=null}):
      at sun.misc.Unsafe.park(Native Method)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
      at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
      at hudson.model.Executor.isParking(Executor.java:640)
      at hudson.model.Queue.maintain(Queue.java:1442)
      at hudson.model.Queue$1.call(Queue.java:321)
      at hudson.model.Queue$1.call(Queue.java:318)
      at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:108)
      at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:98)
      at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
      at java.lang.Thread.run(Thread.java:748)
      , Executor #0 for Windows.10.Jenkins.Amd64-0816090454259-0 : executing PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{runId=helix-agents-test#23,label=Windows.10.Jenkins.Amd64,context=CpsStepContext[11:node]:Owner[helix-agents-test/23:helix-agents-test #23],cookie=null} locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@5a990285 (owned by AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#64]):
      at sun.misc.Unsafe.park(Native Method)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      at hudson.model.Queue._withLock(Queue.java:1340)
      at hudson.model.Queue.withLock(Queue.java:1219)
      at jenkins.model.Nodes.removeNode(Nodes.java:237)
      at jenkins.model.Jenkins.removeNode(Jenkins.java:2123)
      at com.microsoft.helix.helixagents.HelixComputer.taskCompleted(HelixComputer.java:69)
      at hudson.model.queue.WorkUnitContext.synchronizeEnd(WorkUnitContext.java:140)
      at hudson.model.Executor.finish1(Executor.java:451)
      at hudson.model.Executor.completedAsynchronous(Executor.java:473)
      at jenkins.model.queue.AsynchronousExecution.setExecutor(AsynchronousExecution.java:115)
      at hudson.model.Executor.run(Executor.java:409)
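
      Read together, the two stacks describe a lock-order inversion: the queue maintenance thread owns the queue's ReentrantLock and waits for the executor's read lock in isParking, while the executor thread appears to own the write side of that ReentrantReadWriteLock and waits for the queue lock in removeNode. The cycle can be modelled outside Jenkins with plain JDK locks. This is a sketch under that reading of the trace, not Jenkins code; the class, field, and method names are hypothetical, and timed tryLock calls are used so the demo reports the cycle instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderInversionDemo {
    // Hypothetical stand-ins for the two locks named in the thread dump.
    static final ReentrantLock queueLock = new ReentrantLock();
    static final ReentrantReadWriteLock executorLock = new ReentrantReadWriteLock();

    /** Tries to take the lock for a short time; releases it again if acquired. */
    static boolean tryBriefly(Lock lock) {
        try {
            if (lock.tryLock(200, TimeUnit.MILLISECONDS)) {
                lock.unlock();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if neither thread could obtain its second lock, i.e. the cycle formed. */
    static boolean reproduceCycle() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // Queue maintenance: queue lock first, then the executor's read lock.
        Thread maintenance = new Thread(() -> {
            queueLock.lock();
            try {
                bothHoldFirst.countDown();
                awaitQuietly(bothHoldFirst);
                gotSecond[0] = tryBriefly(executorLock.readLock());
                bothAttempted.countDown();
                awaitQuietly(bothAttempted); // keep holding until the other thread has tried
            } finally {
                queueLock.unlock();
            }
        });

        // Executor finishing a task: its own write lock first, then the queue lock.
        Thread executor = new Thread(() -> {
            executorLock.writeLock().lock();
            try {
                bothHoldFirst.countDown();
                awaitQuietly(bothHoldFirst);
                gotSecond[1] = tryBriefly(queueLock);
                bothAttempted.countDown();
                awaitQuietly(bothAttempted);
            } finally {
                executorLock.writeLock().unlock();
            }
        });

        maintenance.start();
        executor.start();
        try {
            maintenance.join();
            executor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !gotSecond[0] && !gotSecond[1];
    }

    public static void main(String[] args) {
        System.out.println("cycle formed: " + reproduceCycle());
    }
}
```

      With untimed lock() calls in place of tryBriefly, both threads would park indefinitely, which matches the two parked frames in the report.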
      

       

            Activity

            mmitche Matthew Mitchell created issue
            (each change below is listed as Field: Original Value -> New Value)
            jglick Jesse Glick made changes:
              Link: (none) -> This issue relates to JENKINS-50020 [ JENKINS-50020 ]
            jglick Jesse Glick made changes:
              Labels: (none) -> deadlock pipeline threads
            jglick Jesse Glick made changes:
              Labels: deadlock pipeline threads -> deadlock pipeline queue threads
            jglick Jesse Glick made changes:
              Remote Link: (none) -> This issue links to "PR 3354 (Web Link)" [ 20794 ]
            jglick Jesse Glick made changes:
              Status: Open [ 1 ] -> In Progress [ 3 ]
            jglick Jesse Glick made changes:
              Status: In Progress [ 3 ] -> In Review [ 10005 ]
            jglick Jesse Glick made changes:
              Assignee: (none) -> Pavel Avgustinov [ pavgust ]
            oleg_nenashev Oleg Nenashev made changes:
              Resolution: (none) -> Fixed [ 1 ]
              Status: In Review [ 10005 ] -> Resolved [ 5 ]
            oleg_nenashev Oleg Nenashev made changes:
              Labels: deadlock pipeline queue threads -> deadlock lts-candidate pipeline queue threads
            olivergondza Oliver Gondža made changes:
              Labels: deadlock lts-candidate pipeline queue threads -> 2.121.2-rejected deadlock pipeline queue threads
            olivergondza Oliver Gondža made changes:
              Labels: 2.121.2-rejected deadlock pipeline queue threads -> 2.121.2-rejected deadlock lts-candidate pipeline queue threads
            olivergondza Oliver Gondža made changes:
              Labels: 2.121.2-rejected deadlock lts-candidate pipeline queue threads -> 2.121.2-rejected 2.121.3-fixed deadlock pipeline queue threads

              People

              Assignee:
              pavgust Pavel Avgustinov
              Reporter:
              mmitche Matthew Mitchell
              Votes:
              6
              Watchers:
              9
