Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-28690

Deadlock in hudson.model.Executor

    XMLWordPrintable

Details

    Description

      In very specific scenario, when build is running on slave, and PingThread detects slave as unavailable deadlock occurs in Executor thread of that slave.

      stacktrace:

      "Executor #0 for xxxx : executing xxxx #9" daemon prio=10 tid=0x00007f444248b800 nid=0x66e0 waiting on condition [0x00007f448a92f000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
      	at hudson.model.Executor.interrupt(Executor.java:183)
      	at hudson.model.Executor.interrupt(Executor.java:164)
      	at hudson.model.Executor.interrupt(Executor.java:158)
      	at hudson.model.Executor.interrupt(Executor.java:145)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.selfInterrupt(AbstractQueuedSynchronizer.java:825)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:959)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
      	at hudson.model.Executor.abortResult(Executor.java:208)
      	at hudson.model.Build$BuildExecution.doRun(Build.java:165)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
      	at hudson.model.Run.execute(Run.java:1744)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      	at hudson.model.ResourceController.execute(ResourceController.java:98)
      	at hudson.model.Executor.run(Executor.java:374)
      

      This alone is not very bad, but than maintain task of queue kicks in, blocks on Executor's lock and leads to deadlock on Queue lock.
      Stacktrace:

      "AtmostOneTaskExecutor[hudson.model.Queue$1@6a9812a3] [#6684]" daemon prio=10 tid=0x00007f44bf7af000 nid=0x74ec waiting on condition [0x00007f44c827b000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
      	at hudson.model.Executor.isParking(Executor.java:609)
      	at hudson.model.Queue.maintain(Queue.java:1282)
      	at hudson.model.Queue$1.call(Queue.java:334)
      	at hudson.model.Queue$1.call(Queue.java:331)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:101)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:91)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
      	at java.lang.Thread.run(Thread.java:745)
      

      This blocks all actions on Jenkins, as no new builds can be scheduled and you cannot access Jenkins main page.

      After downgrade to versions 1.606 before https://issues.jenkins-ci.org/browse/JENKINS-27565 all is working good.

      Attachments

        Issue Links

          Activity

            People

              stephenconnolly Stephen Connolly
              szubster Tomasz Szuba
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: