In very specific scenario, when build is running on slave, and PingThread detects slave as unavailable deadlock occurs in Executor thread of that slave.
stacktrace:
"Executor #0 for xxxx : executing xxxx #9" daemon prio=10 tid=0x00007f444248b800 nid=0x66e0 waiting on condition [0x00007f448a92f000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945) at hudson.model.Executor.interrupt(Executor.java:183) at hudson.model.Executor.interrupt(Executor.java:164) at hudson.model.Executor.interrupt(Executor.java:158) at hudson.model.Executor.interrupt(Executor.java:145) at java.util.concurrent.locks.AbstractQueuedSynchronizer.selfInterrupt(AbstractQueuedSynchronizer.java:825) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:959) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731) at hudson.model.Executor.abortResult(Executor.java:208) at hudson.model.Build$BuildExecution.doRun(Build.java:165) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537) at hudson.model.Run.execute(Run.java:1744) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:374)
This alone is not very bad, but than maintain task of queue kicks in, blocks on Executor's lock and leads to deadlock on Queue lock.
Stacktrace:
"AtmostOneTaskExecutor[hudson.model.Queue$1@6a9812a3] [#6684]" daemon prio=10 tid=0x00007f44bf7af000 nid=0x74ec waiting on condition [0x00007f44c827b000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731) at hudson.model.Executor.isParking(Executor.java:609) at hudson.model.Queue.maintain(Queue.java:1282) at hudson.model.Queue$1.call(Queue.java:334) at hudson.model.Queue$1.call(Queue.java:331) at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:101) at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:91) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110) at java.lang.Thread.run(Thread.java:745)
This blocks all actions on Jenkins, as no new builds can be scheduled and you cannot access Jenkins main page.
After downgrade to versions 1.606 before https://issues.jenkins-ci.org/browse/JENKINS-27565 all is working good.
- is related to
-
JENKINS-27565 Nodes can be removed as idle before the assigned tasks have started
- Closed
-
JENKINS-37034 Deadlock in hudson.model.Executor
- Resolved