I noticed the following deadlock, which prevents NioChannelHub from serving any channels and therefore breaks all the slaves.
NioChannelHub thread is blocked:

"NioChannelHub keys=2 gen=185197: Computer.threadPoolForRemoting [#3]" daemon prio=10 tid=0x00007f872c021800 nid=0x1585 waiting for monitor entry [0x00007f86ce2ba000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at hudson.remoting.Channel.terminate(Channel.java:792)
    - waiting to lock <0x00007f874ef76658> (a hudson.remoting.Channel)
    at hudson.remoting.Channel$2.terminate(Channel.java:483)
    at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
    at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:203)
    at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:597)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
    ...
because of this guy:

"Computer.threadPoolForRemoting [#216] for mac" daemon prio=10 tid=0x00007f86dc0d6800 nid=0x3f34 in Object.wait() [0x00007f87442f1000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007f874ef76810> (a org.jenkinsci.remoting.nio.FifoBuffer)
    at java.lang.Object.wait(Object.java:485)
    at org.jenkinsci.remoting.nio.FifoBuffer.write(FifoBuffer.java:336)
    - locked <0x00007f874ef76810> (a org.jenkinsci.remoting.nio.FifoBuffer)
    at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.writeBlock(NioChannelHub.java:215)
    at hudson.remoting.AbstractByteArrayCommandTransport.write(AbstractByteArrayCommandTransport.java:83)
    at hudson.remoting.Channel.send(Channel.java:545)
    - locked <0x00007f874ef76658> (a hudson.remoting.Channel)
    at hudson.remoting.Request$2.run(Request.java:342)
    - locked <0x00007f874ef76658> (a hudson.remoting.Channel)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
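To make the lock ordering in the two traces easier to follow, here is a minimal sketch of the same pattern, not the actual remoting code. The class `DeadlockSketch`, the method `observeStates`, and the two plain `Object` monitors standing in for `hudson.remoting.Channel` and `FifoBuffer` are all hypothetical names used for illustration. It assumes, as the traces suggest, that the hub thread is the only one that could free buffer space and wake the writer. The sketch interrupts the writer at the end so that it terminates; the real threads have no such escape.

```java
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {

    /** Returns {writer state, hub state} observed while both threads are stuck. */
    public static Thread.State[] observeStates() throws InterruptedException {
        final Object channel = new Object(); // stand-in for the Channel monitor
        final Object buffer = new Object();  // stand-in for the FifoBuffer monitor
        final CountDownLatch writerHoldsChannel = new CountDownLatch(1);

        // Mirrors Channel.send() -> FifoBuffer.write(): holds the channel
        // monitor while waiting for buffer space that never arrives.
        Thread writer = new Thread(() -> {
            synchronized (channel) {
                synchronized (buffer) {
                    writerHoldsChannel.countDown();
                    try {
                        while (true) {
                            buffer.wait(); // releases buffer, still holds channel
                        }
                    } catch (InterruptedException e) {
                        // escape hatch for this sketch only
                    }
                }
            }
        }, "writer");

        // Mirrors NioChannelHub.run() -> Channel.terminate(): needs the channel
        // monitor before it can do anything that would wake the writer.
        Thread hub = new Thread(() -> {
            synchronized (channel) {
                synchronized (buffer) {
                    buffer.notifyAll();
                }
            }
        }, "hub");

        writer.start();
        writerHoldsChannel.await(); // writer now owns the channel monitor
        hub.start();

        // Poll until both threads have settled into the stuck states.
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (System.nanoTime() < deadline
                && !(writer.getState() == Thread.State.WAITING
                        && hub.getState() == Thread.State.BLOCKED)) {
            Thread.sleep(10);
        }
        Thread.State[] observed = { writer.getState(), hub.getState() };

        writer.interrupt(); // break the cycle so the sketch can exit
        writer.join();
        hub.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.State[] s = observeStates();
        System.out.println("writer=" + s[0] + " hub=" + s[1]);
    }
}
```

In this sketch the writer should end up in WAITING and the hub in BLOCKED, matching the thread states in the two traces. Because the writer is parked in `Object.wait()` rather than blocked on a monitor, this is not a classic monitor cycle, which may be why the JVM does not flag it as a deadlock in the dump.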
Full thread dump is here
is duplicated by:
- JENKINS-32825 Deadlock in Channel Abort (Resolved)
- JENKINS-23043 Build hangs while copying artifacts to slave (Closed)

is related to:
- JENKINS-32825 Deadlock in Channel Abort (Resolved)
- JENKINS-20947 Failed to monitor for Free Swap Space (Open)
- JENKINS-28826 Connection aborted on restarting the client (Open)
- JENKINS-24155 Jenkins Slaves Go Offline In Large Quantities and Don't Reconnect Until Reboot (Open)
- JENKINS-23043 Build hangs while copying artifacts to slave (Closed)