
Channel hangs due to the infinite loop in FifoBuffer within the lock

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component/s: core, remoting
    • Labels: None

      I noticed the following "deadlock" that prevents NioChannelHub from serving any channels, which breaks all the slaves.

      NioChannelHub thread is blocked:
      
          "NioChannelHub keys=2 gen=185197: Computer.threadPoolForRemoting [#3]" daemon prio=10 tid=0x00007f872c021800 nid=0x1585 waiting for monitor entry [0x00007f86ce2ba000]
             java.lang.Thread.State: BLOCKED (on object monitor)
      	    at hudson.remoting.Channel.terminate(Channel.java:792)
      	    - waiting to lock <0x00007f874ef76658> (a hudson.remoting.Channel)
      	    at hudson.remoting.Channel$2.terminate(Channel.java:483)
      	    at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
      	    at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:203)
      	    at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:597)
      	    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      	    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	    at java.lang.Thread.run(Thread.java:662)
      
      ... because of this guy:
      
          "Computer.threadPoolForRemoting [#216] for mac" daemon prio=10 tid=0x00007f86dc0d6800 nid=0x3f34 in Object.wait() [0x00007f87442f1000]
             java.lang.Thread.State: WAITING (on object monitor)
      	    at java.lang.Object.wait(Native Method)
      	    - waiting on <0x00007f874ef76810> (a org.jenkinsci.remoting.nio.FifoBuffer)
      	    at java.lang.Object.wait(Object.java:485)
      	    at org.jenkinsci.remoting.nio.FifoBuffer.write(FifoBuffer.java:336)
      	    - locked <0x00007f874ef76810> (a org.jenkinsci.remoting.nio.FifoBuffer)
      	    at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.writeBlock(NioChannelHub.java:215)
      	    at hudson.remoting.AbstractByteArrayCommandTransport.write(AbstractByteArrayCommandTransport.java:83)
      	    at hudson.remoting.Channel.send(Channel.java:545)
      	    - locked <0x00007f874ef76658> (a hudson.remoting.Channel)
      	    at hudson.remoting.Request$2.run(Request.java:342)
      	    - locked <0x00007f874ef76658> (a hudson.remoting.Channel)
      	    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	    at java.lang.Thread.run(Thread.java:662)
      

      Full thread dump is here
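
      For reference, a dump like the ones above, including the "- locked <0x...>" monitor information that matters for this issue, can be taken with jstack <pid> against the Jenkins process, or programmatically through the standard java.lang.management API. A minimal sketch of the programmatic variant (the class name here is invented):

          import java.lang.management.ManagementFactory;
          import java.lang.management.MonitorInfo;
          import java.lang.management.ThreadInfo;
          import java.lang.management.ThreadMXBean;

          // Minimal sketch: print every thread's state, stack trace, and the
          // monitors it holds, i.e. the information used in the traces above.
          public class ThreadDumpSketch {
              public static void main(String[] args) {
                  ThreadMXBean mx = ManagementFactory.getThreadMXBean();
                  // lockedMonitors = true so owned monitors are reported per stack frame
                  for (ThreadInfo ti : mx.dumpAllThreads(true, true)) {
                      System.out.printf("\"%s\" id=%d state=%s%n",
                              ti.getThreadName(), ti.getThreadId(), ti.getThreadState());
                      StackTraceElement[] stack = ti.getStackTrace();
                      for (int depth = 0; depth < stack.length; depth++) {
                          System.out.println("    at " + stack[depth]);
                          for (MonitorInfo mi : ti.getLockedMonitors()) {
                              if (mi.getLockedStackDepth() == depth)
                                  System.out.println("    - locked " + mi);
                          }
                      }
                      System.out.println();
                  }
              }
          }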

          [JENKINS-25218] Channel hangs due to the infinite loop in FifoBuffer within the lock

          Stephen Connolly added a comment -

          Also seeing similar live-locks with the following thread holding the lock:

          "Computer.threadPoolForRemoting [#173620] : IO ID=7851 : seq#=7850" #2080996 daemon prio=5 os_prio=0 tid=0x0000000088306800 nid=0x15e0 in Object.wait() [0x00000000f209f000]
             java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:502)
              at org.jenkinsci.remoting.nio.FifoBuffer.write(FifoBuffer.java:336)
              - locked <0x000000044d3408b8> (a org.jenkinsci.remoting.nio.FifoBuffer)
              at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.writeBlock(NioChannelHub.java:220)
              at hudson.remoting.AbstractByteArrayCommandTransport.write(AbstractByteArrayCommandTransport.java:83)
              at hudson.remoting.Channel.send(Channel.java:576)
              - locked <0x000000044d3407a0> (a hudson.remoting.Channel)
              at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:260)
              at hudson.remoting.PipeWriter$1.run(PipeWriter.java:158)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
              at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
              at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
          

          The NioChannelHub thread has the same stack trace (modulo line numbers)


          Oleg Nenashev added a comment -

          I have found the cause of this issue in the FifoBuffer implementation. It happens when:

          1) FifoBuffer is full (sz == limit)
          2) We want to write a new message to the buffer
          3) So we get into this code: https://github.com/jenkinsci/remoting/blob/master/src/main/java/org/jenkinsci/remoting/nio/FifoBuffer.java#L342-L348

              synchronized (lock) {
                  while ((chunk = Math.min(len,writable()))==0) {
                      if (closed)
                          throw new IOException("closed during write operation");

                      lock.wait(100);
                  }
                  // ...
              }

          4) We end up in an infinite loop that waits for some space to become free in the buffer, but no space will ever be freed while the buffer stays full:

          • the receive() operation cannot run, since that code is synchronized on the same lock
          • we cannot close the channel either, because modification of the "closed" field is guarded by the same lock. Sad panda (see the sketch after this list)
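
          To make the lock interaction concrete, here is a minimal, self-contained sketch of the pattern described above (an illustration only; the class, fields, and method bodies are invented and greatly simplified, not the actual Channel/FifoBuffer code):

              public class ChannelWriteDeadlockSketch {
                  private final Object channelLock = new Object();   // stands in for the Channel monitor
                  private final Object bufferLock  = new Object();   // stands in for the FifoBuffer lock
                  private boolean closed = false;
                  private int free = 0;                              // 0 == buffer is already full

                  // Writer path (the "for mac" thread above): Channel.send() -> FifoBuffer.write()
                  void send(byte[] payload) throws java.io.IOException, InterruptedException {
                      synchronized (channelLock) {                   // held across the whole write
                          synchronized (bufferLock) {
                              while (free < payload.length) {        // never becomes true: nothing drains the buffer
                                  if (closed)
                                      throw new java.io.IOException("closed during write operation");
                                  bufferLock.wait(100);              // polls forever while still owning channelLock
                              }
                              free -= payload.length;
                          }
                      }
                  }

                  // Abort path (the NioChannelHub thread above): terminating needs the channel monitor first
                  void terminate() {
                      synchronized (channelLock) {                   // BLOCKED: the writer already owns this
                          synchronized (bufferLock) {
                              closed = true;                         // would end the writer's loop, but is never reached
                              bufferLock.notifyAll();
                          }
                      }
                  }
              }

          If send() is running on one thread when terminate() is called from another, the two stack shapes in the description fall out directly: the writer keeps polling inside the buffer wait loop while the hub thread stays BLOCKED on the channel monitor and can never set closed.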

          Fixing the issue right now
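
          For illustration only, a minimal sketch of the general hardening direction for this kind of bounded buffer (all names below are invented, and this is not necessarily the change that was actually made in remoting): keep the abort path on the buffer's own lock so it can never queue behind a writer's caller, and bound how long a writer will wait for free space before failing the write.

              import java.io.IOException;
              import java.io.InterruptedIOException;

              public class BoundedBufferSketch {
                  private final Object lock = new Object();
                  private final byte[] ring;
                  private int used;
                  private boolean closed;

                  public BoundedBufferSketch(int capacity) {
                      this.ring = new byte[capacity];
                  }

                  // Abort path: needs only the buffer's own lock, so it cannot be
                  // blocked behind a writer's caller (e.g. a channel monitor).
                  public void close() {
                      synchronized (lock) {
                          closed = true;
                          lock.notifyAll();   // wake any writer stuck waiting for space
                      }
                  }

                  // Writer path: waits for free space, but gives up after a bounded
                  // time instead of polling forever.
                  public void write(byte[] data, long timeoutMillis)
                          throws IOException, InterruptedException {
                      long deadline = System.currentTimeMillis() + timeoutMillis;
                      synchronized (lock) {
                          while (ring.length - used < data.length) {
                              if (closed)
                                  throw new IOException("closed during write operation");
                              long remaining = deadline - System.currentTimeMillis();
                              if (remaining <= 0)
                                  throw new InterruptedIOException(
                                          "buffer stayed full for " + timeoutMillis + " ms; giving up the write");
                              lock.wait(Math.min(remaining, 100L));
                          }
                          used += data.length;   // a real buffer would also copy the bytes here
                      }
                  }
              }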


          Oleg Nenashev added a comment - edited

          A partial fix has been applied via JENKINS-32825, which is released in remoting 2.54. Jenkins cores should get this version starting from 1.651.x. But we're still at risk in particular corner cases.

          I'm working on additional fixes around this behavior.


          Oleg Nenashev added a comment -

          Created additional pull request: https://github.com/jenkinsci/remoting/pull/100


          Oleg Nenashev added a comment -

          Additional fixes have been integrated into remoting-2.62.4 and remoting-3.3.


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          http://jenkins-ci.org/commit/jenkins/ef588be4f264b5ba285110f472f031e2bd771c71
          Log:
          Update Jenkins remoting to 3.3 (#2671)

          • JENKINS-25218 - Hardening of FifoBuffer operation logic. The change improves the original fix in `remoting-2.54`.
          • JENKINS-39547 - Corrupt agent JAR cache causes agents to malfunction.

          Improvements:

          • JENKINS-40491 - Improve diagnostics of the preliminary FifoBuffer termination.
          • ProxyException now retains any suppressed exceptions.


          Chris Phillips added a comment - edited

          I have just been investigating a problem in our Jenkins setup that I think might be related to this. We're using the EC2 plugin and running builds that generate quite large logs (230 MB). At some point during the build, the master loses track of the log and just starts logging the same block of text from the log over and over for as long as I let it. The build completes successfully on the slave and nothing bad appears in the Node log in the Jenkins UI, but the master continues to fill up the filesystem with the same repeated text forever. I changed the build to log much less and now this isn't happening. We're running 2.46.2. Could this potentially be one of the edge cases? https://issues.jenkins-ci.org/browse/JENKINS-44483


          Oleg Nenashev added a comment -

          As we discussed in JENKINS-44483, it is likely a Pipeline-specific issue.

          Regarding this ticket, I am going to close it, since additional patches have been applied. If somebody still sees it with remoting 3.5+, please let me know.


            Assignee: Oleg Nenashev (oleg_nenashev)
            Reporter: Kohsuke Kawaguchi (kohsuke)
            Votes: 7
            Watchers: 13
