Jenkins slaves running on Windows have a tendency to deadlock when programs are sending a lot of data simultaneously.
Reading the threaddump from VisualVM we can see that proc.java tries to
write to deadlock stream resulting in hanging builds
String msg = "Process leaked file descriptors. See http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build for more information"; Throwable e = new Exception().fillInStackTrace(); LOGGER.log(Level.WARNING,msg,e);
"pool-1-thread-833" - Thread t@1926010 java.lang.Thread.State: BLOCKED at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <1502546> (a java.util.logging.ConsoleHandler) owned by "pool-1-thread-927" t@1926604 at java.util.logging.ConsoleHandler.publish(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at hudson.Proc$LocalProc.join(Proc.java:330) at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:993) at sun.reflect.GeneratedMethodAccessor127.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:275) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:256) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215) at hudson.remoting.UserRequest.perform(UserRequest.java:118) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:326) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at hudson.remoting.Engine$1$1.run(Engine.java:58) at java.lang.Thread.run(Unknown Source)
[JENKINS-18011] Deadlock issue remote slaves
Full thread dump Java HotSpot(TM) Client VM (23.7-b01 mixed mode): "RMI TCP Connection(7)-10.158.226.24" - Thread t@1933099 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) - locked <9adbe1> (a java.io.BufferedInputStream) at java.io.FilterInputStream.read(Unknown Source) at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - locked <1647392> (a java.util.concurrent.ThreadPoolExecutor$Worker) "pool-1-thread-1016" - Thread t@1933098 java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <9aa718> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(Unknown Source) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(Unknown Source) at java.util.concurrent.SynchronousQueue.poll(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at hudson.remoting.Engine$1$1.run(Engine.java:58) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI TCP Connection(6)-10.158.226.24" - Thread t@1933097 java.lang.Thread.State: RUNNABLE at sun.management.ThreadImpl.dumpThreads0(Native Method) at sun.management.ThreadImpl.dumpAllThreads(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(Unknown Source) at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(Unknown Source) at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(Unknown Source) at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(Unknown Source) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source) at com.sun.jmx.mbeanserver.PerInterface.invoke(Unknown Source) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(Unknown Source) at javax.management.StandardMBean.invoke(Unknown Source) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(Unknown Source) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl.invoke(Unknown Source) at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source) at sun.rmi.transport.Transport$1.run(Unknown Source) at sun.rmi.transport.Transport$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Unknown Source) at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - locked <7d7a7c> (a java.util.concurrent.ThreadPoolExecutor$Worker) "JMX server connection timeout 1933096" - Thread t@1933096 java.lang.Thread.State: TIMED_WAITING at java.lang.Object.wait(Native Method) - waiting on <1e1170d> (a [I) at com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI TCP Connection(idle)" - Thread t@1933095 java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <20d484> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(Unknown Source) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(Unknown Source) at java.util.concurrent.SynchronousQueue.poll(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI Scheduler(0)" - Thread t@1933094 java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <b0742a> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI TCP Connection(2)-10.158.226.24" - Thread t@1933093 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) - locked <1ba7067> (a java.io.BufferedInputStream) at java.io.FilterInputStream.read(Unknown Source) at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - locked <1541dc5> (a java.util.concurrent.ThreadPoolExecutor$Worker) "pool-1-thread-927" - Thread t@1926604 java.lang.Thread.State: RUNNABLE at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(Unknown Source) at java.io.BufferedOutputStream.write(Unknown Source) - locked <115b373> (a java.io.BufferedOutputStream) at java.io.PrintStream.write(Unknown Source) - locked <1b86674> (a java.io.PrintStream) at sun.nio.cs.StreamEncoder.writeBytes(Unknown Source) at sun.nio.cs.StreamEncoder.implFlushBuffer(Unknown Source) at sun.nio.cs.StreamEncoder.implFlush(Unknown Source) at sun.nio.cs.StreamEncoder.flush(Unknown Source) - locked <14e53c9> (a java.io.OutputStreamWriter) at java.io.OutputStreamWriter.flush(Unknown Source) at java.util.logging.StreamHandler.flush(Unknown Source) - locked <1502546> (a java.util.logging.ConsoleHandler) at java.util.logging.ConsoleHandler.publish(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at hudson.Proc$LocalProc.join(Proc.java:330) at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:993) at sun.reflect.GeneratedMethodAccessor127.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:275) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:256) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215) at hudson.remoting.UserRequest.perform(UserRequest.java:118) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:326) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at hudson.remoting.Engine$1$1.run(Engine.java:58) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - locked <983e27> (a java.util.concurrent.ThreadPoolExecutor$Worker) "pool-1-thread-833" - Thread t@1926010 java.lang.Thread.State: BLOCKED at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <1502546> (a java.util.logging.ConsoleHandler) owned by "pool-1-thread-927" t@1926604 at java.util.logging.ConsoleHandler.publish(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at hudson.Proc$LocalProc.join(Proc.java:330) at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:993) at sun.reflect.GeneratedMethodAccessor127.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:275) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:256) at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215) at hudson.remoting.UserRequest.perform(UserRequest.java:118) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:326) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at hudson.remoting.Engine$1$1.run(Engine.java:58) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - locked <1f1b3ca> (a java.util.concurrent.ThreadPoolExecutor$Worker) "Ping thread for channel hudson.remoting.Channel@421208:channel" - Thread t@1925798 java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at hudson.remoting.PingThread.run(PingThread.java:86) Locked ownable synchronizers: - None "Pipe writer thread: channel" - Thread t@1925797 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <bf405e> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(Unknown Source) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source) at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "Channel reader thread: channel" - Thread t@1925795 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) - locked <1fee5ae> (a java.io.BufferedInputStream) at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source) at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source) at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source) at java.io.ObjectInputStream.readObject0(Unknown Source) at java.io.ObjectInputStream.readObject(Unknown Source) at hudson.remoting.Command.readFrom(Command.java:92) at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) Locked ownable synchronizers: - None "Thread-1" - Thread t@15 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <421208> (a hudson.remoting.Channel) at java.lang.Object.wait(Object.java:503) at hudson.remoting.Channel.join(Channel.java:800) at hudson.remoting.Engine.run(Engine.java:242) Locked ownable synchronizers: - None "RMI TCP Accept-0" - Thread t@13 java.lang.Thread.State: RUNNABLE at java.net.DualStackPlainSocketImpl.accept0(Native Method) at java.net.DualStackPlainSocketImpl.socketAccept(Unknown Source) at java.net.AbstractPlainSocketImpl.accept(Unknown Source) at java.net.PlainSocketImpl.accept(Unknown Source) - locked <dd39a> (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(Unknown Source) at java.net.ServerSocket.accept(Unknown Source) at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI TCP Accept-11000" - Thread t@12 java.lang.Thread.State: RUNNABLE at java.net.DualStackPlainSocketImpl.accept0(Native Method) at java.net.DualStackPlainSocketImpl.socketAccept(Unknown Source) at java.net.AbstractPlainSocketImpl.accept(Unknown Source) at java.net.PlainSocketImpl.accept(Unknown Source) - locked <1bf57ce> (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(Unknown Source) at java.net.ServerSocket.accept(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "RMI TCP Accept-0" - Thread t@11 java.lang.Thread.State: RUNNABLE at java.net.DualStackPlainSocketImpl.accept0(Native Method) at java.net.DualStackPlainSocketImpl.socketAccept(Unknown Source) at java.net.AbstractPlainSocketImpl.accept(Unknown Source) at java.net.PlainSocketImpl.accept(Unknown Source) - locked <11da089> (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(Unknown Source) at java.net.ServerSocket.accept(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Locked ownable synchronizers: - None "JDWP Event Helper Thread" - Thread t@7 java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "JDWP Transport Listener: dt_socket" - Thread t@6 java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Attach Listener" - Thread t@5 java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Signal Dispatcher" - Thread t@4 java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Finalizer" - Thread t@3 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <16a5d2> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(Unknown Source) at java.lang.ref.ReferenceQueue.remove(Unknown Source) at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source) Locked ownable synchronizers: - None "Reference Handler" - Thread t@2 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <e08c66> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:503) at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source) Locked ownable synchronizers: - None "main" - Thread t@1 java.lang.Thread.State: WAITING at java.lang.Object.wait(Native Method) - waiting on <56d071> (a hudson.remoting.Engine) at java.lang.Thread.join(Unknown Source) at java.lang.Thread.join(Unknown Source) at hudson.remoting.jnlp.Main.main(Main.java:125) at hudson.remoting.jnlp.Main._main(Main.java:118) at hudson.remoting.Launcher.run(Launcher.java:209) at hudson.remoting.Launcher.main(Launcher.java:180) Locked ownable synchronizers: - NoneĀ“
We found to root to the problem. We have a windows service that launches the slave and for some reason we had enabled piping of stdout and stderr and the output from the slave.jar caused the pipes to deadlock. By disable the stderr and stdout piping the deadlock issue was resolved.
We found the issue by running slave.jar in a command line window and noticed that the findbugs plugin writes a lot of data to stdout and this was to much for our service to handle.
So I take it that at java.io.FileOutputStream.writeBytes(Native Method) is actually writing to a local stream that is deadlocked.
Technically, I don't see a deadlock in the thread-dump.
One thread is waiting for a lock which another thread holds, but that other thread is still RUNNABLE, so no deadlock there - unless I'm missing something.
It may appear very slow, but it's not a deadlock.
Related to https://github.com/jenkinsci/jenkins/pull/782