-
Bug
-
Resolution: Incomplete
-
Blocker
-
Jenkins master 1.642.1 on CentOS 6.6 with OpenJDK 7u75, running as a VM under ESXi 5.5. The slave is Windows 7 Pro SP1, 64-bit, JRE 8u71; slave.jar is v2.47. The master is located in Tigard, OR; the slave is in Ukraine. The WAN is Sprint MPLS. See the attachment installed_plugins.htm for a list of installed plugins.
Chronic, intermittent slave disconnects affect many Windows slaves. The jobs typically take 10 hours, and the disconnects occur around 4 to 5 hours into a job. The stack trace follows:
Slave went offline during the build
ERROR: Connection was broken: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at hudson.remoting.SocketChannelStream$1.read(SocketChannelStream.java:35)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
I had the same issue here with 20 slaves on a Jenkins cluster running 107k JUnit tests all at once. The cause turned out to be the high ping time between the slaves and the master. After I put them onto a dedicated switch with the same VLAN upstream, my problem went away. I also noted that high load on my build machine causes a considerably higher ping time, but only as reported in Jenkins. That may be down to the way the slave is written rather than the network.
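To tell network latency apart from load-induced delay inside the slave process, one option is to time plain TCP connects from the slave machine to the master while a build is running, and compare them with the response time Jenkins shows for the node. The following is only a minimal sketch: `LatencyProbe`, its method names, the default host `localhost`, and the default port `8080` are assumptions for illustration and are not part of Jenkins or slave.jar. It uses only `java.net.Socket` from the JDK.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic helper; not part of Jenkins or the remoting library. */
class LatencyProbe {

    /** Times a single TCP connect to host:port and returns the elapsed milliseconds. */
    public static long connectMillis(String host, int port, int timeoutMs) throws IOException {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Returns the largest sample, or -1 if the array is empty. */
    public static long worst(long[] samples) {
        long max = -1;
        for (long s : samples) {
            max = Math.max(max, s);
        }
        return max;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults; pass the real master host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        int timeoutMs = 5000;
        long[] samples = new long[10];

        for (int i = 0; i < samples.length; i++) {
            try {
                samples[i] = connectMillis(host, port, timeoutMs);
                System.out.println("connect " + (i + 1) + ": " + samples[i] + " ms");
            } catch (IOException e) {
                // Count a failed or timed-out connect as the full timeout.
                samples[i] = timeoutMs;
                System.out.println("connect " + (i + 1) + " failed: " + e.getMessage());
            }
            Thread.sleep(1000); // one probe per second
        }
        System.out.println("worst: " + worst(samples) + " ms");
    }
}
```

If the connect times stay low while the response time reported by Jenkins climbs under load, that would point toward the slave process rather than the WAN. If both rise together, the network path is the more likely suspect.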