Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Cannot Reproduce
Jenkins server/slave OS: Ubuntu 14.04.5 LTS
Jenkins server/slave openJDK: 8u141-b15-3~14.04
Jenkins: 2.89.2
SSH-slave-plugin: 1.23
Description
Related to: JENKINS-25858 and JENKINS-48810
Per a suggestion from oleg_nenashev, I'm opening a separate bug ticket for further investigation.
Jenkins Server log:
Dec 21, 2017 12:17:09 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel jenkins-smoke-slave03(192.168.100.94)
java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Jenkins Slave log:
Dec 21, 2017 12:15:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
INFO: rate(1min) = 381.9±905.3/sec; rate(5min) = 363.6±923.4/sec; rate(15min) = 335.3±927.4/sec; rate(total) = 100.3±521.0/sec; N = 35,086
Dec 21, 2017 12:16:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
INFO: rate(1min) = 272.0±705.3/sec; rate(5min) = 324.8±863.5/sec; rate(15min) = 322.8±905.9/sec; rate(total) = 100.3±521.0/sec; N = 35,098
Dec 21, 2017 12:17:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
INFO: rate(1min) = 321.9±768.9/sec; rate(5min) = 333.2±865.8/sec; rate(15min) = 326.3±905.0/sec; rate(total) = 100.4±521.2/sec; N = 35,110
ERROR: Connection terminated
java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
ERROR: Socket connection to SSH server was lost
java.io.IOException: Peer sent DISCONNECT message (reason code 2): Packet corrupt
    at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:779)
    at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
    at java.lang.Thread.run(Thread.java:748)
Slave JVM has not reported exit code before the socket was lost
[12/21/17 12:17:09] [SSH] Connection closed.
This "Unexpected termination of the channel" error has happened every day (3 days in a row), on a random slave each time, since I updated the Jenkins core and all the plugins to the latest versions on Dec 19, 2017.
Before that, the Jenkins core and plugins were last updated in April 2017:
Jenkins Core: 2.46.2
SSH-slave plugin: 1.16
Because the random "Unexpected termination of the channel" errors were occurring more often than usual,
on Dec 22, 2017 I downgraded the Jenkins core and SSH-slave plugin to:
Jenkins Core: 2.60.3 (whose remoting version should be the same as in 2.46.2, based on the changelog)
SSH-slave plugin: 1.16
The issue has eased since the downgrade,
but the random "Unexpected termination of the channel" error has still happened a couple of times so far.
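To put numbers on "more than usual" versus "a couple of times", the server log can be tallied per day and per channel. The following is only a minimal sketch: it assumes the log entries look like the Jenkins server log excerpt above, and the names `PATTERN`, `SAMPLE`, and `count_terminations` are illustrative, not part of Jenkins.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: each entry follows the layout of the server log excerpt in this
# ticket (timestamp header, "SEVERE: I/O error in channel <name>", then the
# IOException message). \s+ lets the pattern match both multi-line and
# flattened single-line logs.
PATTERN = re.compile(
    r"(?P<date>[A-Z][a-z]{2} \d{1,2}, \d{4}) \d{1,2}:\d{2}:\d{2} [AP]M\s+"
    r"hudson\.remoting\.SynchronousCommandTransport\$ReaderThread run\s+"
    r"SEVERE: I/O error in channel (?P<channel>\S+)\s+"
    r"java\.io\.IOException: Unexpected termination of the channel"
)

# Sample text copied from the server log in this ticket.
SAMPLE = (
    "Dec 21, 2017 12:17:09 PM "
    "hudson.remoting.SynchronousCommandTransport$ReaderThread run\n"
    "SEVERE: I/O error in channel jenkins-smoke-slave03(192.168.100.94)\n"
    "java.io.IOException: Unexpected termination of the channel\n"
)


def count_terminations(log_text):
    """Count channel terminations keyed by (ISO date, channel name)."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        # %b parsing assumes English month abbreviations (the default C locale).
        day = datetime.strptime(match.group("date"), "%b %d, %Y").date()
        counts[(day.isoformat(), match.group("channel"))] += 1
    return counts


if __name__ == "__main__":
    for (day, channel), n in sorted(count_terminations(SAMPLE).items()):
        print(day, channel, n)
```

Running the same function over the full server log from before and after the downgrade would give a per-day count for comparing the two configurations.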
Issue Links
- relates to JENKINS-48810 How to disable hudson.remoting.RemoteInvocationHandler$Unexporter reportStats? (Resolved)
marlowa, a lot of people have good success keeping channels alive over many builds or long builds. Certainly there are also a number of cases where people have reliability problems for a wide variety of reasons. Sometimes they're able to stabilize or strengthen their environment and these problems disappear. Most of the time they don't provide enough information for anyone who isn't local to diagnose anything. There isn't much reason to keep multiple, duplicate tickets open that all lack information or continued response.
With your acknowledged network unreliability, you may also want to give the remoting-kafka-plugin a try. I'd suggest running a test build environment or trying it with a few jobs. One of the reasons for creating the new plugin was a hope for improved reliability. Your unreliable network may be a good test case for that.
My network is pretty reliable so I can't reproduce any of these reports or give a good workout to the remoting-kafka-plugin.