Jenkins / JENKINS-25858

java.io.IOException: Unexpected termination of the channel

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Components: remoting, ssh-slaves-plugin

      We are seeing slaves displayed as offline with an "Unexpected termination of the channel" IOException. The slaves are actually running, as seen in our AWS Management Console, but the connection between the Jenkins master and the slave is broken. This behavior is fairly recent, starting with maybe 1.584.

          [JENKINS-25858] java.io.IOException: Unexpected termination of the channel

          Luis Arias added a comment -

          Oops I forgot to include the stack trace on the slave page:

          java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

          Note that clicking on Launch Slave Agent re-establishes the connection.
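
          Since relaunching from the UI restores the channel, the same action can be scripted; a minimal sketch using the Jenkins CLI, where the node name, URL, and credentials are placeholders (recent CLI versions accept -auth):

          # fetch the CLI jar from the master
          curl -sO http://jenkins.example.com/jnlpJars/jenkins-cli.jar
          # equivalent to clicking "Launch Slave Agent" on the node page
          java -jar jenkins-cli.jar -s http://jenkins.example.com/ -auth user:apitoken connect-node my-slave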


          John Sposato added a comment - edited

          Having the same issue; our slaves are launched using the EC2 plugin.

          java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2335)
          	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2804)
          	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
          	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          CentOS 7


          David Carlton added a comment -

          I'm seeing this as well - I lose one or two slaves a day. Sample backtrace:

          FATAL: java.io.IOException: Unexpected termination of the channel
          hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.Request.abort(Request.java:297)
                  at hudson.remoting.Channel.terminate(Channel.java:844)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
                  at ......remote call to Elastic Slave (i-4cc13bf9)(Native Method)
                  at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1413)
                  at hudson.remoting.Request.call(Request.java:172)
                  at hudson.remoting.Channel.call(Channel.java:777)
                  at hudson.FilePath.act(FilePath.java:978)
                  at hudson.FilePath.act(FilePath.java:967)
                  at hudson.FilePath.untar(FilePath.java:519)
                  at hudson.plugins.cloneworkspace.CloneWorkspacePublisher$WorkspaceSnapshotTar.restoreTo(CloneWorkspacePublisher.java:243)
                  at hudson.plugins.cloneworkspace.CloneWorkspaceSCM$Snapshot.restoreTo(CloneWorkspaceSCM.java:396)
                  at hudson.plugins.cloneworkspace.CloneWorkspaceSCM.checkout(CloneWorkspaceSCM.java:152)
                  at hudson.model.AbstractProject.checkout(AbstractProject.java:1274)
                  at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
                  at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
                  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:531)
                  at hudson.model.Run.execute(Run.java:1738)
                  at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:531)
                  at hudson.model.ResourceController.execute(ResourceController.java:98)
                  at hudson.model.Executor.run(Executor.java:381)
          Caused by: java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
                  at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2332)
                  at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2801)
                  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
                  at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
                  at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
                  at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          I don't have the slave-side log for that specific error, but on earlier failures that I looked at, the slave said something like this:

          2016-03-03 16:30:38.483-0800 [id=10]	SEVERE	h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel channel
          java.io.IOException: Unexpected EOF
          	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:99)
          	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          I'm using the EC2 plugin to launch slaves; Ubuntu 14.04.2, Jenkins 1.609.3. I don't yet have solid evidence as to whether the network disconnects are caused by Jenkins code or by something in the networking stack, but it would be nice if Jenkins could handle this better somehow - not have the slave process die in this scenario and then reconnect to it, or something.


          David Carlton added a comment -

          Also, I see this bug marked as a duplicate of JENKINS-20101, but I don't see clear evidence that they're the same - that bug seems to be a situation where the master died for some reason, whereas in my situation (and, as far as I can tell, for the other people commenting here), the master is running fine the whole time; it just loses its connection to a slave.


          joao cravo added a comment -

          Hey guys!

          I'm running Jenkins on AWS EC2 and I got this same problem. I was able to fix it with
          https://github.com/scala/scala-jenkins-infra/issues/26#issuecomment-73825006

          I don't know exactly why, but it fixed it. Can anyone explain why?

          Cheers


          David Carlton added a comment -

          This bug is tagged as ssh-slaves-plugin, but now that I understand the EC2 plugin, I realize that it replaces the SSH Slaves Plugin; I don't know for sure which plugin the original poster was using, but it seems like the rest of us are using EC2. I've been experimenting for the last week with setting "Connect by SSH Process" (under the advanced options) to true, and I haven't had any problems since I've done that, so I would recommend that to other people who are running into this symptom with the EC2 option. (It was added for exactly this reason, see https://github.com/jenkinsci/ec2-plugin/pull/139 )

          There's still presumably a collection of potential underlying non-Jenkins issues - maybe kernel problems, maybe security groups, who knows, probably different for different installations - but it seems to me like the external ssh program is noticeably better at dealing with temporary network glitches than Trilead is.


          Real Name added a comment -

          davidcarltonsumo but I'm getting the same with just *ssh-slaves-plugin*.


          Shannon Kerr added a comment -

          We just started getting this in the last couple of weeks. I don't remember seeing this before, but now we seem to see a failure every day or every other day over the last week. We are not using the EC2 plugin; we are only using the "ssh-slaves-plugin". The slave is a macOS VM that has not had any recent updates, and we've been using it for years.

          ERROR: Connection was broken: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
          	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
          	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
          	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
          


          HemangLavana added a comment -

          We have also started seeing this issue recently, and it happens very often now (almost daily). We are using SSH Slaves plugin 1.20, SSH Agent plugin 1.15, and Jenkins LTS 2.60.1, and all slaves are connected to Linux hosts.

          Here's the stack trace:
          java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2335)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2804)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
          Caused: java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
          Caused: java.io.IOException
          at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:169)
          at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
          at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
          at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
          at java.io.InputStreamReader.read(InputStreamReader.java:184)
          at java.io.Reader.read(Reader.java:140)
          at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2001)
          at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1980)
          at org.apache.commons.io.IOUtils.copy(IOUtils.java:1957)
          at org.apache.commons.io.IOUtils.copy(IOUtils.java:1907)
          at org.apache.commons.io.IOUtils.toString(IOUtils.java:778)
          at org.apache.commons.io.IOUtils.toString(IOUtils.java:803)
          at org.jenkinsci.plugins.workflow.steps.ReadFileStep$Execution.run(ReadFileStep.java:97)
          at org.jenkinsci.plugins.workflow.steps.ReadFileStep$Execution.run(ReadFileStep.java:86)
          at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1$1.call(SynchronousNonBlockingStepExecution.java:49)
          at hudson.security.ACL.impersonate(ACL.java:260)
          at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1.run(SynchronousNonBlockingStepExecution.java:46)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:745)


          Rick Liu added a comment -

          Jenkins server/slave OS: Ubuntu 14.04.5 LTS
          Jenkins server/slave openJDK: 8u141-b15-3~14.04
          Jenkins: 2.89.2
          SSH-slave-plugin: 1.23

          Jenkins.log:

          Dec 21, 2017 12:17:09 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
          SEVERE: I/O error in channel jenkins-smoke-slave03(192.168.100.94)
          java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
          Caused by: java.io.EOFException
                  at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
                  at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
                  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
                  at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
                  at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
                  at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          

          Slave log:

          Dec 21, 2017 12:15:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
          INFO: rate(1min) = 381.9±905.3/sec; rate(5min) = 363.6±923.4/sec; rate(15min) = 335.3±927.4/sec; rate(total) = 100.3±521.0/sec; N = 35,086
          Dec 21, 2017 12:16:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
          INFO: rate(1min) = 272.0±705.3/sec; rate(5min) = 324.8±863.5/sec; rate(15min) = 322.8±905.9/sec; rate(total) = 100.3±521.0/sec; N = 35,098
          Dec 21, 2017 12:17:09 PM hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
          INFO: rate(1min) = 321.9±768.9/sec; rate(5min) = 333.2±865.8/sec; rate(15min) = 326.3±905.0/sec; rate(total) = 100.4±521.2/sec; N = 35,110
          ERROR: Connection terminated
          java.io.EOFException
                  at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
                  at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
                  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
                  at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
                  at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
                  at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          Caused: java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
          ERROR: Socket connection to SSH server was lost
          java.io.IOException: Peer sent DISCONNECT message (reason code 2): Packet corrupt
                  at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:779)
                  at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
                  at java.lang.Thread.run(Thread.java:748)
          Slave JVM has not reported exit code before the socket was lost
          [12/21/17 12:17:09] [SSH] Connection closed.
          

          This "Unexpected termination of the channel" has happened every day (3 days in a row) on random slaves.
          I updated the Jenkins core and all the plugins to the latest on Dec 19, 2017
          (the previous Jenkins core and plugins had last been updated in April 2017):
          Jenkins Core: 2.46.2
          SSH Slaves plugin: 1.16


          Oleg Nenashev added a comment -

          totoroliu in your case there was a packet corruption in the channel: "Peer sent DISCONNECT message (reason code 2): Packet corrupt". I would suggest creating a separate issue for that so paladox and mc1arke could investigate it. They are the current Trilead SSH maintainers.


          Rick Liu added a comment -

          Thank you very much oleg_nenashev,
          I really appreciate your help and explanation!

          I have created a separate bug: JENKINS-48850


          Oleg Nenashev added a comment -

          Bulk issue update: The plugin connectivity is still unstable from what I see in this and other reports. Probably the recent patches in 1.24-1.25 caused some extra instability by getting rid of interlocks between agent connection and termination logic. Apparently it impacts some reconnection scenarios due to the race conditions.

          Unfortunately I do not have capacity to work on the plugin in the medium term, so for now I am unassigning issues from myself. ifernandezcalvo was very kind to take ownership of the plugin and to handle some of the workload in it. Probably he will have some capacity to review the backlog I was unable to triage.


          Vassilena Treneva added a comment -

          Using Jenkins ver. 2.113 with SSH Slaves plugin 1.26. All our UNIX, Windows and OSX slaves are using SSH connections.
          We are seeing a lot of these. Is it the same issue?

           

          ============================

          Slave JVM has not reported exit code. Is it still running?
          ERROR: Connection terminated
          java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2671)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3146)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:858)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:354)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          Caused: java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)

           

          ============================

           

          ERROR: Connection terminated
          java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2671)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3146)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:858)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:354)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          Caused: java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
          ERROR: Socket connection to SSH server was lost
          java.net.SocketException: Connection reset
          at java.net.SocketInputStream.read(SocketInputStream.java:210)
          at java.net.SocketInputStream.read(SocketInputStream.java:141)
          at com.trilead.ssh2.crypto.cipher.CipherInputStream.fill_buffer(CipherInputStream.java:41)
          at com.trilead.ssh2.crypto.cipher.CipherInputStream.internal_read(CipherInputStream.java:52)
          at com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:79)
          at com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
          at com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
          at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:706)
          at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
          at java.lang.Thread.run(Thread.java:748)
          Slave JVM has not reported exit code before the socket was lost
          [04/05/18 17:10:24] [SSH] Connection closed.


          Ivan Fernandez Calvo added a comment -

          Overall recommendations:

          • It is recommended to use a JDK as close as possible to, and in the same major version as, the one used by the Jenkins instance and the Agents.
          • It is recommended to tune the TCP stack on the Jenkins instance and the Agents (a keepalive sketch follows this list):
            • On Linux: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
            • On Windows: https://blogs.technet.microsoft.com/nettracer/2010/06/03/things-that-you-may-want-to-know-about-tcp-keepalives/
            • On Mac: https://www.gnugk.org/keepalive.html
          • You should check for hs_err_pid error files in the root fs of the agent: http://www.oracle.com/technetwork/java/javase/felog-138657.html#gbwcy
          • It is recommended to set the initial heap of the Agent to at least 512M (-Xmx512m -Xms512m); you could start with 512m and adjust the value until you find one that suits your Agents.
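
          For the Linux TCP keepalive tuning above, here is a minimal sketch of the kind of settings the linked HOWTO describes; the exact values are assumptions and should be shorter than any firewall/NAT idle timeout between master and agents:

          # show the current kernel keepalive settings
          sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes
          # example: probe idle connections after 2 minutes, every 30 seconds, give up after 8 failed probes
          printf '%s\n' \
            'net.ipv4.tcp_keepalive_time = 120' \
            'net.ipv4.tcp_keepalive_intvl = 30' \
            'net.ipv4.tcp_keepalive_probes = 8' | sudo tee /etc/sysctl.d/99-tcp-keepalive.conf
          sudo sysctl --system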

          Francois Aichelbaum added a comment -

          Hi Ivan,

          I'm curious why you're pointing out, in the various threads regarding this issue, that we all need to review our TCP stack tuning, when it used to work and stopped working after some update of either your code or the Linux kernel.

          As it also impacts Windows and Mac OS machines, I guess the culprit might not be the kernel or the TCP stack tuning.


          Ivan Fernandez Calvo added a comment -

          Those are overall recommendations; check whether your Agents satisfy them, as they resolve 90% of Agent issues.


          Francois Aichelbaum added a comment -

          As said on the other thread, we already went through the ones you can find on the Internet and they solved nothing, as ... it was working before a change in the code.


          Vassilena Treneva added a comment -

          faichelbaum, do you know which version of the server/ssh plugin does NOT have this issue?


          Francois Aichelbaum added a comment - edited

          The ones prior to December 12th, as that was apparently the last update we did without issues.

          The next one (for us) was at the beginning of January, and things got even worse following the rush of kernel updates, hardware firmware updates, and even more so with the VMware patches.


          Francois Aichelbaum added a comment -

          This morning, we applied those new TCP optimizations and it failed as usual ...


          Ivan Fernandez Calvo added a comment -

          faichelbaum Do you use vMotion on those VMware VMs? We know that the Trilead library is really sensitive to network performance issues; we are trying to improve this to make the SSH connection more robust.


          Francois Aichelbaum added a comment -

          We tried all that, going as far as dedicating a VMware ESX host with more resources than needed and pinning all the VMs to that host. Anyhow, the LAN between our hosts is 10 Gbps, and we have no connection issues between either the hosts or the VMs for the various services we host, besides Jenkins.

          As said, it worked and went poof with the Jenkins updates and the Spectre/Meltdown patches that came at the same time.


          Vernon Lingley added a comment -

          We just saw this one as well. So far it's only been on our AWS-based Ubuntu 16.04 nodes. We're running Jenkins 2.73.3 and ssh-slaves plugin 1.21.

          06:11:18 java.io.EOFException
          06:11:18 	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2671)
          06:11:18 	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3146)
          06:11:18 	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:858)
          06:11:18 	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:354)
          06:11:18 	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          06:11:18 	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
          06:11:18 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
          06:11:18 Caused: java.io.IOException: Unexpected termination of the channel
          06:11:18 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
          06:11:18 Caused: hudson.remoting.ChannelClosedException: Remote call on AWS-LIN-18 failed. The channel is closing down or has closed down
          06:11:18 	at hudson.remoting.Channel.call(Channel.java:898)
          06:11:18 	at hudson.Launcher$RemoteLauncher.kill(Launcher.java:1079)
          06:11:18 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:502)
          06:11:18 	at hudson.model.Run.execute(Run.java:1737)
          06:11:18 	at hudson.matrix.MatrixBuild.run(MatrixBuild.java:314)
          06:11:18 	at hudson.model.ResourceController.execute(ResourceController.java:97)
          06:11:18 	at hudson.model.Executor.run(Executor.java:421)
          

          We can try upgrading the plugin to 1.23, but it looks like it may go back further than that release.


          shraddha Magar added a comment -

          I am also facing the same issue of the agent going offline during a build.

          I am using Jenkins v2.105 and JRE 1.8.

          I am using Linux as the master and IBM AIX and Windows Server 2K12 as slaves. We are executing nightly builds on the slaves, but sometimes the build won't complete because the agent goes offline, so if anybody has a workaround for this issue, please let me know.

          Thanks in advance.


          Ivan Fernandez Calvo added a comment -

          shrapm Without the exception trace it is not possible to diagnose anything. Do you have the error trace in the job and the corresponding error trace in the Jenkins logs? In the next version, the plugin will support remoting debug by default; I hope this will help to diagnose this kind of error.


          shraddha Magar added a comment -

          Below is the error I can see in the build log:

          Agent went offline during the build
          ERROR: Connection was broken: java.io.EOFException
           at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2671)
           at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3146)
           at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:858)
           at java.io.ObjectInputStream.<init>(ObjectInputStream.java:354)
           at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
           at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
           at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          Caused: java.io.IOException: Unexpected termination of the channel
           at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)


          Ivan Fernandez Calvo added a comment -

          This is on the Jenkins side, right? This only says that the channel is broken, not the root cause. Do you have any logs in the Agent workdir folder? Also, check the latest executed build on that Agent to see if there is more info, and see https://github.com/jenkinsci/remoting/blob/master/docs/logging.md to enable logging on the Agent side.

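          A minimal sketch of enabling extra agent-side logging via standard java.util.logging (the file paths and levels here are assumptions; the remoting document linked above describes the plugin-specific options):

          # write a java.util.logging configuration that captures remoting at FINE level
          printf '%s\n' \
            'handlers = java.util.logging.FileHandler' \
            'java.util.logging.FileHandler.pattern = /home/jenkins/remoting-agent-%u.log' \
            'java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter' \
            '.level = INFO' \
            'hudson.remoting.level = FINE' > /home/jenkins/remoting-logging.properties
          # launch the agent by hand with that configuration (the SSH Slaves plugin normally builds this command)
          java -Djava.util.logging.config.file=/home/jenkins/remoting-logging.properties -jar agent.jar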

          shraddha Magar added a comment - edited

          Thanks for your quick reply.

          Yes. Those logs were from Jenkins side.

          Actually, the slave goes offline for a very short time and comes back online immediately, so I am not able to get the logs from the agent. But due to that short outage the job fails.


          Ivan Fernandez Calvo added a comment -

          Which SSH Slaves Plugin version do you use? Did you add the `-workDir AGENT_WORK_DIR` parameter to your suffix command configuration to grab the logs?

          https://github.com/jenkinsci/remoting/blob/master/docs/workDir.md

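          For reference, when a work directory is passed via the suffix command, the agent launch line effectively becomes something like the sketch below, and remoting then keeps rotated logs under that directory (paths here are illustrative):

          # what ends up running on the agent when "-workDir /home/jenkins/agent" is appended
          java -jar remoting.jar -workDir /home/jenkins/agent
          # agent-side remoting logs then land under the work directory
          ls /home/jenkins/agent/remoting/logs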

          shraddha Magar added a comment -

          Also, I have restarted the Jenkins instance, and then the agent which is configured on Windows with JNLP went offline and is not coming back online; it is showing the error below.

          Aug 29, 2018 1:43:04 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Locating server among http://x.x.x.x:x/jenkins/
          Aug 29, 2018 1:43:04 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
          INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
          Aug 29, 2018 1:43:04 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Agent discovery successful
            Agent address: x.x.x.x
            Agent port:    37091
            Identity:      51:a8:0d:a8:18:dd:db:d8:ea:18:b7:98:da:76:b2:ae
          Aug 29, 2018 1:43:04 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Handshaking
          Aug 29, 2018 1:43:04 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Connecting to x.x.x.x:37091
          Aug 29, 2018 1:43:35 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Connecting to x.x.x.x:37091 (retrying:2)
          java.io.IOException: Failed to connect to x.x.x.x:37091
           at org.jenkinsci.remoting.engine.JnlpAgentEndpoint.open(JnlpAgentEndpoint.java:242)
           at hudson.remoting.Engine.connect(Engine.java:686)
           at hudson.remoting.Engine.innerRun(Engine.java:547)
           at hudson.remoting.Engine.run(Engine.java:469)
          Caused by: java.net.ConnectException: Connection timed out: connect
           at sun.nio.ch.Net.connect0(Native Method)
           at sun.nio.ch.Net.connect(Unknown Source)
           at sun.nio.ch.Net.connect(Unknown Source)
           at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
           at java.nio.channels.SocketChannel.open(Unknown Source)
           at org.jenkinsci.remoting.engine.JnlpAgentEndpoint.open(JnlpAgentEndpoint.java:203)
           ... 3 more

           

          Could you please help me out?

          I just replaced the master IP with x.x.x.x.


          Ivan Fernandez Calvo added a comment -

          This Windows Agent is a JNLP Agent, so your issue is not related to the SSH Slaves plugin.

          It seems to be a network/configuration issue, not a bug; ask on the Google user groups, see https://wiki.jenkins.io/display/JENKINS/How+to+report+an+issue
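
          A quick way to rule connectivity in or out, reusing the placeholder addresses from the log above: from the agent host, check that both the Jenkins web UI and the fixed TCP agent port are reachable.

          # web UI reachable? (the 8080 port and /jenkins/ prefix are assumptions, adjust to your setup)
          curl -sI http://x.x.x.x:8080/jenkins/login | head -n 1
          # the fixed TCP agent port reported in the log above
          nc -vz x.x.x.x 37091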


          shraddha Magar added a comment -

          Yes, this agent is configured through JNLP.

          The other issue is for an AIX machine. We have agents configured on both AIX and Windows. We are using SSH Slaves plugin 1.26.


          Michael Spiegel added a comment -

          I'm posting here because this was the most prominent result in Google and I want to share my solution in case it helps somebody else.

          The last lines in our jenkins node log on the master instance were:

          Sep 23, 2022 10:56:28 AM null
          INFO: Launching Jenkins agent via plugin SSH: java -jar /tmp/agent.jar
          Sep 23, 2022 10:56:28 AM null
          WARNING: Error:  Exception: java.io.EOFException: unexpected stream termination

          Running the command `java -jar /tmp/agent.jar` on the node revealed that the Java version in use could not run the new jar, which was probably introduced by some Jenkins and plugin update.

          Upgrading the node host from OpenJDK 8 to 11 fixed the problem for us.
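
          A quick check for this failure mode, assuming the agent jar was copied to /tmp/agent.jar as in the log above: compare the agent host's Java with what the controller now requires, and try running the jar by hand.

          # on the agent host
          java -version            # e.g. OpenJDK 8 here, while recent agent.jar builds need Java 11+
          java -jar /tmp/agent.jar # an UnsupportedClassVersionError means the local JRE is too old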


            Assignee: Ivan Fernandez Calvo (ifernandezcalvo)
            Reporter: Luis Arias (balsamiqluis2)
            Votes: 17
            Watchers: 28
