
java.io.IOException: Unexpected termination of the channel

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Components: remoting, ssh-slaves-plugin

      We are seeing slaves displayed as offline with an "Unexpected termination of the channel" IOException. The slaves are actually running, as seen in our AWS Management Console, but the connection between the Jenkins master and the slave is broken. This behavior is fairly recent, starting with perhaps 1.584.

          [JENKINS-25858] java.io.IOException: Unexpected termination of the channel

          Luis Arias created issue -

          Luis Arias added a comment -

          Oops I forgot to include the stack trace on the slave page:

          java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

          Note that clicking on Launch Slave Agent re-establishes the connection.
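
          For anyone who would rather script that "Launch slave agent" workaround than click through each node page, here is a minimal sketch for the Manage Jenkins » Script Console. It only uses the stock Computer API (isOffline, connect), and whether auto-reconnecting like this is appropriate depends on why the channel dropped in the first place.

          import jenkins.model.Jenkins

          // Sketch only: reconnect every offline agent -- the scripted equivalent
          // of clicking "Launch slave agent" on each node page.
          Jenkins.instance.computers.each { c ->
              // The master computer has an empty name and no remoting channel to relaunch.
              if (c.name && c.offline) {
                  println "Reconnecting ${c.name}: ${c.offlineCauseReason ?: 'no cause recorded'}"
                  c.connect(true)   // true forces a fresh connection attempt
              }
          }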

          Alex Java made changes -
          Link New: This issue is related to JENKINS-29550 [ JENKINS-29550 ]
          Alex Java made changes -
          Link New: This issue duplicates JENKINS-20101 [ JENKINS-20101 ]

          John Sposato added a comment - edited

          Having the same issue; our slaves are launched using the EC2 plugin.

          java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2335)
          	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2804)
          	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
          	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
          	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          CentOS 7


          David Carlton added a comment -

          I'm seeing this as well - I lose one or two slaves a day. Sample backtrace:

          FATAL: java.io.IOException: Unexpected termination of the channel
          hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.Request.abort(Request.java:297)
                  at hudson.remoting.Channel.terminate(Channel.java:844)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
                  at ......remote call to Elastic Slave (i-4cc13bf9)(Native Method)
                  at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1413)
                  at hudson.remoting.Request.call(Request.java:172)
                  at hudson.remoting.Channel.call(Channel.java:777)
                  at hudson.FilePath.act(FilePath.java:978)
                  at hudson.FilePath.act(FilePath.java:967)
                  at hudson.FilePath.untar(FilePath.java:519)
                  at hudson.plugins.cloneworkspace.CloneWorkspacePublisher$WorkspaceSnapshotTar.restoreTo(CloneWorkspacePublisher.java:243)
                  at hudson.plugins.cloneworkspace.CloneWorkspaceSCM$Snapshot.restoreTo(CloneWorkspaceSCM.java:396)
                  at hudson.plugins.cloneworkspace.CloneWorkspaceSCM.checkout(CloneWorkspaceSCM.java:152)
                  at hudson.model.AbstractProject.checkout(AbstractProject.java:1274)
                  at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
                  at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
                  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:531)
                  at hudson.model.Run.execute(Run.java:1738)
                  at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:531)
                  at hudson.model.ResourceController.execute(ResourceController.java:98)
                  at hudson.model.Executor.run(Executor.java:381)
          Caused by: java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
                  at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2332)
                  at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2801)
                  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
                  at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
                  at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
                  at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          I don't have the slave-side log for that specific error, but on earlier failures that I looked at, the slave said something like this:

          2016-03-03 16:30:38.483-0800 [id=10]	SEVERE	h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel channel
          java.io.IOException: Unexpected EOF
          	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:99)
          	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          I'm using the EC2 plugin to launch slaves; Ubuntu 14.04.2, Jenkins 1.609.3. I don't yet have solid evidence as to whether the network disconnects are caused by Jenkins code or by something in the networking stack, but it would be nice if Jenkins could handle this better somehow: not let the slave process die in this scenario, and then reconnect to it, or something along those lines.


          David Carlton added a comment -

          Also, I see this bug marked as a duplicate of JENKINS-20101, but I don't see clear evidence that they're the same: that bug seems to be a situation where the master died for some reason, whereas in my situation (and, as far as I can tell, for the other people commenting here) the master is running fine the whole time; it just loses its connection to a slave.


          joao cravo added a comment -

          Hey guys!

          I'm running Jenkins on AWS EC2 and ran into this same problem. I was able to fix it with
          https://github.com/scala/scala-jenkins-infra/issues/26#issuecomment-73825006

          I don't know exactly why, but it fixed the problem. Can anyone explain why?

          Cheers
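
          Without restating the linked comment, the usual remedy for SSH connections to EC2 instances that drop while idle is to enable keepalives, which may well be what the fix above amounts to; that is an assumption, not a summary of the link. A sketch of agent-side sshd settings with illustrative values:

          # /etc/ssh/sshd_config on the slave -- assumption: the drops are idle-timeout
          # related; values are illustrative, not taken from the linked comment.
          ClientAliveInterval 60    # probe an idle client every 60 seconds
          ClientAliveCountMax 3     # close the session only after 3 unanswered probes
          # then reload sshd, e.g. "sudo service sshd reload"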


          David Carlton added a comment -

          This bug is tagged as ssh-slaves-plugin, but now that I understand the EC2 plugin, I realize that it replaces the SSH Slaves Plugin; I don't know for sure which plugin the original poster was using, but it seems like the rest of us are using EC2. I've been experimenting for the last week with setting "Connect by SSH Process" (under the advanced options) to true, and I haven't had any problems since I've done that, so I would recommend that to other people who are running into this symptom with the EC2 option. (It was added for exactly this reason, see https://github.com/jenkinsci/ec2-plugin/pull/139 )

          There's still presumably a collection of potential underlying non-Jenkins issues - maybe kernel problems, maybe security groups, who knows, probably different for different installations - but it seems to me like the external ssh program is noticeably better at dealing with temporary network glitches than Trilead is.
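
          A practical note on that option: an external ssh process launched on the master will normally read the Jenkins user's ~/.ssh/config, so client-side keepalives can be layered on top of it. A sketch with illustrative values; the host pattern is an assumption about public EC2 DNS names, so adjust it to your setup.

          # ~/.ssh/config for the user running the Jenkins master -- illustrative
          # client-side keepalives honored by an external ssh process.
          Host *.compute.amazonaws.com      # assumption: public EC2 hostnames
              ServerAliveInterval 30        # send a keepalive probe every 30 seconds
              ServerAliveCountMax 5         # give up after 5 unanswered probes
              TCPKeepAlive yes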

          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 159839 ] New: JNJira + In-Review [ 180151 ]

            Assignee: Ivan Fernandez Calvo
            Reporter: Luis Arias
            Votes: 17
            Watchers: 28
