• Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • ssh-slaves-plugin
    • None
    • Jenkins ver. 1.398

      Slave is a RedHat 5.2
      Slave workdir is /tmp/...

      ssh-slave 0.14

      Some builds randomly fail with this message:

      FATAL: command execution failed.
      hudson.util.IOException2: Failed to join the process
      at hudson.Proc$RemoteProc.join(Proc.java:359)
      at hudson.Launcher$ProcStarter.join(Launcher.java:280)
      at hudson.tasks.Ant.perform(Ant.java:216)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:624)
      at hudson.model.Build$RunnerImpl.build(Build.java:176)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:420)
      at hudson.model.Run.run(Run.java:1362)
      at hudson.matrix.MatrixRun.run(MatrixRun.java:137)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:145)
      Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request$1.get(Request.java:218)
      at hudson.remoting.Request$1.get(Request.java:172)
      at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
      at hudson.Proc$RemoteProc.join(Proc.java:351)
      ... 11 more
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.abort(Request.java:257)
      at hudson.remoting.Channel.terminate(Channel.java:680)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:971)
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:953)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)

      Here is the slave log:

      Slave successfully connected and online
      ERROR: Connection terminated
      java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:953)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      ERROR: [02/25/11 09:59:47] [SSH] Error deleting file.
      java.io.IOException: Sorry, this connection is closed.
      at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:637)
      at com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:582)
      at com.trilead.ssh2.Session.<init>(Session.java:40)
      at com.trilead.ssh2.Connection.openSession(Connection.java:1047)
      at com.trilead.ssh2.Connection.exec(Connection.java:1434)
      at hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:597)
      at hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:320)
      at hudson.remoting.Channel.terminate(Channel.java:695)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:971)
      Caused by: java.net.SocketException: Connection reset
      at java.net.SocketInputStream.read(SocketInputStream.java:168)
      at com.trilead.ssh2.crypto.cipher.CipherInputStream.fill_buffer(CipherInputStream.java:41)
      at com.trilead.ssh2.crypto.cipher.CipherInputStream.internal_read(CipherInputStream.java:52)
      at com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:79)
      at com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
      at com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
      at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:672)
      at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:470)
      at java.lang.Thread.run(Thread.java:662)
      [02/25/11 09:59:47] [SSH] Connection closed.
      ERROR: [02/25/11 09:59:47] slave agent was terminated
      java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:953)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      FATAL: channel is already closed
      hudson.remoting.ChannelClosedException: channel is already closed
      at hudson.remoting.Channel.send(Channel.java:466)
      at hudson.remoting.Request.call(Request.java:105)
      at hudson.remoting.Channel.call(Channel.java:629)
      at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:443)
      at hudson.model.Run.run(Run.java:1362)
      at hudson.matrix.MatrixRun.run(MatrixRun.java:137)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:145)
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:953)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
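The innermost cause in every trace above is a `java.io.EOFException` thrown from `ObjectInputStream.readObject`. The following is a minimal sketch of that mechanism only, not Jenkins remoting code: it assumes an in-memory byte array stands in for the SSH stream, and the class and method names (`ChannelEofDemo`, `serialize`, `readUntilEof`) are hypothetical. It shows that a reader looping on `readObject` gets an `EOFException` as soon as the underlying stream ends, which is what a reader thread would see if the other side of the connection went away.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class; an in-memory buffer stands in for the network stream.
final class ChannelEofDemo {

    /** Serializes the given messages into one byte stream, then ends the stream. */
    static byte[] serialize(String... messages) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (String message : messages) {
                out.writeObject(message);
            }
        }
        return buffer.toByteArray();
    }

    /**
     * Reads objects in a loop, as a reader thread would, and returns how many
     * were read before readObject() hit the end of the stream.
     */
    static int readUntilEof(byte[] bytes) throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            while (true) {
                in.readObject(); // throws EOFException once no more data is available
                count++;
            }
        } catch (EOFException e) {
            // The stream ended without any "close" message: the reader only sees EOF.
            return count;
        }
    }

    public static void main(String[] args) throws Exception {
        int read = readUntilEof(serialize("ping", "build-step"));
        System.out.println("objects read before EOFException: " + read);
    }
}
```

The sketch says nothing about why the stream ended; in the report above that question is still open.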

          [JENKINS-8883] Build fails because of slave error

          Kohsuke Kawaguchi added a comment -

          "Connection reset" indicates that the underlying TCP/IP connection of SSH was abruptly terminated from the slave side – for example, killing sshd would cause this. This can also happen if the router in the middle decides to terminate idle connections. Does one of those ring a bell?

          Beyond that it's hard to diagnose this problem. Maybe we can let you increase the ping frequency if we suspect that the problem is caused by an intermediate router shutting down the connection?
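The idea of a more frequent ping, so that an intermediate router never sees the connection as idle, can be sketched as follows. This is an illustration under stated assumptions, not the Jenkins or ssh-slaves implementation: `KeepAlivePinger` is a hypothetical class, and the `ping` callback stands in for whatever would actually send a small message over the connection.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: runs a caller-supplied ping action at a fixed interval.
final class KeepAlivePinger implements AutoCloseable {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "keep-alive-pinger");
                thread.setDaemon(true); // do not keep the JVM alive just for pings
                return thread;
            });

    KeepAlivePinger(Runnable ping, long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                ping.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so swallow it here.
                // A real implementation would log it or mark the connection as dead.
            }
        }, interval, interval, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger sent = new AtomicInteger();
        // In this demo the "ping" only increments a counter.
        try (KeepAlivePinger pinger =
                     new KeepAlivePinger(sent::incrementAndGet, 50, TimeUnit.MILLISECONDS)) {
            Thread.sleep(500);
        }
        System.out.println("pings sent: " + sent.get());
    }
}
```

A shorter interval keeps traffic flowing on an otherwise quiet connection, at the cost of a little extra load. It would only help if idle timeouts are in fact the cause, which this thread does not establish.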

          ebann added a comment -

          No, neither of these rings a bell.

          If it were an "auto-disconnect" after idling, I think it would happen more often, or only on very long jobs.
          But this happens randomly on any job, on any node, at any time.
          Furthermore, this sometimes happens on freshly started nodes too.

          I put a "disconnect node after idle > X" setting in the Jenkins configuration some weeks ago,
          but it did not change anything.

          I noticed "ERROR: [02/25/11 09:59:47] [SSH] Error deleting file." in the slave log.
          Is that normal?

          fgu added a comment - edited

          We think we have the same problem with a Mac slave.

          We have avoided the problem by running verbose builds, but tasks that produce no output still hit it, and once tasks are finished the slave becomes idle again. As happened to ebann, sometimes the connection is abruptly terminated just after being opened.

          We have seen the java process on the slave end while the ssh connection stays alive for a while, until ssh ends later.

          brianharris added a comment -

          This seems to be the same as JENKINS-6817; please close as duplicate.

          ebann added a comment -

          Yes, it looks like the same issue.

          Anyway, I'm not having this problem anymore.
          (But I have changed a LOT of things in my Hudson configuration/slaves since then, so it might still exist.)

          Oleg Nenashev added a comment -

          Marked issue as a duplicate according to comments

            kohsuke Kohsuke Kawaguchi
            ebann ebann