
An existing connection was forcibly closed by the remote host

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component/s: core, remoting
    • Master: Windows Server 2008 R2, Jenkins 1.565.1
      Slave: The issue occurs on Windows 7 and Windows Server 2008 R2 slaves.

      We have a test run that takes several hours. It fails intermittently with the error below, which terminates the test and fails the job.

      12:42:10 FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: An existing connection was forcibly closed by the remote host
      12:42:10 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: An existing connection was forcibly closed by the remote host
      12:42:10 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      12:42:10 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      12:42:10 at hudson.remoting.Request.call(Request.java:174)
      12:42:10 at hudson.remoting.Channel.call(Channel.java:739)
      12:42:10 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168)
      12:42:10 at com.sun.proxy.$Proxy61.join(Unknown Source)
      12:42:10 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:956)
      12:42:10 at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
      12:42:10 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
      12:42:10 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
      12:42:10 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      12:42:10 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:756)
      12:42:10 at hudson.model.Build$BuildExecution.build(Build.java:198)
      12:42:10 at hudson.model.Build$BuildExecution.doRun(Build.java:159)
      12:42:10 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
      12:42:10 at hudson.model.Run.execute(Run.java:1706)
      12:42:10 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      12:42:10 at hudson.model.ResourceController.execute(ResourceController.java:88)
      12:42:10 at hudson.model.Executor.run(Executor.java:232)
      12:42:10 Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: An existing connection was forcibly closed by the remote host
      12:42:10 at hudson.remoting.Request.abort(Request.java:299)
      12:42:10 at hudson.remoting.Channel.terminate(Channel.java:802)
      12:42:10 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      12:42:10 Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
      12:42:10 at sun.nio.ch.SocketDispatcher.read0(Native Method)
      12:42:10 at sun.nio.ch.SocketDispatcher.read(Unknown Source)
      12:42:10 at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
      12:42:10 at sun.nio.ch.IOUtil.read(Unknown Source)
      12:42:10 at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
      12:42:10 at hudson.remoting.SocketChannelStream$1.read(SocketChannelStream.java:33)
      12:42:10 at sun.nio.ch.ChannelInputStream.read(Unknown Source)
      12:42:10 at sun.nio.ch.ChannelInputStream.read(Unknown Source)
      12:42:10 at sun.nio.ch.ChannelInputStream.read(Unknown Source)
      12:42:10 at java.io.InputStream.read(Unknown Source)
      12:42:10 at sun.nio.ch.ChannelInputStream.read(Unknown Source)
      12:42:10 at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
      12:42:10 at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      12:42:10 at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      12:42:10 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      12:42:10 at java.io.ObjectInputStream.readObject0(Unknown Source)
      12:42:10 at java.io.ObjectInputStream.readObject(Unknown Source)
      12:42:10 at hudson.remoting.Command.readFrom(Command.java:92)
      12:42:10 at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:70)
      12:42:10 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
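
In a Java trace like the one above, the last "Caused by:" line names the root cause (here the java.io.IOException raised while reading from the socket). As a minimal sketch, assuming the console log is available as plain text with the HH:MM:SS prefix shown above, the timestamp and root cause can be pulled out for comparison with OS logs. The function name `root_cause` and the shortened sample are illustrative only, not part of Jenkins.

```python
import re

# Matches console lines such as
# "12:42:10 Caused by: java.io.IOException: An existing connection ..."
# The leading timestamp is optional so plain Java traces also match.
CAUSED_BY = re.compile(r"^\s*(?:(\d{2}:\d{2}:\d{2})\s+)?Caused by:\s+(.*)$")


def root_cause(console_log):
    """Return (timestamp, message) of the last 'Caused by:' line, or None."""
    found = None
    for line in console_log.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            # Keep overwriting: the last match is the innermost cause.
            found = (match.group(1), match.group(2).strip())
    return found


# Shortened copy of the trace from this report.
sample = """\
12:42:10 FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: An existing connection was forcibly closed by the remote host
12:42:10 at hudson.remoting.Request.call(Request.java:174)
12:42:10 Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: An existing connection was forcibly closed by the remote host
12:42:10 at hudson.remoting.Request.abort(Request.java:299)
12:42:10 Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
12:42:10 at sun.nio.ch.SocketDispatcher.read0(Native Method)
"""

print(root_cause(sample))
```

The returned timestamp is the moment the master noticed the closed connection, which is the time to look for in the logs of the slave machine.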

          [JENKINS-24895] An existing connection was forcibly closed by the remote host

          G. Ancona added a comment - - edited

          You can hit this even if the Slave is not launched as a service: I ran a test that loops the same job indefinitely (approx. 1.5 hrs per run), and after a few iterations I got the error.
          At the same time ping worked perfectly (so it's not a network problem), and all the processes were alive (no trace of crash/restart/etc. in the logs).
          I have all the logs if you need them.

          Update: We were really unlucky with the previous test: we had a bigger problem with the machines, and it was not a Jenkins one.
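
Ping exercises ICMP only, while the slave talks to the master over a TCP connection, so a TCP-level check is a closer test of the path the agent actually uses. The following is a minimal sketch under that assumption; `tcp_reachable` is a hypothetical helper, and the host and port to probe (the master's agent port) depend on your installation.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Demo against a throwaway listener on localhost so the sketch runs
    # anywhere; replace host and port with your master and its agent port.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print(tcp_reachable("127.0.0.1", port))
    listener.close()
```

A successful connect only shows that the port is reachable at that moment. It does not prove that a long-lived connection survives, so it narrows the search rather than ruling the network out.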


          G. Ancona added a comment -

          It seems we solved the problem by launching the Jenkins Slave from a batch command (so no longer as a Windows Service).
          I'm posting the batch file in case anybody needs it: it creates a new numbered log file each time it restarts the Slave.

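          @echo off
          rem Restart the Jenkins slave whenever it exits; each run gets its own numbered log.
          rem SLAVE_HOME must already be set to the directory that contains slave.jar.

          setlocal enableextensions

          set /a "x = 1"
          :while1

          rem Paths are quoted so that a SLAVE_HOME containing spaces still works.
          "C:\Java\jre6\bin\java.exe" -Xrs -mx512m -jar "%SLAVE_HOME%\slave.jar" -jnlpUrl http://HudsonServer.domain.local:8080/computer/SLAVE_1.domain.local/slave-agent.jnlp -secret 51487626528q528582529522572rgfegcb733202d7b > "%SLAVE_HOME%\jenkins-slave.%x%.log" 2>&1
          set /a "x = x + 1"

          goto :while1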
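
The restart-and-rotate idea in the batch file is not specific to batch. As a rough sketch, assuming Python is available on the slave machine, the same loop could look like this. The function `supervise` and its `max_runs` and `pause_seconds` parameters are illustrative additions, not anything Jenkins provides, and the demo uses a stand-in command instead of the real `java -jar slave.jar ...` line.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def supervise(command, log_dir, max_runs=None, pause_seconds=0.0):
    """Run `command` repeatedly, writing each run's output to a new numbered log.

    max_runs=None loops forever, like the batch file; a number stops after
    that many runs, which is handy for trying the loop out.
    Returns the list of log files written.
    """
    log_dir = Path(log_dir)
    logs = []
    run = 1
    while max_runs is None or run <= max_runs:
        log_path = log_dir / f"jenkins-slave.{run}.log"
        with open(log_path, "wb") as log:
            # stderr is merged into stdout, matching "2>&1" in the batch file.
            subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
        logs.append(log_path)
        run += 1
        # Optional pause before restarting (assumption: not in the batch file).
        time.sleep(pause_seconds)
    return logs


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere.
    demo = [sys.executable, "-c", "print('agent exited')"]
    with tempfile.TemporaryDirectory() as tmp:
        for path in supervise(demo, tmp, max_runs=2):
            print(path.name, path.read_text().strip())
```

Keeping one log per run makes it easy to see how often the slave was restarted and what it printed just before each exit.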


          Leslie Klein added a comment -

          Hi Graziano,

          I am trying to use your script above. It does seem to delay the time until the connection is closed but still it closes the connection before the job has finished running. Are there command line parameters that I can play with that might lengthen the time before closing the connection?

          Thanks for your help


          G. Ancona added a comment -

          Hi Leslie,
          no, it's all there (obviously you changed your "secret" parameter!)
          Anyway, I also had several other problems not directly connected with Jenkins: disk space, Windows Firewall, network failures, Windows Update...
          I suggest you look at the Windows Event Viewer (or /var/log/messages on *nix) to see what happened at the time the connection closed.
          Try to rule out all these side problems and... good luck!


          Anuvrath Joshi added a comment -

          Hi, I recently migrated from Hudson to Jenkins and since then I have been facing this issue. It randomly takes out some of the slave nodes.

          Slave logs look something like this:
          Failed to establish the connection with the slave fpm
          java.io.EOFException: unexpected stream termination
          at hudson.remoting.ChannelBuilder.negotiate(ChannelBuilder.java:331)
          at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:280)
          at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:290)
          at org.jenkinsci.remoting.nio.NioChannelBuilder.build(NioChannelBuilder.java:36)
          at org.jenkinsci.remoting.nio.NioChannelBuilder.build(NioChannelBuilder.java:52)
          at jenkins.slaves.JnlpSlaveAgentProtocol$Handler.jnlpConnect(JnlpSlaveAgentProtocol.java:120)
          at jenkins.slaves.DefaultJnlpSlaveReceiver.handle(DefaultJnlpSlaveReceiver.java:63)
          at jenkins.slaves.JnlpSlaveAgentProtocol2$Handler2.run(JnlpSlaveAgentProtocol2.java:57)
          at jenkins.slaves.JnlpSlaveAgentProtocol2.handle(JnlpSlaveAgentProtocol2.java:31)
          at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:156)

          I never saw these issues when I was using Hudson (the same Master and Slave machines were in use at that time).

          G. Ancona added a comment -

          Hi Anuvrath,
          we had the same error at random when using the Windows services; since we started launching Jenkins via batch we don't have it anymore.


          Anuvrath Joshi added a comment -

          Hello Graziano,

          Thanks for your suggestion. One more question: are you talking about launching the slave nodes via batch, or the main Jenkins itself?

          Leslie Klein added a comment -

          Hi Graziano/anyone, can you contribute some insight into our findings?
          1. We run the Jenkins job after the slave has been activated as a slave-agent launched from the browser. Result: the job runs to completion and terminates successfully.
          2. We run the slave using the batch command that you attached above. Result: the job runs for 7 minutes and then terminates on the server with the "SocketException: connection reset" error below (the actual job on the slave continues to run and completes successfully about 3 minutes later).
          3. If we run the slave as a Service, we receive the connection reset error after about 2-3 minutes.
          4. There are no error messages in the Jenkins Server Event Viewer coinciding with the connection reset error in any of the above cases.
          5. We did notice a peak in Jenkins CPU usage when the connection was reset.

          Error trace
          -----------
          FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
          at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
          at hudson.remoting.Request.call(Request.java:174)
          at hudson.remoting.Request.call(Request.java:722)

          and so on


          G. Ancona added a comment -

          Hi
          @Anuvrath: I launched only the slaves via batch; the server is still running as a service.
          @Leslie: This looks more like this problem: https://issues.jenkins-ci.org/browse/JENKINS-18781


          Anuvrath Joshi added a comment -

          Hello @Graziano,

          I am already running them as a plain application using slave-agent instead of as a Windows service, due to network issues. The Master, however, is running as a Windows service.

            Assignee: Unassigned
            Reporter: sharon xia (sharon_xia)
            Votes: 8
            Watchers: 11