[JENKINS-12235] FATAL, Unable to delete script file, IOException2, remote file operation failed, unexpected termination of channel

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Component/s: core, remoting
    • Labels: None

      Below is the stacktrace.

      It happened when I ran two jobs on a master. After running for a while, both jobs crashed with this exception.
      I think this might be caused by a brief flap in network connectivity, but I didn't notice any disconnection.
      Another cause may be the heavy load on Jenkins:

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      25942 hudson 15 0 6902m 5.8g 5720 S 0.3 74.3 401:22.30 java

      Does Jenkins run its own garbage collector at some specified time?
      We have to restart every few days because it gets slower and slower until it hangs.

      FATAL: Unable to delete script file /tmp/hudson8303731085225956739.sh
      hudson.util.IOException2: remote file operation failed: /tmp/hudson8303731085225956739.sh at hudson.remoting.Channel@30e472f4:build@autom-1
      at hudson.FilePath.act(FilePath.java:781)
      at hudson.FilePath.act(FilePath.java:767)
      at hudson.FilePath.delete(FilePath.java:1022)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:695)
      at hudson.model.Build$RunnerImpl.build(Build.java:178)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:461)
      at hudson.model.Run.run(Run.java:1404)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:230)
      Caused by: hudson.remoting.ChannelClosedException: channel is already closed
      at hudson.remoting.Channel.send(Channel.java:499)
      at hudson.remoting.Request.call(Request.java:110)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.FilePath.act(FilePath.java:774)
      ... 13 more
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.call(Request.java:149)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
      at $Proxy29.join(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:859)
      at hudson.Launcher$ProcStarter.join(Launcher.java:345)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:695)
      at hudson.model.Build$RunnerImpl.build(Build.java:178)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:461)
      at hudson.model.Run.run(Run.java:1404)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:230)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.abort(Request.java:273)
      at hudson.remoting.Channel.terminate(Channel.java:732)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1139)
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)


          Ghenadie Dumitru added a comment -

          This issue is bothersome, especially when running on Windows slaves connected via the JNLP agent.

          On Windows slaves, the JNLP socket connection seems quite sensitive to connection quality, even when the link is not used at 100%.

          Maybe the solution is to use an SSH server on Windows?

          Thanks

          Erik Purins added a comment -

          Hitting this frequently on windows jenkins slaves. Similar call stack attached.


          brianharris added a comment -

          Suspected duplicate: JENKINS-6817


          Richard Taylor added a comment -

          We have started getting this recently. It is difficult to pin down, but I think it began around version 1.455.

          • If there is no obvious fix, can the exception be caught so that it does not fail an otherwise successful build?

          I think this would be an adequate workaround for most people. At the moment this issue causes random builds to fail, which is a significant annoyance for the developers.

          Thanks.
          Rich.

          Kristian Karl added a comment - - edited

          I get this problem every so often on Windows machines. I run Jenkins ver. 1.463
          See attached stacktrace.txt


          Brian Harris added a comment -

          For us, the cause of this error was our build slaves (VMs) running out of memory and self-rebooting.


          Erik Purins added a comment -

          Disconnects with large stack traces are still occurring in the latest release, 1.471. So far we have encountered this on Windows, but historically we have also seen it on OS X and Linux. It looks slightly different in my latest case: I get a socket reset exception, but the failure is still first hit when deleting the script file after a long build (over 3 hours).


          Erik Purins added a comment -

          This is intermittent, but if it's a sign of a client problem (out of memory or whatever), a more useful error caught earlier on would be an improvement.


          jvoegele added a comment -

          We have also been seeing this problem intermittently. It is not only Windows for us, but our Suse and Red Hat Linux slaves have also been suffering from the same problem.

          Can anything be done to prevent this problem from failing a build? The spurious failures are a distraction.


          Zhijun Xu added a comment - - edited

          I encountered this problem recently. My job runs on a Linux slave and, for various reasons, includes many 'wait' steps that take a long time. I think that during those wait steps the master has no communication with the slave, so the slave (SSH server) drops the connection to the master (SSH client), and then the problem occurs.

          So I configured the SSH server to send messages to the SSH client every minute to keep the connection alive, and the problem was resolved.


          Shobha Dashottar added a comment -

          We have been facing this issue for some time. Yesterday, we upgraded to 1.466 and the issue persists.
          Jenkins master is running on Windows and the slaves are mainly Windows but a few are Linux. The issue randomly appears on any slave.

          Thanks
          Shobha

          script file c:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\hudson920936561807305456.bat
          hudson.util.IOException2: remote file operation failed: c:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\hudson920936561807305456.bat at hudson.remoting.Channel@5f1603a2: Slave123
          at hudson.FilePath.act(FilePath.java:835)
          at hudson.FilePath.act(FilePath.java:821)
          at hudson.FilePath.delete(FilePath.java:1126)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:710)
          at hudson.model.Build$RunnerImpl.build(Build.java:178)
          at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
          at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:480)
          at hudson.model.Run.run(Run.java:1438)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:239)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed

          Chris Welch added a comment -

          We consistently get this error on a Linux build slave, on jobs that produce little output and take a long time (typically > 2 hours). We are using SSH, and I suspect the problem is due to there being no traffic on the SSH link for this long period. Using Jenkins 1.434.


          Chris Welch added a comment -

          We are able to work around the problem by adding:

          ClientAliveInterval 60

          to /etc/ssh/sshd_config on the Jenkins host


          Yury Pukhalsky added a comment - - edited

          Jenkins 1.489 here, and it happens too. The master and slave are RHEL 5.8.
          The task runs for 15 minutes in my case. Output is produced continuously, and neither setting ClientAliveInterval and ClientAliveCountMax nor setting TCPKeepAlive helped.
          It started to happen after I joined two "execute shell" steps into one. The numbers of sockets and processes are well within limits.

          The slave appears to exit (crash?). In the slave log there is:

          ...
          Evacuated stdout
          Slave successfully connected and online
          ERROR: Connection terminated
          java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException

          Perhaps the cause is somewhere in the JVM internals? I'll now try different JVMs and settings, as this is a blocking issue for me.


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Nicolas De Loof
          Path:
          core/src/main/java/hudson/tasks/CommandInterpreter.java
          http://jenkins-ci.org/commit/jenkins/8e74242d8b961a78d5d498b55e1f3797f92bb8a1
          Log:
          JENKINS-12235 root cause is hidden by by script deletion failure
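
The commit log above describes a general pattern: when a build step fails and the cleanup that follows it also fails, the cleanup error can end up being the one reported first. The sketch below is not the code from that commit. It is a minimal, self-contained illustration of one way to keep the first failure as the primary exception, and the names `CleanupMasking`, `Step`, `Cleanup`, and `perform` are hypothetical.

```java
import java.io.IOException;

public class CleanupMasking {
    /** Hypothetical stand-in for running the build step on a remote agent. */
    public interface Step { int run() throws IOException; }

    /** Hypothetical stand-in for deleting the temporary script file. */
    public interface Cleanup { void delete() throws IOException; }

    /**
     * Runs the step, then always attempts the cleanup. If both fail, the
     * step's exception is rethrown and the cleanup failure is attached to it
     * as a suppressed exception, so the root cause stays on top of the trace.
     */
    public static int perform(Step step, Cleanup cleanup) throws IOException {
        IOException primary = null;
        try {
            return step.run();
        } catch (IOException e) {
            primary = e;   // remember the root cause
            throw e;
        } finally {
            try {
                cleanup.delete();
            } catch (IOException e) {
                if (primary != null) {
                    primary.addSuppressed(e);   // secondary failure, keep it attached
                } else {
                    throw e;   // cleanup was the only failure, so report it
                }
            }
        }
    }
}
```

With this shape, a closed channel that breaks both the step and the script deletion is reported once, with the deletion failure listed under "Suppressed:" in the stack trace rather than ahead of the real cause.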

          dogfood added a comment -

          Integrated in jenkins_main_trunk #2141
          JENKINS-12235 root cause is hidden by by script deletion failure (Revision 8e74242d8b961a78d5d498b55e1f3797f92bb8a1)

          Result = SUCCESS
          Nicolas De Loof : 8e74242d8b961a78d5d498b55e1f3797f92bb8a1
          Files :

          • core/src/main/java/hudson/tasks/CommandInterpreter.java


          Marc Seeger added a comment - - edited

          I get this on Linux -> Linux with Jenkins ver. 1.504
          Different data centers though, so probably not the most stable network connection.

          FATAL: Unable to delete script file /tmp/hudson9103641402954770242.sh
          hudson.util.IOException2: remote file operation failed: /tmp/hudson9103641402954770242.sh at hudson.remoting.Channel@3ce3262f:django
          	at hudson.FilePath.act(FilePath.java:861)
          	at hudson.FilePath.act(FilePath.java:838)
          	at hudson.FilePath.delete(FilePath.java:1223)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:814)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:593)
          	at hudson.model.Run.execute(Run.java:1567)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:237)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:494)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.FilePath.act(FilePath.java:854)
          	... 13 more
          Caused by: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:850)
          	at hudson.remoting.Channel$2.handle(Channel.java:435)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: Command close created at
          	at hudson.remoting.Command.<init>(Command.java:56)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:844)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:842)
          	at hudson.remoting.Channel.close(Channel.java:909)
          	at hudson.remoting.Channel.close(Channel.java:892)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:849)
          	... 2 more
          FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
          	at $Proxy46.join(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:814)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:593)
          	at hudson.model.Run.execute(Run.java:1567)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:237)
          Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:732)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:850)
          	at hudson.remoting.Channel$2.handle(Channel.java:435)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: hudson.remoting.Channel$OrderlyShutdown
          	... 3 more
          Caused by: Command close created at
          	at hudson.remoting.Command.<init>(Command.java:56)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:844)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:842)
          	at hudson.remoting.Channel.close(Channel.java:909)
          	at hudson.remoting.Channel.close(Channel.java:892)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:849)
          	... 2 more
          

          The file is still around:

          # ls -lash /tmp/hudson9103641402954770242.sh
          4.0K -rw-rw-r-- 1 jenkins jenkins 96 Mar  8 14:50 /tmp/hudson9103641402954770242.sh
          


          Erik Purins added a comment -

          We still get this from time to time with a Debian squeeze master running Jenkins 1.504 and an OS X 10.8 client. It would be nice to resolve this, or at least to handle this common error with less stack trace and more human-readable text. Alternatively, it would be nice to have a more fault-tolerant command for deleting the temporary file, one that either retries or schedules a cleanup of the temp file when it can.
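
The fault-tolerant delete suggested in this comment could look roughly like the sketch below. It is an illustration only: it works on a local file through `java.nio.file`, whereas Jenkins deletes the script on the agent through its remoting channel, and the names `RetryingDelete` and `deleteQuietly` as well as the retry parameters are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RetryingDelete {
    /**
     * Tries to delete the file up to maxAttempts times, sleeping between
     * attempts. Returns true once the file is gone. If every attempt fails,
     * the file is registered for deletion at JVM exit and false is returned
     * instead of throwing, so the caller can log the problem without failing.
     */
    public static boolean deleteQuietly(Path file, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Files.deleteIfExists(file);   // also succeeds if the file is already gone
                return true;
            } catch (IOException e) {
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);   // wait before the next attempt
                }
            }
        }
        file.toFile().deleteOnExit();   // deferred cleanup as a last resort
        return false;
    }
}
```

A caller would treat a `false` return as a warning to log rather than a reason to mark the build as failed.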


          shinsato added a comment -

          We have seen this issue intermittently for some time. It might make the Jenkins slave feature unusable for many applications if we can't find a workaround.


          x29a added a comment - - edited

          Unfortunately, I experience this (very prominent!) problem as well. Here is some more information and a possible workaround.

          Setup:

          • Jenkins 1.512 on tomcat 6.0.36, in a VirtualBox Windows 7 Guest, JRE 1.7.0
          • Slave connected via JNLP client in a VirtualBox Windows 7 Guest, JRE 1.7.0

          The slave is started from the command line, in order to see the log messages, via:
          java -Xmx512m -jar slave.jar -jnlpUrl http://jenkins/computer/slave/slave-agent.jnlp

          First off, setting various values for these properties (in catalina on Tomcat) did not seem to improve the behaviour:
          -Dhudson.remoting.Launcher.pingTimeoutSec
          -Dhudson.remoting.Launcher.pingIntervalSec
          -Dhudson.slaves.ChannelPinger.pingInterval

          I was getting the "channel already closed" exception quite frequently, mostly at the same spot during script execution. The job (which takes between 12h and 16h) on the slave, started via a Windows batch file, generates large amounts of documentation via doxygen and pipes the output into a logfile, so it uses quite a lot of CPU and does not echo progress. Throttling the CPU so that the NIC would not suffer from the overload did not help, though. I also ran continuous pings to the slave (from the master and back), and ping requests failed only rarely (normal network tolerances).

          To say this first: although Jenkins failed with the above-mentioned exception, the slave continued to perform its job "in the background". So if the exception came after 1h, I would still see the updated documentation after 16h, although Jenkins had already declared the job as failed.

          For the chronology, these are the log excerpts:

          In the live console on the Jenkins WebUI I see (the first line is the last output by my script):

          Jenkins WebUI
          2013-05-03 18:36:50 - Processing: documentationA
          FATAL: Unable to delete script file c:\temp\hudson3125329676016517230.bat
          hudson.util.IOException2: remote file operation failed: c:\temp\hudson3125329676016517230.bat at hudson.remoting.Channel@1f12b9f:slave
          	at hudson.FilePath.act(FilePath.java:900)
          	at hudson.FilePath.act(FilePath.java:877)
          	at hudson.FilePath.delete(FilePath.java:1262)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
          	at hudson.model.Run.execute(Run.java:1575)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:237)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:494)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.FilePath.act(FilePath.java:893)
          	... 13 more
          Caused by: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
          	at $Proxy52.join(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
          	at hudson.model.Run.execute(Run.java:1575)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:237)
          Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:732)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
          Caused by: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          In the JNLP client log on my slave I got:

          JNLP Client on slave
          Mai 03, 2013 7:16:43 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
          SEVERE: I/O error in channel channel
          java.net.SocketTimeoutException: Read timed out
                  at java.net.SocketInputStream.socketRead0(Native Method)
                  at java.net.SocketInputStream.read(Unknown Source)
                  at java.net.SocketInputStream.read(Unknown Source)
                  at java.io.BufferedInputStream.fill(Unknown Source)
                  at java.io.BufferedInputStream.read(Unknown Source)
                  at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
                  at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
                  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
                  at java.io.ObjectInputStream.readObject0(Unknown Source)
                  at java.io.ObjectInputStream.readObject(Unknown Source)
                  at hudson.remoting.Command.readFrom(Command.java:92)
                  at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          
          Mai 03, 2013 7:16:43 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Terminated
          Mai 03, 2013 7:16:56 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Locating server among [http://jenkins/]
          

          And on the Tomcat server:

          Tomcat server
          May 03, 2013 7:16:42 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
          SEVERE: I/O error in channel slave
          java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
                  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
                  at java.io.ObjectInputStream.readObject0(Unknown Source)
                  at java.io.ObjectInputStream.readObject(Unknown Source)
                  at hudson.remoting.Command.readFrom(Command.java:92)
                  at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          
          May 03, 2013 7:16:42 PM jenkins.slaves.JnlpSlaveAgentProtocol$Handler$1 onClosed
          WARNING: Channel reader thread: slave for + slave terminated
          java.io.IOException: Unexpected termination of the channel
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
                  at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
                  at java.io.ObjectInputStream.readObject0(Unknown Source)
                  at java.io.ObjectInputStream.readObject(Unknown Source)
                  at hudson.remoting.Command.readFrom(Command.java:92)
                  at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
                  at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          
          May 03, 2013 7:16:53 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
          INFO: Accepted connection #78 from /10.0.0.2:49300
          

          It seems that the channel gets closed when no data is going through the connection (hence my playing with the ping settings mentioned above). I can't say definitively how long it would stay open, but in the case shown above it was about 45min without output. Therefore I modified the script to call doxygen in a thread and output a "." every 15s. So far, no more closed channels. If you can't modify your script to generate continuous output, maybe pipe your command (in the batch file) to some program that passes the output through or triggers continuous output. I also noticed that the dots generated by my script modification are not shown in the WebUI until a newline is sent. Nevertheless, the channel did not get closed.
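
          For anyone who wants to try the same workaround, here is a minimal sketch of the idea in Python. It is not the original script: the function name `run_with_heartbeat` and its parameters are made up for illustration. It assumes the long-running command can be started as a child process, and it prints a dot followed by a newline on each tick, since dots without a newline did not show up in the WebUI.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(cmd, interval_sec=15.0, out=sys.stdout):
    """Run cmd and write a heartbeat line to `out` every interval_sec seconds
    until the command exits. Returns the command's exit code.

    Hypothetical helper for illustration; not part of Jenkins or doxygen.
    """
    done = threading.Event()

    def heartbeat():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends as soon as the command has finished.
        while not done.wait(interval_sec):
            out.write(".\n")  # newline so the console shows each tick
            out.flush()

    ticker = threading.Thread(target=heartbeat, daemon=True)
    ticker.start()
    try:
        # The command's own output is left untouched; redirect it to a
        # logfile inside cmd or in the calling script if that is wanted.
        return subprocess.call(cmd)
    finally:
        done.set()
        ticker.join()
```

          A build step could then call something like `run_with_heartbeat(["doxygen", "Doxyfile"])` (the config file name is only an example) and exit with the returned code. Whether periodic output alone is enough to keep the channel open is only what was observed in this setup, not a confirmed fix.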

          I hope this investigation delivers some clues for fixing this problem and making distributed builds with Jenkins more stable!

          x29a added a comment - edited

          I get this problem every day.

          Jenkins 1.518 on Tomcat 7.0.35, on Windows XP, JRE 1.7.0_13
          Slave PC: Windows XP 32-bit.

          FATAL: Unable to delete script file C:\DOCUME~1\dg\LOCALS~1\Temp\hudson229281249267934971.bat
          hudson.util.IOException2: remote file operation failed: C:\DOCUME~1\dg\LOCALS~1\Temp\hudson229281249267934971.bat at hudson.remoting.Channel@7ccfde:PC_068_LX760
          	at hudson.FilePath.act(FilePath.java:901)
          	at hudson.FilePath.act(FilePath.java:878)
          	at hudson.FilePath.delete(FilePath.java:1263)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1576)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:241)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:494)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.FilePath.act(FilePath.java:894)
          	... 13 more
          Caused by: java.net.SocketException: Connection reset
          	at java.net.SocketInputStream.read(Unknown Source)
          	at java.net.SocketInputStream.read(Unknown Source)
          	at java.io.BufferedInputStream.fill(Unknown Source)
          	at java.io.BufferedInputStream.read(Unknown Source)
          	at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
          	at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
          	at sun.proxy.$Proxy72.join(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1576)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:241)
          Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:732)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
          Caused by: java.net.SocketException: Connection reset
          	at java.net.SocketInputStream.read(Unknown Source)
          	at java.net.SocketInputStream.read(Unknown Source)
          	at java.io.BufferedInputStream.fill(Unknown Source)
          	at java.io.BufferedInputStream.read(Unknown Source)
          	at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
          	at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          

          Ryang Woo Park added a comment

          We are hitting this problem around once a day as well.

          Jenkins: Jenkins ver. 1.517 on Windows Server 2008 R2, 8GB RAM, JRE 1.7.0_21
          Slave PC: Windows 7, 4GB RAM, JRE 1.7.0_21

          What's interesting is that the exception (when it does occur) always seems to happen while the exact same unit test is executing. That particular test spawns a new process and then kills just that process and all of its child processes. Below is the stack trace:

          16:54:43 FATAL: Unable to delete script file C:\Users\****\AppData\Local\Temp\hudson8255606542971992250.ps1
          16:54:50 hudson.util.IOException2: remote file operation failed: C:\Users\****\AppData\Local\Temp\hudson8255606542971992250.ps1 at hudson.remoting.Channel@fde4a0:scheduler_tests
          16:54:51 	at hudson.FilePath.act(FilePath.java:901)
          16:54:55 	at hudson.FilePath.act(FilePath.java:878)
          16:54:55 	at hudson.FilePath.delete(FilePath.java:1263)
          16:54:55 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          16:55:19 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          16:55:19 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          16:55:19 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          16:55:19 	at hudson.model.Build$BuildExecution.build(Build.java:199)
          16:55:19 	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          16:55:19 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          16:55:19 	at hudson.model.Run.execute(Run.java:1576)
          16:55:19 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          16:55:19 	at hudson.model.ResourceController.execute(ResourceController.java:88)
          16:55:19 	at hudson.model.Executor.run(Executor.java:241)
          16:55:19 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          16:55:19 	at hudson.remoting.Channel.send(Channel.java:494)
          16:55:19 	at hudson.remoting.Request.call(Request.java:129)
          16:55:19 	at hudson.remoting.Channel.call(Channel.java:672)
          16:55:19 	at hudson.FilePath.act(FilePath.java:894)
          16:55:19 	... 13 more
          16:55:19 Caused by: java.net.SocketException: Connection reset
          16:55:19 	at java.net.SocketInputStream.read(Unknown Source)
          16:55:19 	at java.net.SocketInputStream.read(Unknown Source)
          16:55:19 	at java.io.BufferedInputStream.fill(Unknown Source)
          16:55:19 	at java.io.BufferedInputStream.read(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream.readObject0(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream.readObject(Unknown Source)
          16:55:19 	at hudson.remoting.Command.readFrom(Command.java:92)
          16:55:19 	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          16:55:19 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          16:55:19 FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          16:55:19 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          16:55:19 	at hudson.remoting.Request.call(Request.java:174)
          16:55:19 	at hudson.remoting.Channel.call(Channel.java:672)
          16:55:19 	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
          16:55:19 	at com.sun.proxy.$Proxy41.join(Unknown Source)
          16:55:19 	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          16:55:19 	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          16:55:19 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          16:55:19 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          16:55:19 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          16:55:19 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          16:55:19 	at hudson.model.Build$BuildExecution.build(Build.java:199)
          16:55:19 	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          16:55:19 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          16:55:19 	at hudson.model.Run.execute(Run.java:1576)
          16:55:19 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          16:55:19 	at hudson.model.ResourceController.execute(ResourceController.java:88)
          16:55:19 	at hudson.model.Executor.run(Executor.java:241)
          16:55:19 Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          16:55:19 	at hudson.remoting.Request.abort(Request.java:299)
          16:55:19 	at hudson.remoting.Channel.terminate(Channel.java:732)
          16:55:19 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
          16:55:19 Caused by: java.net.SocketException: Connection reset
          16:55:19 	at java.net.SocketInputStream.read(Unknown Source)
          16:55:19 	at java.net.SocketInputStream.read(Unknown Source)
          16:55:19 	at java.io.BufferedInputStream.fill(Unknown Source)
          16:55:19 	at java.io.BufferedInputStream.read(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream.readObject0(Unknown Source)
          16:55:19 	at java.io.ObjectInputStream.readObject(Unknown Source)
          16:55:19 	at hudson.remoting.Command.readFrom(Command.java:92)
          16:55:19 	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          16:55:19 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
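The traces in this thread share the same outer frames and differ mainly in the innermost `Caused by:` line (here `java.net.SocketException: Connection reset`). As a rough aid for comparing reports, the following is a minimal sketch, not part of Jenkins, that pulls the last `Caused by:` entry out of a pasted console log. The function name `root_cause` and the regular expression are assumptions made for illustration; lines such as `Caused by: Command close created at`, which carry no exception class, are deliberately skipped.

```python
import re

# Matches e.g. "16:55:19 Caused by: java.net.SocketException: Connection reset".
# The class name is a run of non-space characters, optionally followed by ": message".
CAUSED_BY = re.compile(r"Caused by: (\S+?)(?::\s*(.*))?$")


def root_cause(console_log):
    """Return (exception_class, message) for the last 'Caused by:' line, or None.

    Hypothetical helper: if the log holds several traces, this reports the
    innermost cause of the last one only.
    """
    last = None
    for line in console_log.splitlines():
        match = CAUSED_BY.search(line.strip())
        if match:
            last = (match.group(1), match.group(2) or "")
    return last
```

Grouping failed builds by this value would show whether a given installation mostly sees connection resets, unexpected end-of-stream, or something else.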
          


          x29a added a comment -

          Can you try my possible workaround, or give estimates of how long the individual steps take while producing no output? I think we have enough stack traces, and as long as they do not provide more detailed information, it would be better to collect background information on what led to this error, such as:

          • does it always fail at the same spot in the job?
          • does that spot take a long time (and how long)?
          • is output generated (and propagated back to the Jenkins master)?
          • does the possible fix work (producing output in between to keep the channel open)?

          If you want to provide stack traces, output from the Tomcat server and the JNLP client (both with timestamps), along with the output from the Jenkins master, could also help.
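The "producing output in between" idea can be tried without touching the build itself by wrapping the long-running command. The following is a minimal sketch under stated assumptions: `run_with_heartbeat` and its parameters are hypothetical names, and nothing here establishes that periodic output actually keeps the channel open.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(command, interval_seconds=10.0, out=sys.stdout):
    """Run `command` and write a timestamped line every `interval_seconds`.

    Hypothetical helper: returns the exit code of the wrapped command.
    """
    done = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout and True once `done` is set,
        # so the loop ends promptly when the command finishes.
        while not done.wait(interval_seconds):
            out.write(time.strftime("( %Y-%m-%d %H:%M:%S running )\n"))
            out.flush()

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        return subprocess.call(command)
    finally:
        done.set()
        worker.join()


# Example use in a build step (the wrapped command is a placeholder):
#   sys.exit(run_with_heartbeat(["make", "test"], interval_seconds=10))
```

Recording whether the failure still occurs with the wrapper in place answers the last question in the list above.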


          Doug Konrad added a comment -

          I was seeing this once or twice a day when our slaves were overloaded. Since I fixed the overload problem, I've only seen it once.

          In addition to fixing the overload, on all our slaves, I made the following changes to /etc/ssh/sshd_config:

          ClientAliveCountMax 99
          ClientAliveInterval 60
          

          On half of the slaves, I also set

          TCPKeepAlive no
          

          (It had been 'yes' on all the slaves.)

          The only failure I've seen since these changes has been on a machine with

          TCPKeepAlive yes
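
For context on the values above: according to the sshd_config documentation, sshd drops an unresponsive client after roughly `ClientAliveInterval` × `ClientAliveCountMax` seconds, and an interval of 0 disables these probes. The sketch below computes that figure from sshd_config-style text. The function names are hypothetical, and the parser is a simplification that ignores `Match` blocks and treats any `#` as the start of a comment.

```python
def parse_sshd_options(text):
    """Return {lower-cased keyword: value} from sshd_config-style text.

    Simplified, hypothetical parser: no Match-block handling.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            # sshd uses the first value obtained for each keyword.
            options.setdefault(parts[0].lower(), parts[1].strip())
    return options


def client_alive_timeout(options):
    """Seconds before an unresponsive client is dropped, or None if disabled."""
    interval = int(options.get("clientaliveinterval", 0))  # sshd default: 0
    count = int(options.get("clientalivecountmax", 3))     # sshd default: 3
    if interval == 0:
        return None
    return interval * count
```

With `ClientAliveInterval 60` and `ClientAliveCountMax 99` this gives 5940 seconds (99 minutes), so a slave that stops answering for a short time is no longer disconnected by sshd.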
          


          Ryang Woo Park added a comment - - edited

          I switched to connecting to the slave PC via OpenSSH,
          but the problem still occurs.

          Same environment as in the comment above:

          Jenkins 1.518 on Tomcat 7.0.35, on Windows XP, JRE 1.7.0_13
          Slave PC: Windows XP 32-bit.

          FATAL: Unable to delete script file C:\DOCUME~1\dg\LOCALS~1\Temp\hudson8470529757775576764.bat
          hudson.util.IOException2: remote file operation failed: C:\DOCUME~1\dg\LOCALS~1\Temp\hudson8470529757775576764.bat at hudson.remoting.Channel@1293e35:PC_067_LX760
          	at hudson.FilePath.act(FilePath.java:901)
          	at hudson.FilePath.act(FilePath.java:878)
          	at hudson.FilePath.delete(FilePath.java:1263)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1576)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:241)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:494)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.FilePath.act(FilePath.java:894)
          	... 13 more
          Caused by: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:672)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
          	at sun.proxy.$Proxy70.join(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1576)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:241)
          Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:732)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
          Caused by: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
          Caused by: java.io.EOFException
          	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
          	at java.io.ObjectInputStream.readObject0(Unknown Source)
          	at java.io.ObjectInputStream.readObject(Unknown Source)
          	at hudson.remoting.Command.readFrom(Command.java:92)
          	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          


          Marc Seeger added a comment - - edited

          I also run into this quite frequently with a long-running job that sometimes doesn't print anything to stdout for several minutes.

          edit: I just noticed that I've already commented on this issue... it's late...


          sanga added a comment -

          We just hit this in 1.509.2, i.e. the current Jenkins stable release. It is causing about half our builds to fail, so our CI system is pretty much screwed at the moment. Raising to critical.


          Marc Seeger added a comment -

          A small note:
          this happened for us when the master was heavily overloaded (swapping). I reduced the number of executors on the master and just started a slave with more CPU/RAM to take care of the jobs.


          sanga added a comment - - edited

          I don't see our master swapping, and it doesn't do much except serve web pages and farm jobs out to the slaves. We do have many slaves though (on the order of 60 or so, I guess).


          Guy Rozendorn added a comment - - edited

          This happens across all our slaves: Windows, Red Hat, Ubuntu.
          We have a little more than 100 jobs, and we keep logs for the last 10 runs of each job.
          Over the last 10 runs, we've seen this 14 times on Windows and 26 times on the other platforms.
          Everything runs over SSH (Cygwin on Windows) with the default settings:

          • TCPKeepAlive: yes
          • ClientAliveCountMax: 3
          • ClientAliveInterval: 0

          It doesn't look related to the output: this fails randomly in different steps, some with no output for minutes, some with no output for only a few seconds.


          Guy Rozendorn added a comment -

          I added a print every ten seconds, and it still happens:

          ( 2013-29-27 17:29:07 running )
          ( 2013-29-27 17:29:17 running )
          ( 2013-29-27 17:29:27 running )
          ( 2013-29-27 17:29:37 running )
          ( 2013-29-27 17:29:47 running )
          ( 2013-29-27 17:29:57 running )
          ( 2013-30-27 17:30:07 running )
          FATAL: Unable to delete script file C:\Users\Administrator\hudson7142602309142296785.py
          hudson.util.IOException2: remote file operation failed: C:\Users\Administrator\hudson7142602309142296785.py at hudson.remoting.Channel@6e47e0a6:host-ci38
          	at hudson.FilePath.act(FilePath.java:901)
          	at hudson.FilePath.act(FilePath.java:878)
          	at hudson.FilePath.delete(FilePath.java:1263)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.plugins.templateproject.ProxyBuilder.perform(ProxyBuilder.java:87)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1593)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:242)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:524)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:722)
          	at hudson.FilePath.act(FilePath.java:894)
          	... 14 more
          Caused by: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:900)
          	at hudson.remoting.Channel$2.handle(Channel.java:465)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: Command close created at
          	at hudson.remoting.Command.<init>(Command.java:56)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:894)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:892)
          	at hudson.remoting.Channel.close(Channel.java:975)
          	at hudson.remoting.Channel.close(Channel.java:958)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:899)
          	... 2 more
          FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:722)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:162)
          	at sun.proxy.$Proxy38.join(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
          	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
          	at hudson.plugins.templateproject.ProxyBuilder.perform(ProxyBuilder.java:87)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
          	at hudson.model.Build$BuildExecution.build(Build.java:199)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
          	at hudson.model.Run.execute(Run.java:1593)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:242)
          Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:782)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:900)
          	at hudson.remoting.Channel$2.handle(Channel.java:465)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: hudson.remoting.Channel$OrderlyShutdown
          	... 3 more
          Caused by: Command close created at
          	at hudson.remoting.Command.<init>(Command.java:56)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:894)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:892)
          	at hudson.remoting.Channel.close(Channel.java:975)
          	at hudson.remoting.Channel.close(Channel.java:958)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:899)
          	... 2 more
          


          Marc Seeger added a comment -

          We started printing "tweet" every minute or so; it still happened.
          Since we reduced the number of executors on the master (AWS m1.large) to 2 and upsized the slave, I haven't seen another crash
          (though that hasn't been a long timespan).

          Guy Rozendorn added a comment - - edited

          Although our master node has 10 executors (reserved for tied jobs only), this happens even when only a single job is running on the entire Jenkins instance, and it's running (and failing) on a different slave.

          sanga added a comment - - edited

          So, as suggested earlier in this issue, we switched from TCP keepalives to SSH keepalives, i.e. we set the following on the slaves (in /etc/ssh/sshd_config):

          #to work-around jenkins slave connection dropouts
          ClientAliveCountMax 10
          ClientAliveInterval 60

          and this appears to have fixed the problem for us.
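
For reference, the two sshd settings above work together: sshd sends a client-alive probe every ClientAliveInterval seconds and drops the connection after ClientAliveCountMax consecutive probes go unanswered. The sketch below only does that arithmetic; the function name is made up for illustration and it does not read or modify any sshd configuration.

```python
from typing import Optional


def sshd_drop_after_seconds(client_alive_interval: int,
                            client_alive_count_max: int) -> Optional[int]:
    """Approximate seconds of unanswered probes before sshd disconnects.

    Hypothetical helper for illustration only. Returns None when the
    interval is 0, because sshd then sends no client-alive probes at all.
    """
    if client_alive_interval <= 0:
        return None
    return client_alive_interval * client_alive_count_max


# Values quoted in this thread: interval 60 with count 10, and interval 60 with count 99.
print(sshd_drop_after_seconds(60, 10))  # 600
print(sshd_drop_after_seconds(60, 99))  # 5940
```

With interval 60 and count 10, an unresponsive connection is tolerated for roughly ten minutes before sshd closes it.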

          Guy Rozendorn added a comment -

          We set the following on all our slaves:

          ClientAliveCountMax 99
          ClientAliveInterval 60
          TCPKeepAlive no
          

          rebooted them all, and we are still getting this exception

          Zhijun Xu added a comment -

          @Rozendorn, I think you should set TCPKeepAlive to yes, try it

          Guy Rozendorn added a comment -

          forever_xt, TCPKeepAlive yes is the default, which doesn't work either

          sanga added a comment -

          As an update to this, we have a bug in a script of ours which redeployed our build slaves. But even after this (and with both TCP and SSH keepalives enabled) we're occasionally seeing this bug. One possible explanation is that it is related to the load on the Jenkins master. That has been mentioned earlier in this issue as a possible cause, and we noticed it too: updating from 1.489 to 1.509.2 significantly increased the load on our Jenkins master. So we've given the master some more resources and tweaked the JVM options a bit to see if that improves things at all.

          @Zhijun: out of curiosity, how is the load on your Jenkins master? Is it swapping at all?

          Guy Rozendorn added a comment -

          Looking at our master while Jenkins is up (no job is running), the Jenkins java process takes 100% of one of the CPUs

          Tasks:  84 total,   1 running,  83 sleeping,   0 stopped,   0 zombie
          %Cpu(s): 51.4 us,  0.0 sy,  0.0 ni, 48.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
          KiB Mem:   8178392 total,  7231500 used,   946892 free,   381796 buffers
          KiB Swap:  8386556 total,    20888 used,  8365668 free,  5538224 cached
          
            PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
          12077 jenkins   20   0 5229m 873m 7428 S  99.7 10.9  11621:35 java
          

          Does anyone here know how to debug this, or have references on how to do so?
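
One general JVM technique for a process pinning a CPU (not specific to Jenkins, and not something confirmed in this thread): list per-thread CPU usage with `top -H -p <pid>`, take a thread dump with `jstack <pid>`, and match the busy thread's ID from top (decimal) against the `nid=` field in the dump (hex on older JVMs). The sketch below does only the matching step; the function name and the sample dump lines are made up for illustration.

```python
import re
from typing import Optional


def find_thread_by_tid(thread_dump: str, tid: int) -> Optional[str]:
    """Return the dump header line whose nid equals the decimal thread ID.

    Hypothetical helper. Accepts nid written as hex (nid=0x2f33) or as
    plain decimal (nid=12083), since the notation depends on the JVM.
    """
    for line in thread_dump.splitlines():
        match = re.search(r"\bnid=(0x[0-9a-fA-F]+|\d+)", line)
        if not match:
            continue
        raw = match.group(1)
        nid = int(raw, 16) if raw.lower().startswith("0x") else int(raw)
        if nid == tid:
            return line.strip()
    return None


# Made-up sample lines in the usual thread dump header shape.
SAMPLE_DUMP = '''
"Example worker A" daemon prio=10 tid=0x00007f0000000001 nid=0x2f2d waiting on condition
"Example worker B" daemon prio=10 tid=0x00007f0000000002 nid=0x2f33 runnable
'''

# 12083 decimal is 0x2f33, so this prints the "Example worker B" line.
print(find_thread_by_tid(SAMPLE_DUMP, 12083))
```

The stack frames printed under the matching header in a real dump show what the busy thread is doing.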

          Guy Rozendorn added a comment -

          after:

          • applying the ssh settings mentioned above on all our slaves
          • adding more RAM and CPUs to the master
          • spreading our nightly runs over a longer timeframe so that as few jobs as possible run concurrently

          we're not seeing this issue. However, when we start running jobs in parallel (globally in Jenkins, not on individual slaves, each of which has only 1 executor), we see this issue again.

          Marc Seeger added a comment -

          I just witnessed it live on a slave today.
          Some findings:

          1. Once the slave started failing, subsequent (different) jobs failed too. (I tested 3 jobs; all of them failed with the same error.)
          2. Just disconnecting and reconnecting the slave made it work again.

          Guy Rozendorn added a comment -

          We had some issues in our lab, which forced us to re-install all of our slaves (84 and counting).
          We are still experiencing this issue

          It seems that after this happens, the slave remains connected to Jenkins. However, I can't tell what happens if you try to run another job on it, because we revert the slave VM from snapshot after every run (whether it is successful or not)

          Danny Staple added a comment -

          Ok - I've found something on this today. If you have very "chatty" jobs on the slaves which output a lot of console data, try to log/redirect it to a file. They aren't necessarily the root cause, but they make the failure more likely.

          If a job is running, but quiet, you can unplug a slave's network cable for a few seconds, put it back in, and things will pretty much continue as before. However, a slave running a chatty job will die with an I/O error almost immediately.

          If you can redirect to file, you may see a big reduction in these.
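
A minimal sketch of that redirection, assuming the build step can be wrapped in a small Python script: the chatty command writes to a log file on the slave, and only the exit code and the last few lines are returned for the console. The function name, log path, and demo command are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_quietly(cmd, log_path, tail_lines=20):
    """Run cmd with stdout and stderr sent to log_path.

    Returns (exit code, last tail_lines lines of the log) so the console
    receives a short summary instead of the full output stream.
    """
    log_file = Path(log_path)
    with log_file.open("w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    tail = log_file.read_text(errors="replace").splitlines()[-tail_lines:]
    return result.returncode, tail


if __name__ == "__main__":
    # Stand-in for a noisy build step: prints 1000 lines.
    chatty = [sys.executable, "-c", "for i in range(1000): print('line', i)"]
    log_path = Path(tempfile.gettempdir()) / "chatty-build.log"
    code, tail = run_quietly(chatty, log_path, tail_lines=3)
    print("exit code:", code)
    print("\n".join(tail))
```

The full log stays on the slave's disk, so it would need to be archived separately if it is wanted after the build.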

          Guy Rozendorn added a comment -

          After updating all our jobs to emit output every 10 seconds, this occurs less frequently, but it still happens a few times a week.
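
A periodic-output wrapper of this kind could look roughly like the sketch below, assuming the job is driven from Python. A background thread emits a timestamped line at a fixed interval while the real work runs. The function name and the message format are illustrative, not the script actually used here.

```python
import threading
import time
from datetime import datetime


def run_with_heartbeat(work, interval_seconds=10.0, emit=print):
    """Call work() while a background thread emits a line every interval.

    Hypothetical helper: returns whatever work() returns and always stops
    the heartbeat thread, even if work() raises.
    """
    done = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once done is set.
        while not done.wait(interval_seconds):
            emit(f"( {datetime.now():%Y-%m-%d %H:%M:%S} running )")

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        done.set()
        thread.join()


if __name__ == "__main__":
    # Stand-in for a long, silent build step.
    run_with_heartbeat(lambda: time.sleep(3), interval_seconds=1.0)
```

As the comment above notes, keeping the console active reduced the failures but did not eliminate them, so this is a mitigation rather than a fix.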

          Jesse Glick added a comment -

          Essentially a duplicate of JENKINS-1948.

            Unassigned Unassigned
            dumghen Ghenadie Dumitru