-
Bug
-
Resolution: Fixed
-
Critical
-
Platform: All, OS: All
Another intermittent problem.
The master is Linux; the target is FreeBSD 4.9.
FATAL: Unable to delete script file /var/tmp/hudson60616.sh
hudson.util.IOException2: remote file operation failed
at hudson.FilePath.act(FilePath.java:313)
at hudson.FilePath.delete(FilePath.java:510)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:70)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:34)
at hudson.model.Build$RunnerImpl.build(Build.java:130)
at hudson.model.Build$RunnerImpl.doRun(Build.java:105)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:231)
at hudson.model.Run.run(Run.java:756)
at hudson.model.Build.run(Build.java:85)
at hudson.model.ResourceController.execute(ResourceController.java:70)
at hudson.model.Executor.run(Executor.java:82)
Caused by: java.io.IOException: already closed
at hudson.remoting.Channel.send(Channel.java:316)
at hudson.remoting.Request.call(Request.java:81)
at hudson.remoting.Channel.call(Channel.java:390)
at hudson.FilePath.act(FilePath.java:310)
... 10 more
Build was aborted
FATAL: null
java.lang.NullPointerException
at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:78)
at hudson.model.AbstractBuild$AbstractRunner.performAllBuildStep(AbstractBuild.java:309)
at hudson.model.AbstractBuild$AbstractRunner.performAllBuildStep(AbstractBuild.java:297)
at hudson.model.Build$RunnerImpl.post2(Build.java:118)
at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:282)
at hudson.model.Run.run(Run.java:774)
at hudson.model.Build.run(Build.java:85)
at hudson.model.ResourceController.execute(ResourceController.java:70)
at hudson.model.Executor.run(Executor.java:82)
- jenkins_fatal_io_exception.txt
- 4 kB
- Erik Purins
- is duplicated by
  - JENKINS-8700 "Node offline during build" and problems reconnecting. (Resolved)
  - JENKINS-12235 FATAL, Unable to delete script file, IOException2, remote file operation failed, unexpected termination of channel (Resolved)
  - JENKINS-5073 hudson.util.IOException2: Failed to join the process - on a Windows slave (Resolved)
  - JENKINS-7391 "hudson.util.IOException2: Failed to join the process" during MSBuild task (Resolved)
  - JENKINS-7690 Connection to slave is dropped in the middle of a build (Resolved)
- relates to
  - JENKINS-6817 FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel (Closed)
  - JENKINS-14332 Repeated channel/timeout errors from Jenkins slave (Closed)
[JENKINS-1948] Intermittent slave disconnections with secondary symptoms
Same issue here with a master running Ubuntu Natty and slaves (VirtualBox VMs) running FreeBSD, Gentoo, Ubuntu (Hardy, Lucid, Maverick, Natty) and Windows, and also with a real host running OS X 10.6.
I think we can rule out an OS-specific issue.
It seems to be more than intermittent here. Is there anything I can do to help diagnose it?
It now occurs regularly across all the slaves, and for some of them possibly on every build.
More evidence on a specific setup:
One of the slaves (macadam.local) is a real host (as opposed to a virtual machine running on the same host as the master).
It's configured with 'Launch slave via execution of command on the Master', using sh -c "${BABUNE_ROOT}/bin/start-slave-via-ssh.sh macadam.local"
The start-slave-via-ssh.sh script is:
#!/bin/bash -x
# This script starts a slave via ssh under some assumptions:
# - slave.jar needs to be up to date,
# - respect the master ssh config and forward the agent
NODE=$1
# Double quotes so that ${NODE} expands here on the master; '~' is left for the remote side
JAR_REMOTE_PATH="~/babune/slaves/${NODE}/slave.jar"
# We start in /, let's go to a friendlier place
cd "${BABUNE_ROOT}" || exit 1
# Copy the slave.jar file
scp master/war/WEB-INF/slave.jar "${NODE}:${JAR_REMOTE_PATH}"
# Start slave.jar
ssh "${NODE}" java -jar "${JAR_REMOTE_PATH}"
This is needed because I prefer to handle the ssh settings in ~/.ssh/config, which is not respected when using 'Launch slave agents on UNIX machines via ssh'.
With this slave started and a job running, I started Wireshark on the master with a capture filter of 'host macadam.local and port 22' and also watched the job console displaying progress as tests passed.
The suspicious thing is that while packets are flowing (slowly, but flowing) on the ssh connection, as shown by Wireshark, no updates happen on the console. Tests generally take far less than one second, yet even after waiting for minutes nothing appears on the console.
Does that ring any bells?
Is there some way to make Jenkins a bit more verbose in this area?
This seems to indicate that the connection between the master and the slave is broken at a higher level than the ssh one.
I suspect that the master eventually gave up somehow but still tries to use the connection to delete the remote file, and fails.
Some more evidence, yet no solution... :-|
OpenBSD 5.0-current and Jenkins 1.427.
In the middle of a kernel build, suddenly there is no activity in
the console log for a long time and then I get this very similar stack
trace.
FATAL: Unable to delete script file /tmp/hudson2998673953817336628.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson2998673953817336628.sh at hudson.remoting.Channel@652b32ea:openbsd
at hudson.FilePath.act(FilePath.java:754)
at hudson.FilePath.act(FilePath.java:740)
at hudson.FilePath.delete(FilePath.java:995)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:682)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:448)
at hudson.model.Run.run(Run.java:1376)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:230)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:486)
at hudson.remoting.Request.call(Request.java:110)
at hudson.remoting.Channel.call(Channel.java:668)
at hudson.FilePath.act(FilePath.java:747)
... 13 more
Caused by: hudson.remoting.Channel$OrderlyShutdown
at hudson.remoting.Channel$CloseCommand.execute(Channel.java:813)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1049)
Caused by: Command close created at [...]
It seems that the build process stalls trying to write to stdout
because the java slave process no longer consumes the output
of the build process. Before the build was aborted, I had a
chance to look at the process table:
$ pstree
[...]
| \-+= 07917 root sshd: hudson [priv] (sshd)
|   \-+- 26930 hudson sshd: hudson@notty (sshd)
|     \-+= 03191 hudson ksh -c ksh
|       \-+- 18833 hudson /home/hudson/jdk/bin/java -jar slave.jar
|         \-+- 30004 hudson /bin/sh -xe /tmp/hudson2998673953817336628.sh
|           \-+- 07697 hudson /bin/sh -xe /tmp/hudson2998673953817336628.sh
|             \--- 16885 hudson make [...]
$ ps auxOwchan|grep make
hudson 16885 0.0 0.2 2788 3740 ?? I 8:24AM 0:00.04 make pipewr
$ sudo ktrace -p 18833
$ sudo kdump |head -15
18833 java EMUL "native"
18833 java RET read 51/0x33
18833 java CALL read(0,0x20cd76570,0x2000)
18833 java GIO fd 0 read 696 bytes
"xr\0\^Whudson.remoting.Command\0\0\0\0\0\0\0\^A\^B\0\^AL\0 createdAtt\0\^ULjava/lang/Exception;xpsr\0\^^hudson.remoting.Command$Source\0\0\0\0\0\0\0\^A\^B\0\^AL\0\^Fthis$0t\0\^YLhudson/remoting/\ Command;xr\0\^Sjava.lang.Exception\M-P\M-}\^_>\^Z;\^\\M-D\^B\0\0xr\0\^Sjava.lang.Throwable\M-U\M-F5'9w\M-8\M-K\^C\0\^CL\0\^Ecauset\0\^ULjava/lang/Throwable;L\0\rdetailMessaget\0\^RLjava/lang/String;[\ \0 stackTracet\0\^^[Ljava/lang/StackTraceElement;xpq\0~\0\vpur\0\^^[Ljava.lang.StackTraceElement;\^BF*<<\M-}"9\^B\0\0xp\0\0\0\bsr\0\^[java.lang.StackTraceElementa \M-E\M^Z&6\M-]\M^E\^B\0\^DI\0 lineNumberL\0\^NdeclaringClassq\0~\0 L\0\bfileNameq\0~\0 L\0 methodNameq\0~\0 xp\0\0\0>t\0\^Whudson.remoting.Commandt\0\fCommand.javat\0\^F<init>sq\0~\0\^N\0\0\0/q\0~\0\^Pq\0~\0\^Qq\0~\0\^Rsq\0~\0\^N\0\0\^C)t\0$hudson.remoting.Channel$CloseCommandt\0\fC\ hannel.javaq\0~\0\^Rsq\0~\0\^N\0\0\^C)q\0~\0\^Uq\0~\0\^Vq\0~\0\^Rsq\0~\0\^N\0\0\^CV"
18833 java RET read 696/0x2b8
18833 java CALL gettimeofday(0x20cd77c40,0)
18833 java PSIG SIGPROF caught handler=0x2034ee1e0 mask=0x0
18833 java RET gettimeofday 0
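The "pipewr" wait channel on make is what one would expect when a process writes into a pipe that nobody drains. A minimal sketch that reproduces that state outside Jenkins follows; the function name pipe_writer_state and the 10 MB payload are made up for illustration, and it assumes a system with mkfifo, mktemp and head -c available.

```shell
# Hypothetical demo, not taken from the thread: a writer stalls once the
# pipe buffer is full because the reader holds the pipe open without reading.
pipe_writer_state() {
  local dir reader writer state
  dir=$(mktemp -d)
  mkfifo "$dir/pipe"

  # Reader: keeps the pipe open on stdin but never reads from it.
  sleep 3 < "$dir/pipe" > /dev/null &
  reader=$!

  # Writer: tries to push about 10 MB, far more than a pipe buffer holds.
  head -c 10000000 /dev/zero > "$dir/pipe" 2>/dev/null &
  writer=$!

  # Give the writer time to fill the buffer, then check whether it is still alive.
  sleep 1
  if kill -0 "$writer" 2>/dev/null; then
    state="writer blocked"
  else
    state="writer finished"
  fi

  # Clean up both background processes and the temporary directory.
  kill "$writer" "$reader" 2>/dev/null || true
  wait "$writer" "$reader" 2>/dev/null || true
  rm -rf "$dir"
  echo "$state"
}
```

Calling pipe_writer_state should report the writer as blocked, which matches the picture above: the build keeps running only as long as slave.jar keeps reading its output.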
More food for thought.
I was accidentally still using pthreads even though I normally replace libpthread
with librthread during the build. However when the issue came up, it didn't help to
restart the slave - the issue was showing up consistently in every single build. But
now that I've actually enabled librthread again and rebooted the slave once more, it
builds just fine, as usual.
I have a very similar issue when I use a Hudson job to do a very time-consuming svn merge. The console doesn't produce any output while this long svn merge is pending (this is expected). After 38 minutes the error below occurs.
I triggered the job twice, and both failed after 38 minutes.
System spec
Jenkins 1.431
[root@srv-ind-dvm17zl ~]# cat /etc/*release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
[root@srv-ind-dvm17zl ~]# uname -a
Linux srv-ind-dvm17zl.vanenburg.com 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@srv-ind-dvm17zl ~]# java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
Console log
setendrevision:
[echo] Prepare DROP-merge command || revision : revision
[getmergecmd]
[getmergecmd] Merge command: svn merge -r 189757:199394 "http://srv-ind-scrat:8080/repos/bcp/branches/private/batic/batic-wip" "/opt/jenkins/workspace/batic-wip-dropmerge/trunk" --accept postpone --non-interactive
FATAL: Unable to delete script file /tmp/hudson8787918218652977262.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson8787918218652977262.sh at hudson.remoting.Channel@7c1f7480:srv-ind-dvm17zl
at hudson.FilePath.act(FilePath.java:754)
at hudson.FilePath.act(FilePath.java:740)
at hudson.FilePath.delete(FilePath.java:995)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:693)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:459)
at hudson.model.Run.run(Run.java:1376)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:230)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:499)
at hudson.remoting.Request.call(Request.java:110)
at hudson.remoting.Channel.call(Channel.java:681)
at hudson.FilePath.act(FilePath.java:747)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1093)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2570)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1087)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:149)
at hudson.remoting.Channel.call(Channel.java:681)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
at $Proxy38.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:850)
at hudson.Launcher$ProcStarter.join(Launcher.java:336)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:693)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:459)
at hudson.model.Run.run(Run.java:1376)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:230)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:273)
at hudson.remoting.Channel.terminate(Channel.java:732)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1117)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1093)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2570)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1087)
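Whether the 38 silent minutes are what kills the channel is not established in this thread. If idle time is the suspect, one cheap experiment is to wrap the quiet step so that something is printed at a fixed interval. The sketch below is only that, an experiment; with_heartbeat is a hypothetical helper name and the message text is arbitrary.

```shell
# Hypothetical helper: run a long, silent command while a background loop
# prints a line every INTERVAL seconds. The wrapped command's exit status
# is passed through unchanged.
with_heartbeat() {
  local interval hb rc
  interval=$1
  shift

  # Background loop that emits a heartbeat line until it is killed.
  ( while sleep "$interval"; do
      echo "[heartbeat] still running: $*"
    done ) &
  hb=$!

  # Run the real command and remember how it exited.
  rc=0
  "$@" || rc=$?

  # Stop the heartbeat loop and hand back the command's status.
  kill "$hb" 2>/dev/null || true
  wait "$hb" 2>/dev/null || true
  return "$rc"
}
```

In the build step this would be used as, for example, with_heartbeat 60 svn merge ... in place of the bare svn merge. If the build still dies at the same point, idle time is probably not the cause.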
Same issue here with Jenkins 1.408 on CentOS, slaves on FreeBSD.
I think I'm seeing much the same issue here: Ubuntu master, Windows slave, Jenkins 1.455.
09:50:06 [exec] 1>ClCompile:
09:50:06 [exec] 1> appsvcs.cpp
09:50:26 [exec] 1> All outputs are up-to-date.
09:52:39 FATAL: Unable to delete script file C:\Windows\TEMP\hudson7112635489663392827.bat
09:52:39 hudson.util.IOException2: remote file operation failed: C:\Windows\TEMP\hudson7112635489663392827.bat at hudson.remoting.Channel@504f4373:Sue
09:52:39 at hudson.FilePath.act(FilePath.java:784)
09:52:39 at hudson.FilePath.act(FilePath.java:770)
09:52:39 at hudson.FilePath.delete(FilePath.java:1075)
09:52:39 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
09:52:39 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
09:52:39 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
09:52:39 at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
09:52:39 at hudson.model.Build$RunnerImpl.build(Build.java:178)
09:52:39 at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
09:52:39 at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:473)
09:52:39 at hudson.model.Run.run(Run.java:1410)
09:52:39 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
09:52:39 at hudson.model.ResourceController.execute(ResourceController.java:88)
09:52:39 at hudson.model.Executor.run(Executor.java:238)
09:52:39 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
09:52:39 at hudson.remoting.Channel.send(Channel.java:499)
09:52:39 at hudson.remoting.Request.call(Request.java:110)
09:52:39 at hudson.remoting.Channel.call(Channel.java:681)
09:52:39 at hudson.FilePath.act(FilePath.java:777)
09:52:39 ... 13 more
09:52:39 Caused by: java.net.SocketException: Connection reset
09:52:39 at java.net.SocketInputStream.read(SocketInputStream.java:185)
09:52:39 at java.io.FilterInputStream.read(FilterInputStream.java:133)
09:52:39 at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
09:52:39 at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
09:52:39 at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
09:52:39 at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
09:52:39 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
09:52:39 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
09:52:39 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
09:52:39 at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
09:52:39 FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
09:52:39 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
09:52:39 at hudson.remoting.Request.call(Request.java:149)
09:52:39 at hudson.remoting.Channel.call(Channel.java:681)
09:52:39 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
09:52:39 at $Proxy42.join(Unknown Source)
09:52:39 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:859)
09:52:39 at hudson.Launcher$ProcStarter.join(Launcher.java:345)
09:52:39 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
09:52:39 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
09:52:39 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
09:52:39 at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
09:52:39 at hudson.model.Build$RunnerImpl.build(Build.java:178)
09:52:39 at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
09:52:39 at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:473)
09:52:39 at hudson.model.Run.run(Run.java:1410)
09:52:39 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
09:52:39 at hudson.model.ResourceController.execute(ResourceController.java:88)
09:52:39 at hudson.model.Executor.run(Executor.java:238)
09:52:39 Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
09:52:39 at hudson.remoting.Request.abort(Request.java:273)
09:52:39 at hudson.remoting.Channel.terminate(Channel.java:732)
09:52:39 at hudson.remoting.Channel$ReaderThread.run(Channel.java:1157)
09:52:39 Caused by: java.net.SocketException: Connection reset
09:52:39 at java.net.SocketInputStream.read(SocketInputStream.java:185)
09:52:39 at java.io.FilterInputStream.read(FilterInputStream.java:133)
09:52:39 at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
09:52:39 at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
09:52:39 at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
09:52:39 at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
09:52:39 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
09:52:39 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
09:52:39 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
09:52:39 at hudson.remoting.Channel$ReaderThread.run(Channel.java:1127)
Still an issue in Jenkins 1.467.
FATAL: Unable to delete script file /var/folders/8j/q0m30zk95rv6_58lnchztz680000gn/T/hudson6575302717651863102.sh
hudson.util.IOException2: remote file operation failed: /var/folders/8j/q0m30zk95rv6_58lnchztz680000gn/T/hudson6575302717651863102.sh at hudson.remoting.Channel@38526a51:ISTFrameworks-MacMini
at hudson.FilePath.act(FilePath.java:838)
at hudson.FilePath.act(FilePath.java:824)
at hudson.FilePath.delete(FilePath.java:1129)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:711)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:481)
at hudson.model.Run.run(Run.java:1438)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:239)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:475)
at hudson.remoting.Request.call(Request.java:110)
at hudson.remoting.Channel.call(Channel.java:646)
at hudson.FilePath.act(FilePath.java:831)
... 13 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2266)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2559)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2569)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
at hudson.remoting.Command.readFrom(Command.java:90)
at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request.call(Request.java:149)
at hudson.remoting.Channel.call(Channel.java:646)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
at $Proxy141.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:861)
at hudson.Launcher$ProcStarter.join(Launcher.java:345)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:711)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:481)
at hudson.model.Run.run(Run.java:1438)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:239)
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request.abort(Request.java:273)
at hudson.remoting.Channel.terminate(Channel.java:702)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2266)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2559)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2569)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
at hudson.remoting.Command.readFrom(Command.java:90)
at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
For us, the cause of this error was our build slaves (VMs) running out of memory and self-rebooting.
Your VM reboots the operating system? We are not doing that. If the build slave ran out of memory, maybe it is holding something in a buffer that it doesn't need to (say, keeping long lines of output it could flush to disk), or it has a leak. Since something similar has been happening to us across operating systems, I can help troubleshoot and fix it. It's intermittent for us, but it makes our entire build system untrustworthy to our engineers, and they've started ignoring build e-mails.
I can confirm the same issue on AIX and HP-UX, with intermittent failures. Any pointers for capturing debug information, or a workaround, would be appreciated.
This is still occurring with 1.504 on a Debian squeeze master and an OS X 10.8 slave. If it's related to an unreliable connection and is this frequent, maybe a better error message (instead of the entire remote action call stack) or a more fault-tolerant retry system for remote actions could be implemented?
JENKINS-5073 massaged the reporting a bit; it should no longer give a misleading error about deleting the script file (which is merely an aftereffect of the slave connectivity issue).
I'm also seeing this only on an OS X slave. Windows and Linux slaves are fine.
There are different ways of fixing this issue; in our case, rebooting the slaves helps. Other people solved it by changing SSH settings on the slaves, changing the Java version used for connecting slaves, etc. (as explained in JENKINS-12235 and other tickets on this issue).
So it is crucial for everyone that this issue is solved.
Of course there are some things that Jenkins cannot solve, such as issues with the slaves themselves. If that is the case, it would be good if Jenkins told us a bit more. The error message printed to the console could tell us what the "real" problem is, rather than just saying it was unable to delete the script file. That message misleads us, and we start troubleshooting unrelated things rather than just checking the slave and rebooting it if necessary. So an indication of the underlying problem that prevents Jenkins from deleting the file (a connectivity issue, for example) could improve things.
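Until the message itself improves, the underlying problem can usually be read off the stack traces already posted in this ticket: the last "Caused by:" line names the connectivity failure, not the script file. A small filter along these lines pulls it out of a saved console log. The name root_cause is hypothetical, and treating the last "Caused by:" as the real cause is a heuristic rather than anything Jenkins guarantees.

```shell
# Hypothetical helper: read a console log on stdin and print the text of the
# last "Caused by:" line, which in the traces in this ticket is the
# underlying channel or socket error. Prints nothing if no such line exists.
root_cause() {
  grep 'Caused by: ' | tail -n 1 | sed 's/^.*Caused by: //'
}
```

Fed the 1.467 trace from earlier in this ticket, for example via root_cause < console.log with the log saved to a file, it prints "java.net.SocketException: Connection reset", which points at the connection rather than at the temp script.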
This week we changed all of our roughly 80 slaves from using the SSHLauncher to the CommandLauncher, which launches strace -t -s 4096 ssh ..., with the following lines in .ssh/config:
TCPKeepAlive yes
ServerAliveInterval 10
ServerAliveCountMax 10
LogLevel DEBUG
The reason for using strace is to get a clue whether the connection is dropped first or the master decides it is dead.
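For reference, per the ssh_config manual the client gives up after roughly ServerAliveInterval times ServerAliveCountMax seconds without a reply from the server, so the settings above tolerate about 100 seconds of silence. The sketch below does that arithmetic for a config snippet on stdin. The function name is made up, and the parsing is deliberately naive: it handles only the space-separated "Keyword value" form and ignores Host blocks.

```shell
# Hypothetical helper: estimate how many seconds ssh waits for an
# unresponsive server before disconnecting, from ssh_config lines on stdin.
# Defaults follow the ssh_config manual: interval 0 (keepalives off), count 3.
ssh_dead_peer_timeout() {
  awk '
    tolower($1) == "serveraliveinterval" { interval = $2 }
    tolower($1) == "serveralivecountmax" { count = $2 }
    END {
      if (interval == "") interval = 0
      if (count == "") count = 3
      print interval * count
    }
  '
}
```

A result of 0 means server-alive probes are disabled, in which case only TCPKeepAlive (with the operating system's much longer timers) would notice a dead peer.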
In one of the job executions (which started at 00:00:56) we get this in the log:
Started by timer
Building remotely on host-ci66 in workspace /root/jenkins/workspace/mainline-bdist-develop
Deleting project workspace...
Checkout:mainline-bdist-develop / /root/jenkins/workspace/mainline-bdist-develop - hudson.remoting.Channel@27ecbe67:host-ci66
Using strategy: Default
Last Built Revision: Revision ff29df8b003dde47573ddfb0b463351baee6dea3 (origin/develop)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:174)
at hudson.remoting.Channel.call(Channel.java:722)
at hudson.FilePath.act(FilePath.java:894)
at hudson.FilePath.act(FilePath.java:878)
at hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:942)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1108)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1369)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:676)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:581)
at hudson.model.Run.execute(Run.java:1593)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:242)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:299)
at hudson.remoting.Channel.terminate(Channel.java:782)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2595)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
at hudson.remoting.Command.readFrom(Command.java:92)
at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
and the slave.log ends with this:
00:00:56 select(7, [3 4], [3 5], NULL, {10, 0}) = 4 (in [3 4], out [3 5], left {9, 999997}) 00:00:56 read(4, "\10\0\245\231\255B,\304\304\7\23\3\0\0000\7\0\0d\0\0\0hudson/plugins/git/browser/FisheyeGitRepositoryBrowser$FisheyeGitRepositoryBrowserDescriptor$1.class\265U{O\23A\20\377m)\\\251\247\205\"\240@\364\264UK\21\17|\240\370\226\332\"Z5\341\225h\214\361\350-\355\312q[\357\256\5\376\364c\370-0\0214\232\370\1\374P\306\331k!&\0264&m\2567\2733\263\363\370\315\314\336\217\237_\277\3\230\304\223n\34\303\250ze\343H`L\303\3058\242\30\217\23\347R\f\246\242\23qR\274\34\303\25E\257j\270\246\230S\32\256k\270\301p\254n9\351U\341Z\316\262\345\3248C\262\370\326\252[\246c\271es!\360\204[\276\305\320\25T\204\237\236\320@\353s\5\341W\370\26\237\25\301<\257J_\4\322\333\232\361\344\206\317\275\207\334/y\242J\34\6}\316u\271\227s,\337\347>\303\353b\245f\373\3225\253N\255,\\\337,\213\300\\i\0343\0171\231\376'w*\306\333\302\25\301]\6?\323^W\177\0024\272\314\20\315I\233\340K\24\205\313\237\325\326W\270\267h\2558!\240\262\244\320\365\204\3327\231Q\5(\3\30\336\2645\330\364$!\323Y\252\360\322\32\303\251\314\350\236\263Z \34\263 \275u\252\272\260\255@H\227\24\31E\326\337HOHs\356y~\263\304\253MY|\177\343k\270M\373\5Y\363J\274 T:\306!\301\\R\366\10\230\274[rH\342\226\237\362\240\"m\35wpWG\22\3:z\320\253c\20\367\250iB\347.\17\314\245\371\242\342\335\327\361\0003\f\232\362\220\337\342:rx\250!\257\243\0\203\341\352\"\341h\320c\31u\225\211A\307\214\225Z`\210\300\260%\367\335\v\201\341H\271f8b\215\33\3736fA}\322\327\"Q\35\2170G\315\332\336\2320\244\16\256C\232R\3105\n\26\333[jx\314\360\252\235AQw(86M\342\326\35\302\177\241A\367\221\241\212\3332\ff\311s\30&2-\306\340\360\356\352\310\2509\351ouP\315\217\254rr\222i\212\233-@6\367\2524S[]\345\36\267\347\271es5\3601\272\266\354E\276\0310\214g\16Rk\341\354%\303\320\301\2012D$A\337\311=O\241\322[\261\\\333\341\277\265\10C\276E\n-\247\346/\200L\375_Aa\320\5\236\240\373?\322\323\243F\7 
Js\2448\364\37\304\t\272VN\322j\232\366\35D\23\331\261\35\260\354\305\35D\262\237\321\3611T\34\242w\27)\2\353\30\246\267\36\256\23\30\301)\242Q\234&7\221\320\314{t\206\206\307\263\331o\210\276\310~B\3443:w\321\225\324v\21\373\0\355\v\272\267\223\361/8\262]\314*\351\330.\216n\207G\206q\226\f\217\340LH\33N\7\310\34\360\216\242\364H\22\220\244\216\0246\302 \6H'\205x\250\257\302\31o\206\223\n\203\215\220\2154et.L\340<.\204\201\366!\203~Z\r\221\244\0177q\34\352{\330\370%\350\2237\35\355\376\5PK\3\4\n\0\0\10\10\0\245\231\255B\342o\20\6\230\5\0\0\314\f\0\0b\0\0\0hudson/plugins/git/browser/FisheyeGitRepositoryBrowser$FisheyeGitRepositoryBrowserDescriptor.class\265W{S\23W\24\377\255\0016\204E\20\37(\365\21-\266!\321\254\257\252m(-\362\20h\2H\20|\353&\271$\0276\273q\37H\332\332\367\373\365\1\374\24jg\"\326\231N\377\353L?T\247\347nB\0101\240\343\324\314\354}\236\347\357\236s\356\315?\377\376\361'\200\223\370-\200\3\30\363c\\\306\204\37\37\265!\216D\0\223\230\362c:\200K\230\221\221\f\240\25c\242\231\25\315e\321\314\265b?\346\3\270\202\253\1\\\303u?n\10\256\233\1\334\302\355Vj\356\210\221&#\25\300\36dD\303\2\304\262\340GVp\345dp\31\213\22\332.\317\304oO\17\316\316\216\314LJ\330\27_\324\2265\325u\270\256Z,\313V\324i\315q\230e\304$\264\364s\203;\3\22|\241\2769\tMCf\206I\350\210s\203M\272\371\24\263f\265\224N+]q3\255\351s\232\305\305\274\262\330\344\344\270-\341\350(\267s\254\310.rg\206\25L\233;\246U\274`\231\367lf\r3;m\361\2\255HP\306\r\203YC\272f\333\214\330n\305sn\3066\r\265\240\273Yn\330j\226;j\252\314\246n!\262\367\245\324\221o\333\263\314\31\346vA\327\212\223Z\236\354\335\25\352+C\241kFVM:\0267\262D\327f\260{\343\206\355hF\232\210\226Cq\323\312\252Kf\316v\227\230J\313\5\235\fJ\226\373\31v\327e\266\23\213\33\314Q\355\5uQ80\221\234\232\234J-\262\264\23\353{E\247\310\f\237\305\356J\350}\31\355\22\2BqY\247\204\275\233YCt#+iVp\270i\3302\226h\2361\207r,\275t\331\322%\234\10=\17G\325\1/\\FM+?\247\351<\243\t\21$\257yY\323]/\36\32\0\31X\340\206\10\22\242\220\221\227aH82\343\32\16\317\
2639ns\212\231i\315\242\223\240\330\0334\f\323\361\204\332\233\371|\311eV\261\312@\342\375\v|e$_p\212>J2I\302\225\377\347\244\362\24\362\272Z\216\234\224\10lR\265m\345\204hN\222\322\376\264^I\221\326$\317\32\232\343Z\344\177\177#n\21w\325\r;\235W\237;\346\376\360@l@ \2254]+\3ERROR: Connection terminated ESC[8mha:AAAAWB+LCAAAAAAAAABb85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=ESC[0mjava.io.IOException: Unexpected termination of the channel at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50) Caused by: java.io.EOFException at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2595) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369) at hudson.remoting.Command.readFrom(Command.java:92) at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
From what I understand, when the java process on the master node decided that the connection was terminated, the SSH connection was still up and running, with data flowing in both directions.
Would it help if we dumped the stdout and stdin streams into files and attached them to this ticket?
The last report in this thread was in 2014. Since then, Jenkins core and Remoting have received many stability and diagnosability improvements. If somebody still sees the issue reported in this ticket, please reopen it with new logs and other diagnostic information.
Same issue on OpenBSD 4.8, both 32- and 64-bit.