- New Feature
- Resolution: Unresolved
- Major
- None
I have a lot of user workstations used as slaves in Hudson.
I also have a big compile "task" split into several steps (i.e. jobs).
Thanks to this, I can reduce the overall compile time.
But when a slave dies (for example, on a system shutdown) while a job is running on it, the job fails,
and the whole process is lost:
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:137)
at hudson.remoting.Channel.call(Channel.java:629)
at hudson.FilePath.act(FilePath.java:745)
at hudson.FilePath.act(FilePath.java:738)
at hudson.FilePath.mkdirs(FilePath.java:804)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1116)
at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:480)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:412)
at hudson.model.Run.run(Run.java:1362)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:145)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:680)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:971)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:953)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
It would be nice if Hudson detected that the slave had died and then resubmitted the job to another slave "quietly":
- Without incrementing the build number
- Without triggering the downstream jobs
In other words, if a slave dies, its jobs are relaunched silently on other slaves.
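To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is not Hudson code: `SilentRetry`, `NodeTask`, `isChannelLoss` and `runWithFailover` are hypothetical names invented for illustration, and the slave names in the usage note are made up. It assumes a lost slave can be recognised by an `IOException` with the message "Unexpected termination of the channel" somewhere in the cause chain, as in the trace above, and that failures surface as unchecked exceptions.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch only; none of these types are Hudson APIs. */
final class SilentRetry {

    /** Stand-in for a build step that runs on a named slave (assumption). */
    interface NodeTask<T> {
        T runOn(String nodeName);
    }

    /** Message taken from the stack trace in this report. */
    private static final String CHANNEL_LOST = "Unexpected termination of the channel";

    /** Walks the cause chain looking for the channel-termination IOException. */
    static boolean isChannelLoss(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof IOException
                    && t.getMessage() != null
                    && t.getMessage().contains(CHANNEL_LOST)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Tries the task on each slave in order. A lost channel moves on to the
     * next slave; any other failure is rethrown at once as a real build failure.
     */
    static <T> T runWithFailover(NodeTask<T> task, List<String> nodeNames) {
        RuntimeException lastChannelLoss = null;
        for (String nodeName : nodeNames) {
            try {
                return task.runOn(nodeName);
            } catch (RuntimeException e) {
                if (!isChannelLoss(e)) {
                    throw e; // genuine failure: do not hide it
                }
                lastChannelLoss = e; // slave died: try the next one quietly
            }
        }
        if (lastChannelLoss != null) {
            throw lastChannelLoss; // every slave died
        }
        throw new IllegalStateException("no slaves available");
    }
}
```

With slaves named, say, "ws-1" and "ws-2", a call such as `SilentRetry.runWithFailover(task, Arrays.asList("ws-1", "ws-2"))` would return the result from "ws-2" if "ws-1" shut down mid-run. Because the retry happens inside a single call, the caller sees one outcome, which is the analogue of keeping the same build number and not triggering downstream jobs for the aborted attempt.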
- is related to: JENKINS-29550 exception thrown when automatically deleting old workers (Open)