Details
-
Bug
-
Status: Open (View Workflow)
-
Critical
-
Resolution: Unresolved
-
None
-
Slave Windows Server 2012 R
Master Windows Server 2012 R
Jenkins version 1.611
Description
Our Windows slaves are JNLP connected using scheduled tasks (like this: https://wiki.jenkins-ci.org/display/JENKINS/Launch+Java+Web+Start+slave+agent+via+Windows+Scheduler).
When a job starts working with a slave it suddenly fails because the connection was lost. It says something on slave side killed the connection. Few minutes afterwards the slave is back on-line. I am sure it is not a network issue as I tried pinging the network from system start-up till the problem occurs. I could not find find what kills the connection and disable TCP chimney (advice from google) did not work.
The exception is below.
I think proper behaviour would be not to fail the job but attempt to reconnect. Any hints of additional configuration that will help me resolve this issue is appreciated.
Building remotely on vmbam32.eur.ad.sag (R2 Server 2012 Enterprise Windows) in workspace c:\jenkins\workspace\optimize-install
FATAL: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@b7add1[name=vmbam32.eur.ad.sag]
hudson.remoting.RequestAbortedException: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@b7add1[name=vmbam32.eur.ad.sag]
at hudson.remoting.Request.abort(Request.java:296)
at hudson.remoting.Channel.terminate(Channel.java:815)
at hudson.remoting.Channel$2.terminate(Channel.java:492)
at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:628)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
at ......remote call to vmbam32.eur.ad.sag(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1361)
at hudson.remoting.Request.call(Request.java:171)
at hudson.remoting.Channel.call(Channel.java:752)
at hudson.FilePath.act(FilePath.java:980)
at hudson.FilePath.act(FilePath.java:969)
at hudson.FilePath.mkdirs(FilePath.java:1152)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:610)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:532)
at hudson.model.Run.execute(Run.java:1744)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:374)
Caused by: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@b7add1[name=vmbam32.eur.ad.sag]
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:628)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:561)
... 6 more
Hi,
my working solution is the following:
I use windows for master and for the slave as well.
On the slave I have an administrator user with autologin, when the user logged in it waits 1 minute, then starts jenkins slave with java webstart with the jnlp file:
javaws.lnk c:\CI\slave-agent.jnlp
(javaws.lnk is created to be on path)
I use java 1.7.1 because the javaws needs 1.7, but 1.7.89 has a security enhancement, and it does not allow to start the javaws automatically, it needs manual confirmation. (I did not check the versions between 1.7.1 and 1.7.89) 1.7.1 allows the starting.
This way the connection is stable.
BR,
George