Details
Type: Bug
Status: Reopened
Priority: Major
Resolution: Unresolved
Environment:
Master - Jenkins 1.562, OSX Mavericks, Java 1.6
Slave - Windows Server 2008 R2, Java JRE 1.8
Description
When a Windows Jenkins slave is connected to an OSX master (with the slave set up according to https://wiki.jenkins-ci.org/display/JENKINS/Step+by+step+guide+to+set+up+master+and+slave+machines), disconnecting either from the slave side or from the master (by selecting 'disconnect' from Nodes > NodeName) leaves the slave unable to reconnect until the master Jenkins is restarted, and an error is shown in the node information. This is extremely inconvenient, as it means that the slave machine must be accessed every time the connection is interrupted (e.g. by a restart of Jenkins or of the master machine). The following stack trace is seen on disconnect:
Connection was broken
java.io.IOException: Failed to abort
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:184)
at org.jenkinsci.remoting.nio.NioChannelHub.abortAll(NioChannelHub.java:599)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:481)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:663)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:430)
at org.jenkinsci.remoting.nio.Closeables$1.close(Closeables.java:20)
at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:289)
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:226)
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:224)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:474)
Issue Links
- is duplicated by JENKINS-22714 all slaves are terminated when javaw is stopped on one slave (Resolved)
- is related to JENKINS-24050 All slaves disconnect and no new slaves can connect due to CancelledKeyException in org.jenkinsci.remoting (Resolved)
We are seeing this on a Windows master with 1.620. When adding a new node, we typically connect via the JNLP link, then install it as a service. We hit the issue when the service client reconnects. Perhaps this helps: because the master is secured with HTTPS, the first service connection won't have valid certificate info (and we suspect this triggers the issue on the master side). We update the XML with the certificate info, then stop and restart the service, but at this stage the master is already in a bad state: not only can the new slave not reconnect, the master actually loses its connection to all other slaves as well. Our workaround so far is restarting the master.
10:17:07 java.io.IOException: remote file operation failed: C:\JSBuilds\workspace****************** at hudson.remoting.Channel@1530a3e:********: hudson.remoting.ChannelClosedException: channel is already closed
10:17:07 at hudson.FilePath.act(FilePath.java:987)
10:17:07 at hudson.FilePath.act(FilePath.java:969)
10:17:07 at hudson.FilePath.mkdirs(FilePath.java:1152)
10:17:07 at hudson.model.AbstractProject.checkout(AbstractProject.java:1275)
10:17:07 at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:610)
10:17:07 at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
10:17:07 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:532)
10:17:07 at hudson.model.Run.execute(Run.java:1741)
10:17:07 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
10:17:07 at hudson.model.ResourceController.execute(ResourceController.java:98)
10:17:07 at hudson.model.Executor.run(Executor.java:381)
10:17:07 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
10:17:07 at hudson.remoting.Channel.send(Channel.java:550)
10:17:07 at hudson.remoting.Request.call(Request.java:129)
10:17:07 at hudson.remoting.Channel.call(Channel.java:752)
10:17:07 at hudson.FilePath.act(FilePath.java:980)
10:17:07 ... 10 more
10:17:07 Caused by: java.io.IOException
10:17:07 at hudson.remoting.Channel.close(Channel.java:1110)
10:17:07 at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:118)
10:17:07 at hudson.remoting.PingThread.ping(PingThread.java:126)
10:17:07 at hudson.remoting.PingThread.run(PingThread.java:85)
10:17:07 Caused by: java.util.concurrent.TimeoutException: Ping started at 1441990735275 hasn't completed by 1441990975286
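The two numbers in the TimeoutException message above look like epoch-millisecond timestamps, so subtracting them shows how long the ping was outstanding before the channel was closed. The sketch below is only an illustration: the class name PingGap and the method elapsedMillis are made up for this example and are not part of Jenkins or the remoting library. The result is roughly four minutes, which would be consistent with a ping timeout of about that length; that is an inference from the numbers, not something the trace states.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jenkins: extracts the two timestamps
// from a "Ping started at ... hasn't completed by ..." message.
public class PingGap {
    // Captures the start and deadline values (assumed to be epoch milliseconds).
    private static final Pattern PING =
        Pattern.compile("Ping started at (\\d+) hasn't completed by (\\d+)");

    // Returns the difference between the second and first timestamp, in milliseconds.
    static long elapsedMillis(String message) {
        Matcher m = PING.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("not a ping timeout message: " + message);
        }
        return Long.parseLong(m.group(2)) - Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Message text copied from the last line of the trace above.
        String msg = "Ping started at 1441990735275 hasn't completed by 1441990975286";
        long ms = elapsedMillis(msg);
        System.out.println(ms + " ms (~" + ms / 1000 + " s)"); // prints "240011 ms (~240 s)"
    }
}
```

If the gap in your own logs is similar, the slave probably stopped answering pings well before the build failed with "channel is already closed", so the master-side log around the earlier timestamp may be the more useful place to look.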