ECS build nodes disconnect from Master

This issue is archived. You can view it, but you can't modify it. Learn more

XMLWordPrintable

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component/s: remoting
    • Environment:
      Jenkins version 2.121.1 running on Windows
      remoting version 3.17
      ECS build nodes running on CentOS 7

      We are running jenkins jobs remotely on ECS build nodes connected via JNLP using the remoting agent. These build nodes connect successfully every time but after some execution time has lapsed, the JNLP connection is terminated causing the build job to fail with the following stack trace:

      java.nio.channels.ClosedChannelException
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:313)
      at hudson.remoting.Channel.close(Channel.java:1446)
      at hudson.remoting.Channel.close(Channel.java:1399)
      at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
      at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
      at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      at java.util.concurrent.FutureTask.run(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      at java.lang.Thread.run(Unknown Source)
      Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 10.43.3.65/10.43.3.65:57573' is disconnected.
      at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:209)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:278)
      at com.sun.proxy.$Proxy74.isAlive(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
      at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
      at hudson.model.Build$BuildExecution.build(Build.java:206)
      at hudson.model.Build$BuildExecution.doRun(Build.java:163)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
      at hudson.model.Run.execute(Run.java:1727)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      at hudson.model.ResourceController.execute(ResourceController.java:97)
      at hudson.model.Executor.run(Executor.java:429)
      Build step 'Execute shell' marked build as failure

       

      In the Jenkins logs we see the following stack trace:

      java.nio.channels.ClosedChannelException at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209) at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222) at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832) at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200) at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213) at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800) at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173) at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314) at hudson.remoting.Channel.close(Channel.java:1450) at hudson.remoting.Channel.close(Channel.java:1403) at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746) at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99) at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)

       

      The closest post I found was from a user running the build nodes on windows and Jenkins was behind an AWS ELB. My build nodes are linux and jenkins is not behind an ELB.

            Assignee:
            Jeff Thompson
            Reporter:
            Fernando Fraticelli
            Archiver:
            Jenkins Service Account

              Created:
              Updated:
              Resolved:
              Archived: