Jnlp slave agent die after timeout detected from slave side

      When a JNLP slave detects a ping timeout, it tries to reconnect. But if the master has not noticed the timeout yet, it rejects the new connection from the slave. The JNLP slave agent process aborts once the connection is rejected in this way.

      STDOUT of JNLP process:

      INFO: Ping failed. Terminating the channel.
      java.util.concurrent.TimeoutException: Ping started on 1456918109582 hasn't completed at 1456918349582
      	at hudson.remoting.PingThread.ping(PingThread.java:125)
      	at hudson.remoting.PingThread.run(PingThread.java:86)
      
      Mar 02, 2016 6:32:29 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
      SEVERE: I/O error in channel channel
      java.net.SocketException: Socket closed
      	at java.net.SocketInputStream.read(SocketInputStream.java:190)
      	at java.net.SocketInputStream.read(SocketInputStream.java:122)
      	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
      	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
      	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
      	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
      	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
      	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      
      Mar 02, 2016 6:32:29 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Mar 02, 2016 6:32:39 AM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect
      INFO: Restarting slave via jenkins.slaves.restarter.UnixSlaveRestarter@6523ff4a
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: dev127-virt2
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins.acme.com/hudson/, http://hudson.acme.com/hudson/]
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins.acme.com:37003
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection.
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:306)
      	at hudson.remoting.Engine.run(Engine.java:276)
      

      Slave log on master:

      JNLP agent connected from /10.16.180.145
      <===[JENKINS REMOTING CAPACITY]===>ERROR: Connection terminated
      Connection terminated
      ha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@131dbee3[name=dev127-virt2]
      	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Broken pipe
      	at sun.nio.ch.FileDispatcher.write0(Native Method)
      	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
      	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
      	at sun.nio.ch.IOUtil.write(IOUtil.java:40)
      	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
      	at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.send(FifoBuffer.java:130)
      	at org.jenkinsci.remoting.nio.FifoBuffer.send(FifoBuffer.java:254)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:622)
      	... 7 more
      Slave.jar version: 2.47
      This is a Unix slave
      Slave successfully connected and online
      Connection terminated
      

      Master log:

      2016-03-02 06:32:38,352 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Clock Difference started on Wed Mar 02 06:32:08 EST 2016) Failed to monitor dev127-virt2 for Clock Difference
      java.util.concurrent.TimeoutException
        at hudson.remoting.Request$1.get(Request.java:271)
        at hudson.remoting.Request$1.get(Request.java:206)
        at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
        at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:97)
        at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280)
      
      ... (All monitors time out)
      
      2016-03-02 06:32:42,860 INFO  [hudson.TcpSlaveAgentListener] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) Accepted connection #41773 from /10.16.180.145:58248
      2016-03-02 06:32:42,865 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: dev127-virt2 is already connected to this master. Rejecting this connection.
      2016-03-02 06:32:42,866 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: Unrecognized name: dev127-virt2
      
      ...
      
      2016-03-02 06:33:43,630 INFO  [hudson.slaves.ChannelPinger] (Ping thread for channel hudson.remoting.Channel@5caca20e:dev127-virt2) Ping failed. Terminating the channel.
      java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630
        at hudson.remoting.PingThread.ping(PingThread.java:125)
        at hudson.remoting.PingThread.run(PingThread.java:86)
      
      ...
      
      2016-03-02 06:38:26,902 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Free Temp Space started on Wed Mar 02 06:38:26 EST 2016) Failed to monitor dev127-virt2 for Free Temp Space
      hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:549)
        at hudson.remoting.Request.callAsync(Request.java:204)
        at hudson.remoting.Channel.callAsync(Channel.java:778)
        at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
        at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280)
      Caused by: java.io.IOException
        at hudson.remoting.Channel.close(Channel.java:1105)
        at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
        at hudson.remoting.PingThread.ping(PingThread.java:125)
        at hudson.remoting.PingThread.run(PingThread.java:86)
      Caused by: java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630
        ... 2 more
      
      ... (All monitors fail with "channel is already closed")
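
The timestamps in the two "Ping started on ... hasn't completed at ..." messages above are epoch milliseconds, so the size of the window in which the master still considers the old channel alive can be worked out directly. The sketch below only does that arithmetic; the class and constant names are made up for illustration and are not part of Jenkins remoting. It gives a wait of roughly 240 seconds on each side and a gap of roughly 74 seconds between the agent giving up (06:32:29) and the master giving up (06:33:43), which is the window the rejected handshake at 06:32:42 fell into.

```java
public class PingTimeline {
    // Epoch-millisecond values copied from the TimeoutException messages in the logs above.
    public static final long AGENT_PING_START = 1456918109582L;
    public static final long AGENT_PING_GIVEUP = 1456918349582L;
    public static final long MASTER_PING_START = 1456918183629L;
    public static final long MASTER_PING_GIVEUP = 1456918423630L;

    /** Elapsed milliseconds between two epoch-millisecond timestamps. */
    public static long elapsedMs(long startMs, long endMs) {
        return endMs - startMs;
    }

    public static void main(String[] args) {
        // How long each side waited before declaring its ping dead.
        System.out.println("agent ping waited  (ms): "
                + elapsedMs(AGENT_PING_START, AGENT_PING_GIVEUP));
        System.out.println("master ping waited (ms): "
                + elapsedMs(MASTER_PING_START, MASTER_PING_GIVEUP));
        // How long the master kept the old channel after the agent had already dropped it.
        System.out.println("master lagged agent by (ms): "
                + elapsedMs(AGENT_PING_GIVEUP, MASTER_PING_GIVEUP));
    }
}
```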
      

          [JENKINS-33287] Jnlp slave agent die after timeout detected from slave side

          Oliver Gondža created issue -
          Oliver Gondža made changes -
          Summary Original: Jnlp slave agent die after timeout from slave side New: Jnlp slave agent die after timeout detected from slave side

          René de Groot added a comment -

          If rejecting the connection is the correct behavior of the master, then the agent should keep retrying so that the system as a whole can recover.

          If the rejection is unwarranted because the old connection was dead and the master is too slow to realize this, then the issue lies with the code running on the master node.
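
The agent-side option in this comment (keep retrying after a rejection instead of aborting) could look roughly like the sketch below. It is an illustration only: `Connector`, `RejectedException`, and `connectWithRetry` are hypothetical names, not classes or methods from Jenkins remoting, and the delays are arbitrary. The simulated master rejects the first three handshakes, as if it still believed the old channel was alive.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReconnectSketch {
    /** Hypothetical stand-in for a rejected handshake; not a Jenkins remoting class. */
    public static class RejectedException extends Exception {
        public RejectedException(String message) { super(message); }
    }

    /** Hypothetical connection attempt; a real agent would perform its handshake here. */
    public interface Connector {
        void connect() throws RejectedException;
    }

    /**
     * Calls connect() until it succeeds or maxAttempts is reached, doubling the
     * delay after each rejection up to maxDelayMs. Returns the attempts used.
     */
    public static int connectWithRetry(Connector connector, int maxAttempts,
                                       long initialDelayMs, long maxDelayMs)
            throws RejectedException, InterruptedException {
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                connector.connect();
                return attempt;
            } catch (RejectedException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up only once the retry budget is spent
                }
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated master: rejects the first three attempts, then accepts.
        AtomicInteger calls = new AtomicInteger();
        Connector flaky = () -> {
            if (calls.incrementAndGet() <= 3) {
                throw new RejectedException("already connected to this master");
            }
        };
        int attempts = connectWithRetry(flaky, 10, 10, 80);
        System.out.println("connected after " + attempts + " attempts");
    }
}
```

With the timings in this report, a retry budget covering a little over a minute would have outlasted the period in which the master still held the stale channel.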
          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 169185 ] New: JNJira + In-Review [ 183395 ]

          Oleg Nenashev added a comment -

          Merging the issue into JENKINS-28492. It is likely caused by the agent connection hanging on the master side.
          Oleg Nenashev made changes -
          Link New: This issue duplicates JENKINS-28492 [ JENKINS-28492 ]
          Oleg Nenashev made changes -
          Resolution New: Duplicate [ 3 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          CloudBees Inc. made changes -
          Remote Link New: This issue links to "CloudBees Internal CJP-6377 (Web Link)" [ 19057 ]

            Assignee: Unassigned
            Reporter: Oliver Gondža
            Votes: 4
            Watchers: 7