JENKINS-33287

Jnlp slave agent die after timeout detected from slave side

    Details

      Description

      When a JNLP slave detects a ping timeout, it tries to reconnect. But if the master has not noticed the timeout yet, it rejects the new connection from the slave. The JNLP slave agent process aborts once the connection is rejected in this way.

      STDOUT of JNLP process:

      INFO: Ping failed. Terminating the channel.
      java.util.concurrent.TimeoutException: Ping started on 1456918109582 hasn't completed at 1456918349582
      	at hudson.remoting.PingThread.ping(PingThread.java:125)
      	at hudson.remoting.PingThread.run(PingThread.java:86)
      
      Mar 02, 2016 6:32:29 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
      SEVERE: I/O error in channel channel
      java.net.SocketException: Socket closed
      	at java.net.SocketInputStream.read(SocketInputStream.java:190)
      	at java.net.SocketInputStream.read(SocketInputStream.java:122)
      	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
      	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
      	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
      	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
      	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
      	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      
      Mar 02, 2016 6:32:29 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Mar 02, 2016 6:32:39 AM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect
      INFO: Restarting slave via jenkins.slaves.restarter.UnixSlaveRestarter@6523ff4a
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: dev127-virt2
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins.acme.com/hudson/, http://hudson.acme.com/hudson/]
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins.acme.com:37003
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection.
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:306)
      	at hudson.remoting.Engine.run(Engine.java:276)
      

      Slave log on master:

      JNLP agent connected from /10.16.180.145
      <===[JENKINS REMOTING CAPACITY]===>ERROR: Connection terminated
      Connection terminated
      ha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@131dbee3[name=dev127-virt2]
      	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Broken pipe
      	at sun.nio.ch.FileDispatcher.write0(Native Method)
      	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
      	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
      	at sun.nio.ch.IOUtil.write(IOUtil.java:40)
      	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
      	at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.send(FifoBuffer.java:130)
      	at org.jenkinsci.remoting.nio.FifoBuffer.send(FifoBuffer.java:254)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:622)
      	... 7 more
      Slave.jar version: 2.47
      This is a Unix slave
      Slave successfully connected and online
      Connection terminated
      

      Master log:

      2016-03-02 06:32:38,352 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Clock Difference started on Wed Mar 02 06:32:08 EST 2016) Failed to monitor dev127-virt2 for Clock Difference
      java.util.concurrent.TimeoutException
        at hudson.remoting.Request$1.get(Request.java:271)
        at hudson.remoting.Request$1.get(Request.java:206)
        at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
        at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:97)
        at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280)
      
      ... (All monitors time out)
      
      2016-03-02 06:32:42,860 INFO  [hudson.TcpSlaveAgentListener] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) Accepted connection #41773 from /10.16.180.145:58248
      2016-03-02 06:32:42,865 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: dev127-virt2 is already connected to this master. Rejecting this connection.
      2016-03-02 06:32:42,866 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: Unrecognized name: dev127-virt2
      
      ...
      
      2016-03-02 06:33:43,630 INFO  [hudson.slaves.ChannelPinger] (Ping thread for channel hudson.remoting.Channel@5caca20e:dev127-virt2) Ping failed. Terminating the channel.
      java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630
        at hudson.remoting.PingThread.ping(PingThread.java:125)
        at hudson.remoting.PingThread.run(PingThread.java:86)
      
      ...
      
      2016-03-02 06:38:26,902 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Free Temp Space started on Wed Mar 02 06:38:26 EST 2016) Failed to monitor dev127-virt2 for Free Temp Space
      hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:549)
        at hudson.remoting.Request.callAsync(Request.java:204)
        at hudson.remoting.Channel.callAsync(Channel.java:778)
        at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
        at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280)
      Caused by: java.io.IOException
        at hudson.remoting.Channel.close(Channel.java:1105)
        at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
        at hudson.remoting.PingThread.ping(PingThread.java:125)
        at hudson.remoting.PingThread.run(PingThread.java:86)
      Caused by: java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630
        ... 2 more
      
      ... (All monitors fail with "channel is already closed")
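
The timestamps in the logs above are enough to work out how long the master kept treating the dead channel as live. The sketch below only does arithmetic on values copied from this report. The class and method names are hypothetical, and the fixed UTC-05:00 offset is an assumption based on the "EST" shown in the master log.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;

final class PingTimeline {
    // Epoch milliseconds copied from "Ping started on ... hasn't completed at ..." in the agent STDOUT.
    static final long AGENT_PING_START = 1456918109582L;
    static final long AGENT_PING_DEADLINE = 1456918349582L;

    // The same message as logged by hudson.slaves.ChannelPinger on the master.
    static final long MASTER_PING_START = 1456918183629L;
    static final long MASTER_PING_DEADLINE = 1456918423630L;

    // Master log timestamp of the rejected handshake (assumes EST means UTC-05:00).
    static final Instant REJECTED_HANDSHAKE =
            OffsetDateTime.parse("2016-03-02T06:32:42.865-05:00").toInstant();

    /** How long the agent waited for its ping before giving up. */
    static Duration agentPingWindow() {
        return Duration.ofMillis(AGENT_PING_DEADLINE - AGENT_PING_START);
    }

    /** How long the master waited for its ping before giving up. */
    static Duration masterPingWindow() {
        return Duration.ofMillis(MASTER_PING_DEADLINE - MASTER_PING_START);
    }

    /** Time between the agent noticing the timeout and the master noticing it. */
    static Duration detectionGap() {
        return Duration.ofMillis(MASTER_PING_DEADLINE - AGENT_PING_DEADLINE);
    }

    /** Time between the rejected reconnect and the master closing the old channel. */
    static Duration rejectionToMasterTimeout() {
        return Duration.between(REJECTED_HANDSHAKE, Instant.ofEpochMilli(MASTER_PING_DEADLINE));
    }

    public static void main(String[] args) {
        System.out.println("agent ping window:        " + agentPingWindow());
        System.out.println("master ping window:       " + masterPingWindow());
        System.out.println("detection gap:            " + detectionGap());
        System.out.println("rejection to master stop: " + rejectionToMasterTimeout());
    }
}
```

Both sides waited about four minutes for their ping, but the master's ping started roughly 74 seconds later than the agent's. The single reconnect attempt at 06:32:42 therefore fell inside the window in which the master still considered dev127-virt2 connected.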
      

            Activity

            olivergondza Oliver Gondža created issue -
            olivergondza Oliver Gondža made changes -
            Summary: "Jnlp slave agent die after timeout from slave side" → "Jnlp slave agent die after timeout detected from slave side"
            rcgroot René de Groot added a comment -

            If rejecting the connection is the correct behavior of the master, then the agent should keep retrying so the system as a whole can recover.

            If the rejection is unwarranted because the old connection was dead and the master is too slow to realize this, then the issue lies with the code running on the master node.
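
An agent that keeps retrying after a rejected handshake, as the comment suggests, could look roughly like this. It is a minimal sketch and not the Jenkins remoting `Engine` code: `ReconnectSketch`, the `Handshake` enum, and every parameter are hypothetical names chosen for illustration, and the capped exponential backoff is one possible policy among several.

```java
import java.util.function.Supplier;

final class ReconnectSketch {
    /** Hypothetical outcome of one handshake attempt against the master. */
    enum Handshake { ACCEPTED, REJECTED }

    /**
     * Calls {@code attempt} until the master accepts the connection or the
     * attempt budget is used up, sleeping between attempts with a capped
     * exponential backoff.
     *
     * @return the 1-based number of the successful attempt, or -1 if every
     *         attempt was rejected or the thread was interrupted
     */
    static int connectWithRetry(Supplier<Handshake> attempt,
                                int maxAttempts,
                                long initialDelayMillis,
                                long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.get() == Handshake.ACCEPTED) {
                return i;
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and stop retrying.
                    Thread.currentThread().interrupt();
                    return -1;
                }
                // Double the wait, but never beyond the configured cap.
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
        return -1;
    }
}
```

In the logs of this report the master closed the stale channel about a minute after it rejected the reconnect (06:32:42 versus 06:33:43), so a retry budget that spans a few minutes would have covered this incident.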

            rtyler R. Tyler Croy made changes -
            Workflow: JNJira [ 169185 ] → JNJira + In-Review [ 183395 ]
            oleg_nenashev Oleg Nenashev added a comment -

            Merging this issue into JENKINS-28492. It is likely caused by the agent connection hanging on the master side.

            oleg_nenashev Oleg Nenashev made changes -
            Link: This issue duplicates JENKINS-28492
            oleg_nenashev Oleg Nenashev made changes -
            Resolution: Duplicate [ 3 ]
            Status: Open [ 1 ] → Resolved [ 5 ]
            cloudbees CloudBees Inc. made changes -
            Remote Link This issue links to "CloudBees Internal CJP-6377 (Web Link)" [ 19057 ]

              People

              Assignee: Unassigned
              Reporter: olivergondza Oliver Gondža
              Votes: 4
              Watchers: 7
