
JENKINS-49021: Stopped but not suspended Azure VM Agent is not restarted

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: _unsorted
    • Environment: Jenkins ver. 2.89.2 with Azure VM Agents Plugin ver. 0.6.0

      We're making use of the Azure VM Agents Plugin to create and maintain agents based on a custom .vhd file.
      The key info for that custom image is:

      • Windows OS
      • JNLP connection to Jenkins Master
      • "Shutdown Only (Do Not Delete) After Retention Time" option is enabled

      Such an agent node can have two flags shown in the Jenkins sidebar: "offline" and "suspended". As long as everything runs as expected, we never manually start/stop the agent VMs.
      However, the stop (i.e. deallocate) command triggered by the Azure VM Agents Plugin once the retention time is up sometimes causes the JNLP connection to be closed before Jenkins marks the node as "suspended". When a job that is supposed to run on that agent is triggered, the node (which is shown as "offline", but not "suspended") is never started by the plugin. The job then waits indefinitely until it is cancelled or the agent is started manually (e.g. via the Azure Portal or CLI).

      This doesn't happen every time. Sometimes the node is marked as "suspended" before the JNLP connection is closed, and the agent is then started the next time it is required – i.e. as expected.

          [JENKINS-49021] Stopped but not suspended Azure VM Agent is not restarted

          Chenyang Liu added a comment -

          Does this issue happen only in the new version (0.6.0)?


          The "Shutdown Only (Do Not Delete) After Retention Time" option didn't deallocate the VMs prior to version 0.6.0 (but only triggered an OS shutdown).
          Since that didn't help in avoiding costs, I never used the option with an older version of the plugin.

          Carsten Wickner added a comment - The "Shutdown Only (Do Not Delete) After Retention Time" option didn't deallocate the VMs prior to version 0.6.0 (but only triggered an OS shutdown). Since that didn't help in avoiding costs, I never used the option with an older version of the plugin.

          Jakub Michalec added a comment -

          As we add more Azure VMs, this issue is starting to become a blocker for our CI.

          Jakub Michalec added a comment -

          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-27ba90, action: shutdown
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy check
          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-d971b0, action: shutdown
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy check
          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-dccd40, action: shutdown
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-27ba90
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-27ba90 is always shut down
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-dccd40
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-d971b0
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-dccd40 is always shut down
          Mar 19, 2018 8:43:09 AM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-d971b0 is always shut down


          but all VMs keep running; I need to shut them down manually.



          Chenyang Liu added a comment -

          carstenenglert We are trying to fix this issue, but I have tried many times and it always works well for me. So please provide more details. Does the issue happen more frequently when there are more nodes? Will the issue happen when using SSH?


          Carsten Wickner added a comment -

          Hi zackliu,
          great that you're looking into this.
          Unfortunately I don't have many more details to add here.
          In our setup we actually have only two different kinds of VMs, and only one of them encounters this issue, i.e. it may be related to the VM's shutdown speed.
          Both VMs are almost identical; the one occasionally having this issue just has a smaller application deployed to its application server, so it shuts down faster.

          We don't have the capacity right now to set up something via SSH, sorry.
          I was hoping the cause could be deduced from the code that decides whether the "suspended" flag gets added or not,
          or from the logic that determines when the plugin attempts to start the VM and when it just keeps waiting.

          kkkuba, what is your setup? And maybe you have more helpful answers to these questions:

          Does the issue happen more frequently when there are more nodes?
          Will the issue happen when using SSH?


          Jakub Michalec added a comment - edited

          zackliu, carstenenglert, we're using JNLP (standard mode) and Jenkins runs as a service on each VM. Jenkins 2.110, the current latest version of the Azure plugin; the VMs are F4S and B2S with private IPs only (connected to our internal VPN).

          Before the full rollout to Azure VMs there was only one Azure VM, and I think it was working as it should, so maybe scale is creating the issue. Also, each VM label type has its own separate cloud configuration (to limit the number of VMs per label),

          e.g. 282d50 has the test label; the rest have the build label.

          Main scenario:

          Tests run; after the idle time the VMs shut down. Each day at 2:15 AM there's a Jenkins master restart, and after that this issue is really painful.


          VM jenkins-282d50 should be online at 4:00 AM (first job start), but it waits until I manually start it. The rest of the VMs should be offline and suspended, but they are only offline; at this moment there's no job queue, but I think they would not start by themselves.

          On the node:

          Ping response time is too long or timed out.


          Node logs:

          Remoting version: 3.17
          This is a Windows agent
          Agent successfully connected and online
          ERROR: Connection terminated
          java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:313)
          at hudson.remoting.Channel.close(Channel.java:1446)
          at hudson.remoting.Channel.close(Channel.java:1399)
          at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
          at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
          at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)


          ERROR: Connection terminated
          java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)


          Logs from another node (created yesterday; it ran one pipeline and was suspended/taken offline successfully) – after the Jenkins restart it stays offline:

          JNLP agent connected from 10.216.0.11/10.216.0.11
          Remoting version: 3.13
          This is a Windows agent
          Agent successfully connected and online
          ERROR: Connection terminated
          java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:313)
          at hudson.remoting.Channel.close(Channel.java:1446)
          at hudson.remoting.Channel.close(Channel.java:1399)
          at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
          at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
          at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
          
          
          JNLP agent connected from 10.216.0.11/10.216.0.11
          Remoting version: 3.13
          This is a Windows agent
          ERROR: Failed to install restarter
          ERROR: Connection terminated
          java.nio.channels.ClosedChannelException
          Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 10.216.0.11/10.216.0.11:50373
          at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1737)
          at hudson.remoting.Request.call(Request.java:197)
          at hudson.remoting.Channel.call(Channel.java:951)
          at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.install(JnlpSlaveRestarterInstaller.java:53)
          at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.access$000(JnlpSlaveRestarterInstaller.java:34)
          at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:40)
          at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:37)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          Caused: hudson.remoting.RequestAbortedException
          at hudson.remoting.Request.abort(Request.java:335)
          at hudson.remoting.Channel.terminate(Channel.java:1034)
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:172)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
          java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
          
          


          It looks like some race condition when the VM is idle.


          To sum up: most of the pain comes after a Jenkins restart; then most of the time some of the VMs are useless and need a manual start, or in the worst case even recreation (I had to do that a couple of times).


          edit:


          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-282d50, action: shutdown
          Mar 20, 2018 1:41:26 PM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-282d50
          Mar 20, 2018 1:41:26 PM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-282d50 is always shut down
          Mar 20, 2018 1:41:57 PM INFO hudson.model.AsyncPeriodicWork$1 run
          Started Azure VM Maintainer Pool Size
          Mar 20, 2018 1:41:57 PM INFO hudson.model.AsyncPeriodicWork$1 run
          Finished Azure VM Maintainer Pool Size. 0 ms
          Mar 20, 2018 1:42:26 PM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy check
          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-282d50, action: shutdown
          Mar 20, 2018 1:42:26 PM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-282d50
          Mar 20, 2018 1:42:26 PM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-282d50 is always shut down


          The plugin can't shut down this VM anymore; the agent stays online.


          Chenyang Liu added a comment -

          From my tests today, suspended or not is not the key point; we don't check whether the node is suspended when trying to reuse the VM. We shut down the VM and set EligibleForReuse=true; we also restart the VM and set EligibleForReuse=false. I think there are some multi-threading problems between these two processes: EligibleForReuse falls into an inconsistent state, which causes some of these problems.

          I will add some synchronized logic and keep testing.

          The logs show there's really a mess after restarting the Jenkins master; I will keep investigating.


          Jakub Michalec added a comment - edited

          Some more logs from the agent going idle.

          For Jenkins this agent is still online, and apparently from the plugin's perspective as well, because now the logs are being 'spammed' with:


          AzureVMCloudRetensionStrategy: check: Idle timeout reached for agent: jenkins-d971b0, action: shutdown
          Mar 20, 2018 2:44:26 PM INFO com.microsoft.azure.vmagent.AzureVMCloudRetensionStrategy$1 call
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-d971b0
          Mar 20, 2018 2:44:26 PM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: agent jenkins-d971b0 is always shut down
          AzureVMCloudRetensionStrategy: going to idleTimeout agent: jenkins-d971b0
          Mar 20, 2018 2:37:26 PM INFO com.microsoft.azure.vmagent.AzureVMAgent shutdown
          AzureVMAgent: shutdown: shutting down agent jenkins-d971b0
          Mar 20, 2018 2:37:26 PM INFO com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate shutdownVirtualMachine
          AzureVMManagementServiceDelegate: shutdownVirtualMachine: called for jenkins-d971b0
          Mar 20, 2018 2:37:26 PM INFO com.microsoft.aad.adal4j.AuthenticationAuthority doInstanceDiscovery
          [Correlation ID: b0d9875f-777f-49e8-b380-624ba2207baa] Instance discovery was successful
          Mar 20, 2018 2:37:26 PM WARNING jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
          Computer.threadPoolForRemoting [#550] for jenkins-d971b0 terminated
          java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
          	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
          	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
          	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:313)
          	at hudson.remoting.Channel.close(Channel.java:1446)
          	at hudson.remoting.Channel.close(Channel.java:1399)
          	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
          	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
          	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          	at java.util.concurrent.FutureTask.run(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          
          
          JNLP agent connected from 10.216.0.5/10.216.0.5
          Remoting version: 3.13
          This is a Windows agent
          ERROR: Connection terminated
          java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          ERROR: Failed to install restarter
          java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from 10.216.0.5/10.216.0.5:50943 failed. The channel is closing down or has closed down
          	at hudson.remoting.Channel.call(Channel.java:945)
          	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.install(JnlpSlaveRestarterInstaller.java:53)
          	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.access$000(JnlpSlaveRestarterInstaller.java:34)
          	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:40)
          	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:37)
          	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          	at java.util.concurrent.FutureTask.run(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          ERROR: java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from 10.216.0.5/10.216.0.5:50943 failed. The channel is closing down or has closed down
          	at hudson.remoting.Channel.call(Channel.java:945)
          	at hudson.FilePath.act(FilePath.java:1093)
          	at org.jenkinsci.plugins.envinject.EnvInjectComputerListener.getNewSlaveEnvironmentVariables(EnvInjectComputerListener.java:102)
          	at org.jenkinsci.plugins.envinject.EnvInjectComputerListener.onOnline(EnvInjectComputerListener.java:157)
          	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:620)
          	at jenkins.slaves.DefaultJnlpSlaveReceiver.afterChannel(DefaultJnlpSlaveReceiver.java:168)
          	at org.jenkinsci.remoting.engine.JnlpConnectionState$4.invoke(JnlpConnectionState.java:421)
          	at org.jenkinsci.remoting.engine.JnlpConnectionState.fire(JnlpConnectionState.java:312)
          	at org.jenkinsci.remoting.engine.JnlpConnectionState.fireAfterChannel(JnlpConnectionState.java:418)
          	at org.jenkinsci.remoting.engine.JnlpProtocol4Handler$Handler$1.run(JnlpProtocol4Handler.java:334)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          Failed to update jenkins-slave.exe
          java.io.IOException: remote file operation failed: C:\ws\jenkins-slave.exe at hudson.remoting.Channel@576cf8b8:JNLP4-connect connection from 10.216.0.5/10.216.0.5:50943: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from 10.216.0.5/10.216.0.5:50943 failed. The channel is closing down or has closed down
          	at hudson.FilePath.act(FilePath.java:1005)
          	at hudson.FilePath.act(FilePath.java:987)
          	at hudson.FilePath.exists(FilePath.java:1473)
          	at org.jenkinsci.modules.windows_slave_installer.SlaveExeUpdater$1.call(SlaveExeUpdater.java:69)
          	at org.jenkinsci.modules.windows_slave_installer.SlaveExeUpdater$1.call(SlaveExeUpdater.java:59)
          	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          	at java.util.concurrent.FutureTask.run(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          Caused by: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from 10.216.0.5/10.216.0.5:50943 failed. The channel is closing down or has closed down
          	at hudson.remoting.Channel.call(Channel.java:945)
          	at hudson.FilePath.act(FilePath.java:998)
          	... 9 more
          Caused by: java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
          	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	... 3 more


          Chenyang Liu added a comment -

          The issue should be resolved in version 0.7.1.


            Assignee: Chenyang Liu (zackliu)
            Reporter: Carsten Wickner (carstenenglert)
            Votes: 0
            Watchers: 3

              Created:
              Updated:
              Resolved: