Jenkins / JENKINS-37597

ECS nodes not removed from build executor list

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major
    • Component: amazon-ecs-plugin
    • Environment: Jenkins 2.18
      Plugin v1.5
      Ubuntu 14.04
      Java 8u101
      Running on an AWS EC2 c4.xlarge

      A stale list of Jenkins build agents that were created in ECS remains in the build executor list even after the job completes.

      To reproduce (a minimal Pipeline sketch follows these steps):

      1. Configure the AWS ECS Plugin
      2. Configure a new freestyle job
      3. Restrict the job to the configured ECS cluster
      4. Build the job and observe completion (pass/fail state does not matter)
      5. Immediately build the job again
      6. Observe the completion and two offline nodes in the build executor list
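      The freestyle setup isn't essential; any near-instant build pinned to the ECS label should reproduce the same pattern. A minimal Pipeline sketch, assuming the Pipeline plugin is installed and the ECS cloud template is configured with a label named 'ecs-cluster' (the label name is illustrative, not from the original report):

      // Hypothetical Jenkinsfile: a build that finishes in well under 5 seconds,
      // the window where the orphaned-node behavior is easiest to hit.
      pipeline {
          agent { label 'ecs-cluster' } // assumed ECS template label
          stages {
              stage('noop') {
                  steps {
                      echo 'done'
                  }
              }
          }
      }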


          Ruslan Vlasyuk added a comment -

          Have the same issue. ericgoedtel have you found any workaround?

          Eric Goedtel added a comment -

          Nope. I gave up, didn't use ECS, and just ran my own Docker host. Sorry. It would be great if this worked, because I would love to defer this to ECS.


          Ruslan Vlasyuk added a comment -

          Me too. But I'm using Docker Swarm plus a Docker registry, and it works like ECS. I've found a script for Jenkins that could be useful for deleting offline nodes, but that's not the right way. Waiting for this feature in the Jenkins plugin.
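          The script Ruslan mentions is not linked here, but the usual Script Console workaround looks something like the minimal sketch below. A hedged version only; it removes every offline node regardless of cause, so treat it as a blunt instrument and filter by name prefix in real use:

          import jenkins.model.Jenkins

          // Remove every node whose computer is currently offline. This does
          // not distinguish orphaned ECS agents from agents that are merely
          // disconnected for a moment.
          Jenkins.instance.nodes.findAll { it.toComputer()?.isOffline() }.each { node ->
              println "Removing offline node: ${node.nodeName}"
              Jenkins.instance.removeNode(node)
          }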

          Peter Vaassens added a comment -

          This is still a problem.

          Ruslan Vlasyuk added a comment -

          I think this issue is related to Jenkins JNLP slave functionality. I tried to use JNLP slaves with a Docker Swarm cluster and hit the same issue.

          Wade Catron added a comment -

          I'm able to reproduce this problem with plugin version 1.6, but only when build duration is less than 5 seconds or so. Longer builds result in the node being removed after build completion.

          Perhaps there exists a race condition which leads to node removal failure if a build finishes before the node it runs on is completely registered, or something along those lines.

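          If the race Wade describes is real, one plausible shape for a fix is a retention strategy that refuses to terminate a computer that has never finished connecting. A hedged sketch only; GuardedRetentionStrategy and the one-minute recheck interval are illustrative, not the plugin's actual code:

          import hudson.slaves.AbstractCloudComputer
          import hudson.slaves.CloudRetentionStrategy

          // Hypothetical guard: skip the terminate path for a computer that has
          // not connected yet, so a sub-5-second build cannot race node
          // registration against node removal.
          class GuardedRetentionStrategy extends CloudRetentionStrategy {
              GuardedRetentionStrategy(int idleMinutes) { super(idleMinutes) }

              @Override
              long check(AbstractCloudComputer c) {
                  if (c.connectTime == 0) {
                      return 1 // never connected; re-check in a minute
                  }
                  return super.check(c)
              }
          }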

          Lee Webb added a comment -

          Hmm, this is interesting.

          I updated to Jenkins 2.41 this morning, one of the 'enhancements' in that release being JNLP4 for all agents (https://issues.jenkins-ci.org/browse/JENKINS-40886).

          As soon as I updated, any agents created by the ECS plugin were left in an idle state after their builds. I occasionally saw them go into a suspended state, but then they would come back out and go idle again.

          Because all the ECS cluster resources were in use, no more containers would spawn.

          Rolling back to 2.40 immediately corrected the issue.


          Still an issue with Jenkins 2.60.3 and plugin version 1.11. 

          When launching a bunch of parallel jobs, I see the plugin start launching agents as expected. But while there are still jobs in the queue after the first jobs finish, the agents stay in the list as offline, even though the container tasks are stopped and the containers are deleted from the ECS cluster as expected. And because the plugin thinks the offline agents use all of the available ECS CPU capacity, no new agents are launched, so the rest of the jobs never run. This is a blocking bug for us; the plugin is virtually unusable, as we would need to constantly delete the offline agents by hand.


          Lukas Elsner added a comment -

          Is someone working on this? The plugin doesn't really seem production-ready. Too many open bugs? No updates? Abandoned?


          Lukas Elsner added a comment -

          Maybe a duplicate of JENKINS-40183?


          hao wang added a comment -

          I have the same issue; see the Jenkins master log.


          Nicola Worthington added a comment - edited

          I too am seeing something similar, with build executors gradually accumulating in an offline state. Jobs continue to execute okay, but this accumulation of offline executors is problematic, especially on a busy server.

          Jenkins version 2.121 (based on the official jenkins/jenkins:alpine Docker image) with plugin version 1.14, JNLP slaves using the jenkinsci/jnlp-slave Docker image.

          WARNING: jenkins.util.Timer [#3] for ci-jenkins-build-executors-42b6bb8c663f terminated
          java.nio.channels.ClosedChannelException
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
              at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314)
              at hudson.remoting.Channel.close(Channel.java:1450)
              at hudson.remoting.Channel.close(Channel.java:1403)
              at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
              at hudson.slaves.SlaveComputer.kill(SlaveComputer.java:713)
              at hudson.model.AbstractCIBase.killComputer(AbstractCIBase.java:88)
              at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:227)
              at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1551)
              at jenkins.model.Nodes$6.run(Nodes.java:261)
              at hudson.model.Queue._withLock(Queue.java:1378)
              at hudson.model.Queue.withLock(Queue.java:1255)
              at jenkins.model.Nodes.removeNode(Nodes.java:252)
              at jenkins.model.Jenkins.removeNode(Jenkins.java:2065)
              at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
              at com.cloudbees.jenkins.plugins.amazonecs.ECSSlave$1.check(ECSSlave.java:82)
              at com.cloudbees.jenkins.plugins.amazonecs.ECSSlave$1.check(ECSSlave.java:70)
              at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
              at hudson.model.Queue._withLock(Queue.java:1378)
              at hudson.model.Queue.withLock(Queue.java:1255)
              at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
              at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
              at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)

           

          WARNING: Computer.threadPoolForRemoting [#8281] for ci-jenkins-build-executors-429623dffe91 terminated
          java.nio.channels.ClosedChannelException
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
              at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314)
              at hudson.remoting.Channel.close(Channel.java:1450)
              at hudson.remoting.Channel.close(Channel.java:1403)
              at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
              at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
              at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
              at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
              at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)


          Damien Roche added a comment -

          We're seeing this issue occur when we have more than 30 agents connected; it's only affecting ECS tasks, while EC2 agents seem to spawn fine. What I've noted is that only some of the ECS agent 'types' we have fail; other ECS agent types seem to launch and connect fine with the same underlying ECS cluster.

          Here's a cleanup script that may help those with busy servers.

          import jenkins.model.Jenkins

          String agentList = ""
          String agentPrefix = "example"
          Integer agentTotal = 0
          Integer ofType = 0
          Integer ofDeleted = 0

          // Iterate over all registered nodes
          Jenkins.instance.nodes.each {
              //println "Checking agent: $it.nodeName"
              if (it.nodeName.contains(agentPrefix)) {
                  def computer = it.toComputer()
                  // offlineCause is null while a node is online, so guard the
                  // toString() call to avoid a NullPointerException
                  String cause = computer?.offlineCause?.toString()
                  //println cause
                  if (cause?.contains('Time out for last 5 try')) {
                      agentList += it.nodeName + "\n"
                      computer.doDoDelete()
                      ofDeleted += 1
                  }
                  ofType += 1
              }
              agentTotal += 1
          }

          println "Deleted agent list: \n" + agentList
          println "Total: ${agentTotal} Total type: ${ofType} Deleted: ${ofDeleted}"
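          A note on running it: scripts like this go into Manage Jenkins → Script Console on the master, and doDoDelete() removes the node immediately with no confirmation step, so it's worth a dry run with that line commented out first.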


          Damien Roche added a comment -

          So there are a couple of different issues where they don't get cleaned up automatically.

          INFO: Created Slave: cidd-9bd0fafd9be3
          INFO: Running task definition arn:aws:ecs:us-east-1:123456789012:task-definition/cidd-t2-small-generic:1 on slave cidd-9bd0fafd9be3
          INFO: Slave cidd-9bd0fafd9be3 - Slave Task Started : arn:aws:ecs:us-east-1:123456789012:task/example-omitted
          INFO: ECS Slave cidd-9bd0fafd9be3 (ecs task arn:aws:ecs:us-east-1:123456789012:task/example-omitted) connected
          WARNING: Computer.threadPoolForRemoting [#1754] for cidd-9bd0fafd9be3 terminated
          WARNING: Making cidd-9bd0fafd9be3 offline because it’s not responding
          

          Even though it was terminated, the node/agent remained in the list.

          INFO: Created Slave: cidd-9b44dbcdddde
          INFO: Running task definition arn:aws:ecs:us-east-1:123456789012:task-definition/cidd-t2-small-generic:1 on slave cidd-9b44dbcdddde
          INFO: Slave cidd-9b44dbcdddde - Slave Task Started : arn:aws:ecs:us-east-1:123456789012:task/example-omitted
          INFO: ECS Slave cidd-9b44dbcdddde (ecs task arn:aws:ecs:us-east-1:123456789012:task/example-omitted) connected
          WARNING: Making cidd-9b44dbcdddde offline temporarily due to the use of an old slave.jar
          WARNING: Computer.threadPoolForRemoting [#1753] for cidd-9b44dbcdddde terminated
          WARNING: Making cidd-9b44dbcdddde offline because it’s not responding
          [... the warning above repeats roughly 30 times ...]
          

          A similar issue: the node/agent was not removed from the list. But I believe at least this node is using a newer version of remoting than the master.
          I will downgrade it to match and try again.
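          For anyone checking the same mismatch, a short Script Console sketch along these lines should print each connected agent's remoting version next to the master's (a sketch, assuming the standard SlaveComputer.getSlaveVersion() accessor; the connected-channel filter is illustrative):

          import jenkins.model.Jenkins
          import hudson.slaves.SlaveComputer

          // Print the master's remoting version, then each connected agent's,
          // to spot the old/new slave.jar mismatch seen in the log above.
          println "Master remoting version: ${hudson.remoting.Launcher.VERSION}"
          Jenkins.instance.computers.each { c ->
              if (c instanceof SlaveComputer && c.channel != null) {
                  println "${c.name}: ${c.slaveVersion}"
              }
          }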



          Philipp Garbe added a comment -

          Is this problem solved with v1.16?


          Philipp Garbe added a comment -

          PR #63 is merged in v1.16


          Damien Roche added a comment -

          Looks to have resolved it on our side. Much appreciated.


            Assignee: Philipp Garbe (pgarbe)
            Reporter: Eric Goedtel (ericgoedtel)
            Votes: 6
            Watchers: 13
