Jenkins / JENKINS-37597

ECS nodes not removed from build executor list

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major
    • Component: amazon-ecs-plugin
    • Jenkins 2.18
      Plugin v1.5
      Ubuntu 14.04
      Java 8u101
      Running on an AWS EC2 c4.xlarge

      Jenkins build agents that were created in ECS remain in the build executor list as stale, offline entries even after the job completes.

      To reproduce:

      1. Configure the AWS ECS Plugin
      2. Configure a new freestyle job
      3. Restrict the job to the configured ECS cluster
      4. Build the job and observe completion (pass/fail state does not matter)
      5. Immediately build the job again
      6. Observe the completion and two offline nodes in the build executor list

          Eric Goedtel created issue -

          Ruslan Vlasyuk added a comment - I have the same issue. ericgoedtel, have you found any workaround?

          Eric Goedtel added a comment -

          Nope. I gave up on ECS and just ran my own Docker host. Sorry. It would be great if this worked, because I would love to hand this off to ECS.

          Ruslan Vlasyuk added a comment - Me too. But I'm using Docker Swarm plus a Docker registry, and it works like ECS. I've found a script for Jenkins that could be useful for deleting offline nodes, but it's not the right way to fix this. Waiting for this feature in the Jenkins plugin.
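
For anyone looking for that kind of stopgap, here is a minimal sketch of deleting offline agents through the Jenkins remote API. This is not the script mentioned above and not part of the plugin. The server URL, the `ecs-` name prefix, and the helper names are all assumptions for illustration. It also assumes API-token authentication; older Jenkins versions may additionally require a CSRF crumb on the POST.

```python
import base64
import json
import urllib.parse
import urllib.request

# Placeholder values (assumptions, not taken from this issue).
JENKINS_URL = "https://jenkins.example.com"
AGENT_PREFIX = "ecs-"  # hypothetical naming prefix of ECS-provisioned agents


def select_stale_agents(computers, prefix):
    """Return names of agents that are offline, idle, and match the prefix."""
    return [
        c["displayName"]
        for c in computers
        if c.get("offline")
        and c.get("idle", True)
        and c["displayName"].startswith(prefix)
    ]


def _request(url, user, token, method="GET"):
    """Open a URL using HTTP basic auth (user name plus API token)."""
    req = urllib.request.Request(url, method=method)
    cred = base64.b64encode(f"{user}:{token}".encode()).decode()
    req.add_header("Authorization", f"Basic {cred}")
    return urllib.request.urlopen(req)


def list_computers(user, token):
    """Fetch the agent list from /computer/api/json."""
    with _request(f"{JENKINS_URL}/computer/api/json", user, token) as resp:
        return json.load(resp)["computer"]


def delete_agent(name, user, token):
    """Delete a single agent via POST /computer/<name>/doDelete."""
    url = f"{JENKINS_URL}/computer/{urllib.parse.quote(name)}/doDelete"
    _request(url, user, token, method="POST").close()


def cleanup(user, token, dry_run=True):
    """Delete stale agents; with dry_run=True only report what would go."""
    stale = select_stale_agents(list_computers(user, token), AGENT_PREFIX)
    for name in stale:
        if not dry_run:
            delete_agent(name, user, token)
    return stale
```

Calling `cleanup(user, token)` first with the default `dry_run=True` shows which agents would be removed before anything is deleted. As noted above, this only hides the symptom; the agents should be removed by the plugin itself.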

          Peter Vaassens added a comment - This is still a problem

          Ruslan Vlasyuk added a comment - I think this issue is related to the Jenkins JNLP slave functionality. I tried using JNLP slaves with a Docker Swarm cluster and hit the same issue.

          Wade Catron added a comment -

          I'm able to reproduce this problem with plugin version 1.6, but only when build duration is less than 5 seconds or so. Longer builds result in the node being removed after build completion.

          Perhaps there exists a race condition which leads to node removal failure if a build finishes before the node it runs on is completely registered, or something along those lines.

          Lee Webb added a comment -

          Hmm, this is interesting

          I updated to Jenkins 2.41 this morning. One of the 'enhancements' in that release is JNLP4 for all agents (https://issues.jenkins-ci.org/browse/JENKINS-40886).

          As soon as I updated, any agents created by the ECS plugin were left in an idle state after their builds. I occasionally saw them go into a suspended state, but then they would come back out and go idle again.

          Because all the ECS cluster resources were in use, no more containers would spawn.

          Rolling back to 2.40 immediately corrected the issue.

          Mika Karjalainen added a comment - Still an issue with Jenkins 2.60.3 and plugin version 1.11.

          When I launch a batch of parallel jobs, the plugin starts launching agents as expected. But while jobs are still queued after the first ones finish, the finished agents stay on the list as offline, even though their container tasks have been stopped and the containers deleted from the ECS cluster as expected. Because the plugin thinks the offline agents are using all available ECS CPU capacity, no new agents are launched, so the rest of the jobs never run. This is a blocking bug for us: the plugin is virtually unusable, since we would need to constantly delete the offline agents manually.

          Lukas Elsner added a comment -

          Is someone working on this? Plugin doesn't really seem production ready. Too many open bugs? No updates? Abandoned?

            Assignee: Philipp Garbe (pgarbe)
            Reporter: Eric Goedtel (ericgoedtel)
            Votes: 6
            Watchers: 13
