
EC2 fails to provision due to incorrect instance cap

      We have been having issues with EC2 plugin versions above 1.39, where Jenkins won't connect to nodes once they have been stopped.
      Our nodes are all set to an instance cap of 1. However, when Jenkins tries to bring a node back up, it successfully launches the instance but fails to provision it, as it thinks we have an instance cap of 0: `Cannot provision - no capacity for instances: 0`
      Changing our instance cap to 2 allows Jenkins to connect to one of the nodes, but when it tries to restart a second node we again see `Cannot provision - no capacity for instances: 0`. It seems like the instance cap is off by one.

       

      SlaveTemplate{ami='ami-0095fb81fe067be99', labels='devops'}. Attempting to provision slave needed by excess workload of 1 units
       Oct 04, 2019 3:17:02 PM INFO hudson.plugins.ec2.EC2Cloud getNewOrExistingAvailableSlave
       SlaveTemplate{ami='ami-0095fb81fe067be99', labels='devops'}. Cannot provision - no capacity for instances: 0

       

       

      We are running Jenkins 2.190.1 with Amazon EC2 plugin version 1.46.1 on an EC2 instance running Ubuntu 18.04.
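
To make the reported arithmetic concrete, here is a minimal sketch of how a capacity of 0 could arise with a cap of 1. It assumes, purely for illustration, that every non-terminated instance (including a stopped one) is counted against the cap. The function names and the counting rule are hypothetical and are not taken from the EC2 plugin's code.

```python
# Hypothetical model; names are illustrative, not the EC2 plugin's API.
# Assumption: every instance that is not terminated occupies a slot.
NON_TERMINATED = {"pending", "running", "stopping", "stopped"}


def counted_instances(states):
    """Count the instances that occupy a slot under the assumption above."""
    return sum(1 for state in states if state in NON_TERMINATED)


def remaining_capacity(instance_cap, states):
    """Slots left under the cap (never negative)."""
    return max(0, instance_cap - counted_instances(states))


# With a cap of 1, a stopped agent already fills the only slot.
print(remaining_capacity(1, ["stopped"]))     # 0
# A terminated agent no longer counts, so one slot is free.
print(remaining_capacity(1, ["terminated"]))  # 1
```

Under this assumption, restarting a stopped node is refused as "no capacity" even though nothing new would be created, while a terminated node frees its slot.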

          [JENKINS-59661] EC2 fails to provision due to incorrect instance cap

          Denis Bel added a comment -

          ryan_m We are experiencing the same issue, but in our case agents marked as Offline in Jenkins are actually still running in AWS.

          After stopping those EC2 instances manually via the AWS console, we see the capacity increase again. You could check whether the same applies to you.


          Handi Gao added a comment -

          ryan_m We are having this issue as well and noticed the same as brialius. Our current workaround is to terminate the instances after the idle timeout instead of stopping them, i.e. uncheck the option "Stop/Disconnect on Idle Timeout". I think the root cause is somehow related to JENKINS-57795, which has an open PR addressing it: https://github.com/jenkinsci/ec2-plugin/pull/399


          ryan M added a comment -

          If Jenkins has stopped the instance and then tries to connect to it, I get the instance cap error, but the EC2 instance shows as running in the EC2 console.

          I think terminating the instances instead of stopping will work for us. However, we have environment variables set for each node that need to be manually added when a new node is brought up.


          Handi Gao added a comment - - edited

          ryan_m Yeah, the problem they are having is that the EC2 instance gets restarted in AWS, but it takes some time for it to show as "Pending", and the plugin only checks the state of the instance once. If the state is neither running nor pending, the plugin stops provisioning the node. The PR adds retrying when waking up stopped instances, so it should be able to solve the issue we are having right now.

          Oct 16, 2019 5:06:19 PM WARNING hudson.plugins.ec2.EC2Cloud$1 call
          SlaveTemplate{ami='ami-02eac2c0129f6376b', labels='default master_ec2'}. Node stopped is neither pending, neither running, its {2}. Terminate provisioning
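
The difference between checking the state once and retrying can be sketched roughly as follows. This is only an illustration of the retry idea, not the code from the PR or the plugin: `wait_until_starting`, `get_state`, and the attempt and delay values are all hypothetical.

```python
import time


def wait_until_starting(get_state, attempts=5, delay_seconds=5.0, sleep=time.sleep):
    """Poll get_state() until it reports 'pending' or 'running'.

    Returns the state that was reached, or None if every attempt is used up.
    """
    for attempt in range(attempts):
        state = get_state()
        if state in ("pending", "running"):
            return state
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give AWS time to update the reported state
    return None


# Simulated instance that still reports 'stopped' for the first two polls.
observed = iter(["stopped", "stopped", "pending"])
print(wait_until_starting(lambda: next(observed), delay_seconds=0))  # pending

# A single check (attempts=1) gives up while the state is still 'stopped'.
print(wait_until_starting(lambda: "stopped", attempts=1))  # None
```

With one attempt the caller sees a state that is neither pending nor running and gives up, which matches the warning in the log above. With a few retries the same instance is picked up once AWS reports it as pending.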
          

          For the environment variables, I'm not sure whether the init script can add them for you, but it may be worth checking and could save you some effort.


            thoulen FABRIZIO MANFREDI
            ryan_m ryan M