EC2 plugin incorrectly reports current instance count

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • ec2-plugin
    • AWS Hosted CentOS release 6.4 (Final) x86_64
      Jenkins ver. 1.533
      Amazon EC2 plugin 1.18

    Description

      This issue has come up in the past ~48 hours or so after a dev added a theme plugin and updated plugins.

      The EC2 plugin now states that the instance cap has been reached and will not provision any further instances.

      Currently no slave instances exist (all terminated several hours ago).
      Plugin has been disabled and re-enabled.
      Cloud config removed and re-created.
      Cap has been increased to 50 (normally 10)
      All without success.

      Log output as below

      02/10/2013 9:48:03 PM hudson.plugins.ec2.EC2Cloud addProvisionedSlave
      INFO: Total instance cap of 10 reached, not provisioning.
      02/10/2013 9:48:13 PM hudson.plugins.ec2.EC2Cloud addProvisionedSlave
      INFO: Total instance cap of 50 reached, not provisioning.

        Activity

          weirded Stefan Zier added a comment - - edited

          I've observed the same issue in 1.19 of the plugin. Looking at the code (specifically, EC2Cloud.addProvisionedSlave() and EC2Cloud.countCurrentEC2Slaves()), it appears the plugin counts all instances in the EC2 account, not just the ones launched by Jenkins. In other words, if you have other instances in your EC2 account, they count, as well.

          To work around the bug, I did the following:

          • Cleared out the global "Instance Cap"
          • Set a cap per "Cloud"

          With those changes, only instances of the same AMI count against the cap. This is still a bug, though, since it is quite possible for the same AMI to be used for other things (in our case, we have an old Hudson cluster that happens to use the same AMI).

          In order to make this more dependable, I'd recommend using tags to remember which instances were launched by this particular Jenkins master. Maybe one tag (JenkinsCluster) that's the URL of the master, and a second tag (JenkinsCloud) that's the name of the cloud configuration. The tags would then be used to find the instances again.
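
The tag-based approach suggested above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the plugin's code: the `Instance` class, the `countOwnedInstances` helper and the example URLs are hypothetical, and only the tag names `JenkinsCluster` and `JenkinsCloud` come from the comment. A real implementation would obtain the instances and their tags from the EC2 API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an EC2 instance: an ID plus its tags.
class Instance {
    final String id;
    final Map<String, String> tags;

    Instance(String id, Map<String, String> tags) {
        this.id = id;
        this.tags = tags;
    }
}

class TaggedInstanceCounter {
    // Tag names proposed in the comment; the tag values used here are assumptions.
    static final String TAG_CLUSTER = "JenkinsCluster";
    static final String TAG_CLOUD = "JenkinsCloud";

    // Counts only instances whose tags name this master and this cloud,
    // instead of every instance in the account.
    static int countOwnedInstances(List<Instance> all, String masterUrl, String cloudName) {
        int count = 0;
        for (Instance i : all) {
            if (masterUrl.equals(i.tags.get(TAG_CLUSTER))
                    && cloudName.equals(i.tags.get(TAG_CLOUD))) {
                count++;
            }
        }
        return count;
    }

    // Convenience helper that builds the two ownership tags.
    static Map<String, String> tags(String cluster, String cloud) {
        Map<String, String> m = new HashMap<>();
        m.put(TAG_CLUSTER, cluster);
        m.put(TAG_CLOUD, cloud);
        return m;
    }

    public static void main(String[] args) {
        List<Instance> account = Arrays.asList(
            new Instance("i-1", tags("https://jenkins.example.com/", "ec2-builders")),
            new Instance("i-2", tags("https://hudson.example.com/", "ec2-builders")),
            new Instance("i-3", new HashMap<String, String>())); // unrelated, untagged instance
        System.out.println(
            countOwnedInstances(account, "https://jenkins.example.com/", "ec2-builders"));
    }
}
```

With a scheme like this, the old Hudson cluster's instances and any unrelated machines in the account would not count against the cap, even when they share an AMI with the Jenkins slaves.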

          dwalend David Walend added a comment - - edited

          UPDATE: it's the same problem. I found a second Instance Cap field, tucked behind (yet another) Advanced button. Changing that from 1 to 2 let Jenkins start a machine.

          I've got similar symptoms, but not exactly the same. Everything was fine up until about a month ago.

          I didn't originally set a cap, then tried setting the cap to 5 (only the Jenkins server and one other machine are running in this account). No machine starts when I start a new build (the new job is just queued). I get the log I've pasted below. I upgraded everything. No luck.

          Any thoughts?

          Thanks,

          Dave

          May 13, 2014 3:20:51 AM INFO hudson.plugins.ec2.EC2Cloud provision

          Excess workload after pending Spot instances: 1

          May 13, 2014 3:20:51 AM INFO hudson.plugins.ec2.EC2Cloud addProvisionedSlave

          AMI Instance cap of 1 reached for ami ami-e318378a, not provisioning.

          May 13, 2014 3:20:51 AM INFO hudson.plugins.ec2.EC2Cloud provision

          Excess workload after pending Spot instances: 1

          May 13, 2014 3:20:51 AM INFO hudson.plugins.ec2.EC2Cloud addProvisionedSlave

          AMI Instance cap of 1 reached for ami ami-e318378a, not provisioning.
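
The two kinds of log message in this thread correspond to two separate limits: a total cap and a per-AMI cap set on the template (the second Instance Cap field found under Advanced). The sketch below illustrates how such a two-level check might behave. It is not the plugin's code: `CapCheck`, `mayProvision` and its parameters are hypothetical names, and the counts passed in are illustrative.

```java
class CapCheck {
    // Returns null when provisioning is allowed, otherwise the reason it is refused.
    // totalCount and amiCount are the currently counted instances (assumed inputs).
    static String mayProvision(int totalCount, int totalCap,
                               int amiCount, int amiCap, String ami) {
        if (totalCount >= totalCap) {
            return "Total instance cap of " + totalCap + " reached, not provisioning.";
        }
        if (amiCount >= amiCap) {
            return "AMI Instance cap of " + amiCap + " reached for ami " + ami
                    + ", not provisioning.";
        }
        return null;
    }

    public static void main(String[] args) {
        // Template cap of 1 with one instance already counted: refused.
        System.out.println(mayProvision(2, 5, 1, 1, "ami-e318378a"));
        // Raising the template cap to 2 lets the request through (prints null).
        System.out.println(mayProvision(2, 5, 1, 2, "ami-e318378a"));
    }
}
```

Under this model, raising only the total cap has no effect while the per-AMI cap is still exhausted, which matches the symptom described above.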

          diranged Matt Wise added a comment -

          This issue still exists in the 1.21 code. It looks like it also trickles down to the individual server-type definitions, where the "instance cap count" is generated by looking for the # of instances running a particular AMI image.

          Instead, the EC2 plugin should simply tag the machines with identifiers, and be able to search for them that way. It should also be able to recover lists of live running EC2 slaves by searching for these nodes with these tags.

          kevcheng8 kevin cheng added a comment -

          I have two issues.

          1) Changing the instance cap has no effect. The Jenkins log continues to show the following:

          "INFO: Total instance cap of 5 reached, not provisioning.
          Sep 16, 2014 4:40:10 AM hudson.plugins.ec2.EC2Cloud provision
          INFO: Excess workload after pending Spot instances: 3"

          I learned from Matt Wise that the count might be related to the AMI. Since I have 20 instances running the current AMI, I created a new AMI and changed the configuration to use it instead. However, it still does not launch a new EC2 instance when jobs are in the queue.


          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/dd2d1e63d423e0cfb4ee0a9ca38bfdc2c0c223eb
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Improvement, not a fix. This code checks if the tags match the configured tags

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          src/main/java/hudson/plugins/ec2/SlaveTemplate.java
          http://jenkins-ci.org/commit/ec2-plugin/e81094cc897ddeb456d20bc8c3cc03fafbfa970c
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Now using the specific 'ec2slave' tag

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/ace707c4a3498384d8d6c382a84200152f17cb1e
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Now using the specific 'ec2slave' tag

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/f2a97d5204df32f177e18d5ac9e84324c876215d
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Now using the specific 'ec2slave' tag

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/58fbb7f3a492f4f04c7a32881e6c9946241524fb
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Now using the specific 'ec2slave' tag

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          src/main/java/hudson/plugins/ec2/SlaveTemplate.java
          http://jenkins-ci.org/commit/ec2-plugin/deffa182c4d64fcaf187010814b86df40d61a6a2
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Now using the specific 'ec2slave' tag

          Code changed in jenkins
          User: Roland Groen
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          src/main/java/hudson/plugins/ec2/EC2Tag.java
          src/main/java/hudson/plugins/ec2/SlaveTemplate.java
          http://jenkins-ci.org/commit/ec2-plugin/5286a643db63d6199f1a99e2015046e0948c7867
          Log:
          JENKINS-19845, EC2 plugin incorrectly reports current instance count

          • Changed tag name to jenkins_slave_type
          • Made the tag name a static value
          • Cleaned up indentation
          • Removed eclipse import format changes

          Code changed in jenkins
          User: Francis Upton
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          src/main/java/hudson/plugins/ec2/EC2Tag.java
          src/main/java/hudson/plugins/ec2/SlaveTemplate.java
          http://jenkins-ci.org/commit/ec2-plugin/2b28c7acb1d45e558c0ad8b306c960ed91d8a43d
          Log:
          Merge pull request #107 from EdiaEducationTechnology/master

          JENKINS-19845

          Compare: https://github.com/jenkinsci/ec2-plugin/compare/b99f8191e990...2b28c7acb1d4

          martinfr62 martinfr62 added a comment -

          This fix does not resolve the following use cases:

          1. Multiple Jenkins servers running in the same VPC using the same AMI. This is needed when one Jenkins server cannot manage that many jobs and/or each team wants its own Jenkins server.

          2. Multiple labels using the same AMI
          2.1 - Different instance sizes for different build jobs/labels (e.g. 5 T2Medium and 4 Large instances of the same AMI)
          2.2 - Different subnets used for different labels; there are all sorts of reasons to use multiple subnets (access control being just one of them)
          2.3 - Different instance counts for different labels/builds, so costs can be managed differently (e.g. 2 for nightly builds, 5 for release and 10 for CI). This allows a bunch of CI builds to not block a release build, for instance, or a long-running nightly build to not interfere with a CI build.

          Proposed solution: extend the current fix, which uses a tag to indicate that this is a Jenkins instance, so that the tag indicates a Jenkins instance for a specific label. Add the labels to the EC2Tag.TAG_NAME_JENKINS_SLAVE_TYPE tag when the instance is started, and match the requested label in isEc2ProvisionedSlave.

          Solves both 1 and 2.
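
The proposal above amounts to scoping the ownership tag to a label, so that each label gets its own count. A rough sketch under stated assumptions: the tag name `jenkins_slave_type` is taken from the commit log earlier in this thread, while the `type_label` value format, the `LabelScopedTag` class and the simplified `isEc2ProvisionedSlave` signature are hypothetical and do not reflect the plugin's real implementation.

```java
import java.util.Collections;
import java.util.Map;

class LabelScopedTag {
    // Tag name from the commit log; the value format below is an assumption.
    static final String TAG_NAME_JENKINS_SLAVE_TYPE = "jenkins_slave_type";

    // Builds a tag value that records both the slave type and the label.
    static String tagValue(String slaveType, String label) {
        return slaveType + "_" + label;
    }

    // True only if the instance was started for this slave type and label.
    static boolean isEc2ProvisionedSlave(Map<String, String> instanceTags,
                                         String slaveType, String label) {
        return tagValue(slaveType, label)
                .equals(instanceTags.get(TAG_NAME_JENKINS_SLAVE_TYPE));
    }

    public static void main(String[] args) {
        Map<String, String> tags = Collections.singletonMap(
                TAG_NAME_JENKINS_SLAVE_TYPE, tagValue("demand", "nightly"));
        System.out.println(isEc2ProvisionedSlave(tags, "demand", "nightly"));
        System.out.println(isEc2ProvisionedSlave(tags, "demand", "release"));
    }
}
```

With a value scoped like this, instances started for a "nightly" label would not count against the cap of a "release" label, even when both use the same AMI.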

          francisu Francis Upton added a comment -

          The tag has been extended with the description of the slave template. This should resolve the problems. This was done prior to 1.30.


          People

            francisu Francis Upton
            simonbeckett Simon Beckett
            Votes: 4
            Watchers: 12

            Dates

              Created:
              Updated:
              Resolved: