JENKINS-32779

Spot instances hit cap when sharing an AMI

      Description

      Hi, since 1.31 we have had problems with spot slaves failing to be created because the cap is reported as exceeded.

      Our setup is:
      A base AMI for our build process
      Multiple VPCs for our different environments
      A cap on each AMI config in each VPC

      The cap seems to be counted across all configs that use the AMI, not per config, so we can have spot instances in 4 VPCs but can't launch one in the 5th VPC.

      This happens for both automatic and manual launching of the spot instance in the 5th VPC.

      Docs say: "Limit the total number of running instances launched from this AMI and associated configurations."

      This does not seem to be the case in 1.31; the behavior looks more like "Limit the total number of running instances launched from this AMI".

          Activity

          jjudd James Judd added a comment - - edited

          Also seeing this since 1.31

          2 ec2 configs in the same cloud sharing the same AMI. First config with limit 4. Second config with limit 2.

          2 instances of config 1 are launched -> cannot launch any of config 2
          Change instance limit of config 2 to 3+ -> config 2 launches

          My guess is that the limits are read per config but the total when launching an instance from a config is calculated from all instances using that AMI.

          Note: If it makes any difference, the instances above were all spot instances.
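
The guess above can be checked against the observed numbers with a small model. This is only a sketch of the hypothesis, not the plugin's code: the `Template` class, the `ami-shared` ID, and both counting functions are hypothetical names, and it assumes a launch is allowed while the counted instances stay below the config's limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical stand-in for one EC2 config (not the plugin's class)."""
    name: str
    ami: str
    cap: int


def can_launch_counting_per_ami(template, running):
    """Suspected behavior: count every running instance that shares the AMI."""
    count = sum(1 for t in running if t.ami == template.ami)
    return count < template.cap


def can_launch_counting_per_template(template, running):
    """Expected behavior: count only instances launched from this config."""
    count = sum(1 for t in running if t.name == template.name)
    return count < template.cap


config1 = Template("config-1", "ami-shared", cap=4)
config2 = Template("config-2", "ami-shared", cap=2)
running = [config1, config1]  # two instances of config 1 are up

# Counting per AMI: 2 running >= limit 2, so config 2 is blocked.
blocked = can_launch_counting_per_ami(config2, running)
# Counting per config: 0 running < limit 2, so config 2 would launch.
expected = can_launch_counting_per_template(config2, running)

# Raising config 2's limit to 3 unblocks it even when counting per AMI.
raised = Template("config-2", "ami-shared", cap=3)
unblocked = can_launch_counting_per_ami(raised, running)
```

In this model the per-AMI count reproduces both observations (blocked at limit 2, launching at limit 3), which is consistent with the guess but does not prove it.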

          gijsk Gijs Kunze added a comment -

          I have the same issue, but it seems to be similar to JENKINS-32584 because if other (unrelated, not the same AMI or instance tags or anything) spot instances are running my jenkins server doesn't start any and jobs never get processed.

          I've reverted to 1.29 so my builds work again.

          johnny_shields Johnny Shields added a comment -

          Is this the same as https://issues.jenkins-ci.org/browse/JENKINS-34667?

          francisu Francis Upton added a comment -

          Johnny Shields I don't think so.

          joshma Joshua Ma added a comment -

          We've had this issue on and off for the past few months, so it spans several versions. Currently on 1.36 and seeing this. We have 4 configs all using the same AMI.

          Caps: 6, 40, 5, 25

          Currently running: 2, 18, 5, 19 (total of 44)

          We now can't launch any more. We had 46, I terminated 2, and now I can't launch 2, so somehow it let me get to a number it doesn't like.

          I bumped our "40" cap to "45", hoping that would let me launch 1 more, but I still get "java.lang.Exception: Cloud or AMI instance cap would be exceeded for: c4.large.gittest.spot"

          I added a log recorder, and it says: "Available Total Slaves: 56 Available AMI slaves: -12 AMI: ami-99022c87 TemplateDesc: c4.large.gittest.spot"

          I don't know where -12 is coming from, but that's probably related.
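
The numbers above can be cross-checked with a short calculation. It rests on an assumption that is not confirmed anywhere in this thread: that the logged "Available AMI slaves" value is the config's cap minus the number of instances the plugin counted against it. The thread does not say which of the four configs is c4.large.gittest.spot, so the sketch covers every cap, including the bumped value of 45.

```python
caps = [6, 40, 5, 25]
running = [2, 18, 5, 19]
reported_available = -12  # "Available AMI slaves" from the log recorder

total_running = sum(running)  # instances on the shared AMI across all configs

# Assumption: available = cap - counted, hence counted = cap - available.
candidate_caps = caps + [45]  # 45 is the bumped "40" cap
implied_counted = {cap: cap - reported_available for cap in candidate_caps}

# What the log would show if the plugin counted every instance on the AMI.
available_if_ami_wide = {cap: cap - total_running for cap in candidate_caps}

for cap in candidate_caps:
    print(cap, implied_counted[cap], available_if_ami_wide[cap])
```

Under this assumption, none of the caps yields -12 with 44 instances running, so the set of instances being counted may differ from the running instances listed above. That part remains unverified.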

           

          mcating Mike Cating added a comment - - edited

          Also observed on Amazon EC2 plugin v1.38.

          Two node types, both using the same AMI in the same VPC, on-demand instances. Instance cap set at 10 per node type. After the 10th instance of one node type is launched, we get "Cloud or AMI instance cap would be exceeded for: <node type>" when trying to launch a node of the other node type, either automatically or through manual provisioning.

          Interestingly, saw the exact same behavior after cloning the AMI and changing one of the two node types to the new AMI. Wondering if the limit is applied to ALL of the EC2 instances launched, rather than just the ones launched for a given AMI.

          UPDATE: Discovered that our problem was a limit on the Global Instance cap, rather than the instance cap on the node type. Once I updated the global instance cap, the problem went away both on separate AMIs and on the same AMIs. Now thinking that this bug might be limited to ONLY spot instances, since we currently use on-demand.
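
The update above is what a two-level check would produce, since the error message names both a cloud cap and an AMI cap. A minimal sketch, assuming a launch needs room under both limits: `check_caps` is a hypothetical helper, not the plugin's API, and the global cap values of 10 and 20 are made up for illustration because the comment does not give them.

```python
def check_caps(cloud_cap, cloud_running, template_cap, template_running):
    """Return which limit blocks a new launch, or None (hypothetical helper)."""
    if cloud_running >= cloud_cap:
        return "cloud cap"
    if template_running >= template_cap:
        return "template cap"
    return None


# Ten instances of the first node type are running, none of the second.
# Assumed global cap of 10: the second node type is blocked by the cloud cap,
# even though its own cap of 10 is untouched.
before = check_caps(cloud_cap=10, cloud_running=10,
                    template_cap=10, template_running=0)

# After raising the (assumed) global cap to 20, the launch goes through.
after = check_caps(cloud_cap=20, cloud_running=10,
                   template_cap=10, template_running=0)
```

Because both limits surface through the same "Cloud or AMI instance cap would be exceeded" message, a global cap that is too low can look like a per-config problem.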

           

          arne_brys Arne Brys added a comment -

          Some time ago I created a pull request that fixes the issue for us.

          https://github.com/jenkinsci/ec2-plugin/pull/248


            People

            Assignee:
            francisu Francis Upton
            Reporter:
            jpd4nt John-Paul Drawneek
            Votes:
            4
            Watchers:
            8
