
EC2 plugin: Plugin fails to automatically launch slaves to meet demand

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • ec2-plugin
    • None
    • Jenkins v2.12 / Ubuntu 14.04.4 x86_64 / Oracle Java 1.8.0_91-b14

      Hi,

      We've recently started seeing a problem auto-launching EC2 build slaves. The plugin detects that it needs to launch a slave to meet demand and that no current instances are running, but it doesn't launch anything.

      Manual provisioning works.

      Log (some info sanitised):

      Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.EC2Cloud provision
      INFO: Attempting to provision slave from template hudson.plugins.ec2.SlaveTemplate@6a756eca needed by excess workload of 1 units of label 'null'
      Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.EC2Cloud provision
      WARNING: Label is null - can't calculate how many executors slave will have. Using 2 number of executors
      Considering launching ami-6c14310f for template Remote Ubuntu 14.04 Docker Machine
      Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.SlaveTemplate logProvisionInfo
      INFO: Considering launching ami-6c14310f for template Remote Ubuntu 14.04 Docker Machine
      Setting Instance Initiated Shutdown Behavior : ShutdownBehavior.Terminate
      Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.SlaveTemplate logProvisionInfo
      INFO: Setting Instance Initiated Shutdown Behavior : ShutdownBehavior.Terminate
      Looking for existing instances with describe-instance: {InstanceIds: [],Filters: [{Name: image-id,Values: [ami-6c14310f]}, {Name: subnet-id,Values: [subnet-XXXXX]}, {Name: instance.group-id,Values: [sg-XXXXX]}, {Name: key-name,Values: [XXXXX]}, {Name: instance-type,Values: [t2.small]}, {Name: tag:Name,Values: [Jenkins Build Slave]}],}
      No existing instance found - but cannot create new instance
      Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.EC2Cloud provision
      INFO: Attempting provision - finished, excess workload: 1
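
To spot this failure in a longer log, a small scanner can help. The following is only a sketch: the two message strings are copied from the log above, while the helper name `find_failed_provisions` and the rule it applies (a "cannot create new instance" message followed by a "finished" line with excess workload above zero) are assumptions for illustration, not part of the plugin.

```python
import re

# Message strings copied from the log in this report; the helper is hypothetical.
CANNOT_CREATE = "No existing instance found - but cannot create new instance"
FINISHED = re.compile(r"Attempting provision - finished, excess workload: (\d+)")


def find_failed_provisions(log_text):
    """Return the excess workload of each provisioning attempt that launched nothing.

    An attempt counts as failed when the "cannot create new instance" message
    is followed by a "finished" line whose excess workload is still above zero.
    """
    failures = []
    blocked = False
    for line in log_text.splitlines():
        if CANNOT_CREATE in line:
            blocked = True
            continue
        match = FINISHED.search(line)
        if match:
            excess = int(match.group(1))
            if blocked and excess > 0:
                failures.append(excess)
            # Each "finished" line closes one provisioning attempt.
            blocked = False
    return failures


sample = """\
No existing instance found - but cannot create new instance
Jul 08, 2016 3:47:34 PM hudson.plugins.ec2.EC2Cloud provision
INFO: Attempting provision - finished, excess workload: 1
"""
print(find_failed_provisions(sample))
```

Run against the Jenkins log file, a non-empty result would indicate attempts where demand was detected but nothing was launched.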
      

          [JENKINS-36516] EC2 plugin: Plugin fails to automatically launch slaves to meet demand

          Vlad Korolev added a comment -

          More information: this problem seems to be very intermittent. We had a bunch of builds today and all instances launched correctly.

          Steve Talbot added a comment -

          Plugin v1.36 will launch a slave and successfully run a build for us. Plugin v1.37 launches the EC2 instance, but fails to recognise that it can use it to build.

          si Joey added a comment -

          I am also having similar issues with v1.37 of the plugin. Builds run fine the first time, but after the agent times out from inactivity, no new builds will start. It will work again if I change something in the plugin's profile. Moreover, the instance remains running in the AWS console, although it's supposed to terminate after 30 minutes of inactivity.

          I see the following errors in the log:

          Nov 27, 2017 9:43:58 PM hudson.plugins.ec2.EC2Cloud provision
          WARNING: Exception during provisioning
          com.amazonaws.services.ec2.model.AmazonEC2Exception: Request has expired. (Service: AmazonEC2; Status Code: 400; Error Code: RequestExpired; Request ID: 309aa7e7-89e0-42c4-9df5-5b4985222ba5)

          Oleksandr Tereshchuk added a comment -

          I had this issue after upgrading Jenkins from LTS to Weekly.

          But it was "accidentally" fixed by:

           - Increasing the instance cap from 1 to 2, then changing it back to 1

           - Checking the AWS console: there was a "zombie" instance that had been created by Jenkins but was no longer visible on the Nodes page
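
A "zombie" instance like the one described above may explain why raising the cap helped. A minimal sketch, assuming the instance cap is compared against every matching instance still present in EC2 rather than against the nodes Jenkins shows (this is an assumption, not confirmed plugin behaviour; the function names and instance IDs are hypothetical):

```python
def find_zombie_instances(ec2_instance_ids, jenkins_node_instance_ids):
    """Instances that exist in EC2 but have no matching node in Jenkins."""
    return sorted(set(ec2_instance_ids) - set(jenkins_node_instance_ids))


def can_launch(ec2_instance_ids, instance_cap):
    """Assumption: every instance still present in EC2 counts toward the cap."""
    return len(set(ec2_instance_ids)) < instance_cap


# Hypothetical data: one instance left over in EC2, none visible on the Nodes page.
ec2_ids = ["i-0aaa111"]
jenkins_ids = []

print(find_zombie_instances(ec2_ids, jenkins_ids))  # ['i-0aaa111']
print(can_launch(ec2_ids, instance_cap=1))          # False
print(can_launch(ec2_ids, instance_cap=2))          # True
```

Under that assumption, a cap of 1 is already used up by the zombie, so nothing new is launched until the cap is raised or the leftover instance is terminated.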

          si Joey added a comment -

          I was able to correct my issue with the plugin by checking the box for "Use EC2 instance profile to obtain credentials." With that box checked, the EC2 agent was able to get new credentials to launch.

          Stefan Verhoeff added a comment -

          We experienced the same issue: the "com.amazonaws.services.ec2.model.AmazonEC2Exception: Request has expired" error started appearing randomly after the EC2 plugin had been launching new nodes fine for a while.

          Updating the server configuration would solve the issue, but only temporarily.

          After upgrading to Jenkins LTS 2.89.3 with EC2 plugin 1.38, the issue no longer happens. I'm not 100% sure the bug is really fixed, though.

          Francis Upton added a comment -

          Please try 1.39 of the plugin, particularly if you are upgrading from 1.36 or earlier.

          Ben Copeland added a comment -

          Still an issue in 1.42

          Raihaan Shouhell added a comment -

          bithead is this still an issue in 1.45?

          ovi craciun added a comment - - edited

          Still an issue with 1.48.
          I can see this in the log, but the spot instance mentioned is never spawned, and there is no other error:

          ```
          Spot instance id in provision: sir-px3sj14h
          SlaveTemplate{ami='ami-17de70f54e3bad981', labels='windows'}. Attempting provision finished, excess workload: 0
          We have now 5 computers, waiting for 1 more
          ```

          and then
          ```
          Started EC2 alive slaves monitor
          EC2 alive slaves monitor. 112 ms
          ```

          and
          ```
          Available capacity=1, currentDemand=1
          Provisioning completedFinished
          ```

          However, capacity is actually 0, because there is already a build in progress and one in the waiting queue.

          Thus the build queue keeps that one waiting build, which should be pushed to a provisioned slave, but the slave is never provisioned.

          Later: even when there's no build in progress, I see this message
          `Available capacity=1, currentDemand=1`
          and a new slave is not provisioned when there's a build waiting in the build queue.

          How can I get around this mess?
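
The arithmetic behind the complaint above can be sketched as follows. This assumes that a new slave is only requested for demand beyond the reported available capacity; the function name `excess_workload` is hypothetical and this is not the plugin's or Jenkins' actual code, only an illustration using the numbers from the log message.

```python
def excess_workload(current_demand, available_capacity):
    """Demand that is not covered by the capacity the provisioner believes it has."""
    return max(0, current_demand - available_capacity)


# Values from the log message: capacity is reported as 1, so nothing is requested.
print(excess_workload(current_demand=1, available_capacity=1))

# If capacity were reported as 0 (the only executor is busy), one slave would be needed.
print(excess_workload(current_demand=1, available_capacity=0))
```

If the reported capacity counts an executor that is busy or no longer usable, the excess stays at 0 and the queued build keeps waiting, which matches the behaviour described in the comment.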

            francisu Francis Upton
            torpy Chin Godawita
            Votes: 10
            Watchers: 25