
EC2 slave launch stops working after a while with AmazonServiceException "Request has expired"

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Component: ec2-plugin
    • Labels: None
    • Environment:
      * EC2 plugin version 1.26.
      * Jenkins 1.580.2 running inside the official Jenkins Docker LTS image.
      * Host O/S: Ubuntu 14.04 LTS 64-bit on an EC2 master.
      * EC2 rights are conferred via an EC2 InstanceProfile.

      After Jenkins first starts it is able to launch EC2 slaves, both manually and when jobs indicate they need to use the slave label.

      A few hours later (not sure how long, maybe 24 hours?) slaves no longer start, manually or automatically. In "Manage Jenkins -> System Log -> All Jenkins Logs" the following error occurs repeatedly. Restarting Jenkins solves the problem.

      Started EC2 alive slaves monitor
      Feb 09, 2015 5:14:47 AM INFO hudson.model.AsyncPeriodicWork$1 run
      Finished EC2 alive slaves monitor. 0 ms
      Feb 09, 2015 5:15:51 AM INFO hudson.plugins.ec2.EC2Cloud provision
      Excess workload after pending Spot instances: 1
      Feb 09, 2015 5:15:53 AM WARNING hudson.plugins.ec2.EC2Cloud provision
      Failed to count the # of live instances on EC2
      com.amazonaws.AmazonServiceException: Request has expired. (Service: AmazonEC2; Status Code: 400; Error Code: RequestExpired; Request ID: 59f7935f-15f0-455c-a6f1-f6057f5ffc77)
      	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:886)
      	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:484)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:256)
      	at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:8798)
      	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:4137)
      	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:8087)
      	at hudson.plugins.ec2.EC2Cloud.countCurrentEC2Slaves(EC2Cloud.java:228)
      	at hudson.plugins.ec2.EC2Cloud.addProvisionedSlave(EC2Cloud.java:299)
      	at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:389)
      	at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:281)
      	at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:51)
      	at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:368)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)

      I also don't understand the log statement "Excess workload after pending Spot instances: 1", as I have not ticked the "Use Spot instance" tick box.

      In my cloud settings I have ticked the "Use EC2 instance profile to obtain credentials" option and have set both the access key and secret key values to "THIS VALUE IS NOT USED - THE INSTANCE PROFILE IS USED INSTEAD".

          [JENKINS-26854] EC2 slave launch stops working after a while with AmazonServiceException "Request has expired"

          James Judd added a comment - - edited

          Thanks for the info Martin.

          I spent some more time looking into this tonight and I think I found the cause. Even better, I think the fix is quite simple. At the moment, in EC2Cloud, we create an AmazonEC2Client like so:

          AmazonEC2 client = new AmazonEC2Client(credentialsProvider.getCredentials(), config);
          

          According to the Amazon SDK source this creates a StaticCredentialsProvider using the given credentials. From what I can tell, StaticCredentialsProvider never refreshes its credentials, leading to expiration.

          Instead, you can create an AmazonEC2Client with a credentials provider directly. This should, as far as I can tell, refresh the credentials as needed.

          AmazonEC2 client = new AmazonEC2Client(credentialsProvider, config);
          

          This is further supported by the Amazon documentation, which states:

          Important

          The automatic credentials refresh happens only when you use the default client constructor, which creates its own InstanceProfileCredentialsProvider as part of the default provider chain, or when you pass an InstanceProfileCredentialsProvider instance directly to the client constructor. If you use another method to obtain or pass instance profile credentials, you are responsible for checking for and refreshing expired credentials.

          [emphasis mine]
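
          The difference between the two constructor calls above can be illustrated with a toy model. This is only a sketch under stated assumptions: Token, RefreshingProvider, and Client are hypothetical stand-ins invented for illustration, not AWS SDK types, and the simulated clock replaces real credential expiry. It shows why a client that captures credentials once starts failing after the expiry time, while a client that holds the provider keeps working.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class CredentialRefreshSketch {

    /** Hypothetical stand-in for temporary credentials with an expiry time. */
    public static final class Token {
        final String value;
        final long expiresAt;

        Token(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }

        boolean isValidAt(long now) {
            return now < expiresAt;
        }
    }

    /** Hypothetical provider: issues a new token once the cached one has expired. */
    public static final class RefreshingProvider implements Supplier<Token> {
        private final AtomicLong clock;
        private final long lifetime;
        private Token cached;
        private int issued = 0;

        public RefreshingProvider(AtomicLong clock, long lifetime) {
            this.clock = clock;
            this.lifetime = lifetime;
        }

        @Override
        public synchronized Token get() {
            long now = clock.get();
            if (cached == null || !cached.isValidAt(now)) {
                issued++;
                cached = new Token("token-" + issued, now + lifetime);
            }
            return cached;
        }
    }

    /** Hypothetical client: asks its credential source for a token on every request. */
    public static final class Client {
        private final Supplier<Token> credentials;
        private final AtomicLong clock;

        public Client(Supplier<Token> credentials, AtomicLong clock) {
            this.credentials = credentials;
            this.clock = clock;
        }

        /** Returns true if the request would be accepted (token still valid). */
        public boolean request() {
            return credentials.get().isValidAt(clock.get());
        }
    }

    /** Analogous to passing getCredentials(): the token is captured once and never refreshed. */
    public static Client withSnapshot(RefreshingProvider provider, AtomicLong clock) {
        Token snapshot = provider.get();
        return new Client(() -> snapshot, clock);
    }

    /** Analogous to passing the provider itself: every request can pick up a fresh token. */
    public static Client withProvider(RefreshingProvider provider, AtomicLong clock) {
        return new Client(provider, clock);
    }

    public static void main(String[] args) {
        AtomicLong clock = new AtomicLong(0);
        RefreshingProvider provider = new RefreshingProvider(clock, 100);
        Client snapshotClient = withSnapshot(provider, clock);
        Client providerClient = withProvider(provider, clock);

        System.out.println(snapshotClient.request()); // true: token not yet expired
        System.out.println(providerClient.request()); // true
        clock.set(150);                               // move past the token lifetime
        System.out.println(snapshotClient.request()); // false: stale snapshot
        System.out.println(providerClient.request()); // true: provider issued a new token
    }
}
```

          In this model the snapshot client behaves the way the bug report describes: it works after startup and fails once the first token's lifetime has passed, and only constructing a new client (the equivalent of restarting Jenkins) would recover it.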

          I just uploaded a version of the plugin with this change to our Jenkins server. I'll let it run and report back tomorrow if I see any errors. If it works, I'll create a pull request.

          James Judd added a comment -

          It's been almost 24 hours and I have not had any expirations. Created PR #148

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: James Judd
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/2855d1d925dcfe92043ec6ce0b58111c116eb330
          Log:
          Creates an AmazonEC2Client with an AWSCredentialsProvider instead of the
          AWSCredentials directly. This is done so the credentials will refresh
          instead of expire. Resolves JENKINS-26854.

          At the moment, in EC2Cloud we create an AmazonEC2Client by passing in
          the credentials directly. This creates a StaticCredentialsProvider using
          the given credentials. StaticCredentialsProvider never refreshes its
          credentials, leading to expiration. Instead, you can create an
          AmazonEC2Client with a credentials provider directly. This refreshes the
          credentials as needed.

          See http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-roles.html
          for more information.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Francis Upton
          Path:
          src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/1fa2ee1126e007d874cb40d5dee25a031746a635
          Log:
          Merge pull request #148 from jjudd/request-expired

          JENKINS-26854: Fixing 'RequestExpired'

          Compare: https://github.com/jenkinsci/ec2-plugin/compare/b576bb3163db...1fa2ee1126e0

          Ximon Eighteen added a comment - - edited

          I see that there is a 1.28 release tag which includes fixes for this issue. However, I don't see a 1.28 release on the Jenkins update site or on the ec2 plugin home page. Also, this issue is still marked as 'Open'. Has the 1.28 build been tested by anyone following this issue? When will this 1.28 version be released?

          Additional: Can I test using this HPI? https://buildhive.cloudbees.com/job/jenkinsci/job/ec2-plugin/115/org.jenkins-ci.plugins$ec2/artifact/org.jenkins-ci.plugins/ec2/1.28-SNAPSHOT/ec2-1.28-SNAPSHOT.hpi

          Francis Upton added a comment -

          @Ximon, I had intended to do the release and I think something went wrong. I will try to fix this today.

          James Judd added a comment -

          @francisu, just curious when 1.28 will be released.

          Francis Upton added a comment -

          I think the release actually worked this time, so by tomorrow it should be present on the wiki page and available.

          Vincent Rivellino added a comment -

          I updated to 1.28 yesterday and have disabled my auto-restart job.

          After 19 hours of uptime, no expired credentials. I think things are copacetic.

          Francis Upton added a comment -

          Fixed in 1.28.

            Assignee: Francis Upton (francisu)
            Reporter: Ximon Eighteen (ximon18)
            Votes: 5
            Watchers: 10