
EC2 slave launch stops working after a while with AmazonServiceException "Request has expired"

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Component: ec2-plugin
    • Labels: None
    • Environment:
      * EC2 plugin version 1.26.
      * Jenkins 1.580.2 running inside the official Jenkins Docker LTS image.
      * Host O/S: Ubuntu 14.04 LTS 64-bit on an EC2 master.
      * EC2 rights are conferred via an EC2 InstanceProfile.

      After Jenkins first starts it is able to launch EC2 slaves, both manually and when jobs indicate they need to use the slave label.

      A few hours later (not sure how long, maybe 24 hours?) slaves no longer start, manually or automatically. In "Manage Jenkins -> System Log -> All Jenkins Logs" the following error occurs repeatedly. Restarting Jenkins solves the problem.

      Started EC2 alive slaves monitor
      Feb 09, 2015 5:14:47 AM INFO hudson.model.AsyncPeriodicWork$1 run
      Finished EC2 alive slaves monitor. 0 ms
      Feb 09, 2015 5:15:51 AM INFO hudson.plugins.ec2.EC2Cloud provision
      Excess workload after pending Spot instances: 1
      Feb 09, 2015 5:15:53 AM WARNING hudson.plugins.ec2.EC2Cloud provision
      Failed to count the # of live instances on EC2
      com.amazonaws.AmazonServiceException: Request has expired. (Service: AmazonEC2; Status Code: 400; Error Code: RequestExpired; Request ID: 59f7935f-15f0-455c-a6f1-f6057f5ffc77)
      	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:886)
      	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:484)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:256)
      	at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:8798)
      	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:4137)
      	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:8087)
      	at hudson.plugins.ec2.EC2Cloud.countCurrentEC2Slaves(EC2Cloud.java:228)
      	at hudson.plugins.ec2.EC2Cloud.addProvisionedSlave(EC2Cloud.java:299)
      	at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:389)
      	at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:281)
      	at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:51)
      	at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:368)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)

      I also don't understand the log statement "Excess workload after pending Spot instances: 1", as I have not ticked the "Use Spot instance" tick box.

      In my cloud settings I have ticked the "Use EC2 instance profile to obtain credentials" and have set both the access key and secret key values to "THIS VALUE IS NOT USED - THE INSTANCE PROFILE IS USED INSTEAD".

          [JENKINS-26854] EC2 slave launch stops working after a while with AmazonServiceException "Request has expired"

          Ximon Eighteen created issue -

          Ximon Eighteen added a comment -

          I've been looking at the underlying AWS SDK code. It appears to have built-in support for refreshing the credentials before they expire, and the Jenkins EC2 plugin already appears to use it. The only explanations I can think of are that NTP isn't working properly and clock drift causes the problem, or that the synchronous credential refresh mode used by the Jenkins EC2 plugin doesn't work for some reason and the asynchronous background-thread mode needs to be used instead. I will investigate NTP on my side.
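To make the suspected failure mode concrete, here is a minimal, language-neutral sketch in Python of the difference between a provider that caches credentials forever and one that refreshes them synchronously on read. It is not the AWS SDK's code: the class names, the fake metadata service, the 6-hour credential lifetime and the 15-minute refresh margin are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Credentials:
    access_key: str
    expires_at: float  # seconds on a simulated clock


class FakeMetadataService:
    """Hypothetical stand-in for the instance metadata endpoint."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime  # assumed validity period of issued credentials
        self.now = 0.0            # simulated clock, advanced by the caller
        self.calls = 0

    def fetch(self) -> Credentials:
        self.calls += 1
        return Credentials(f"KEY-{self.calls}", self.now + self.lifetime)


class CachingProvider:
    """Fetches once and never checks the expiry again."""

    def __init__(self, fetch: Callable[[], Credentials]):
        self._fetch = fetch
        self._cached: Optional[Credentials] = None

    def get(self, now: float) -> Credentials:
        if self._cached is None:
            self._cached = self._fetch()
        return self._cached


class RefreshingProvider(CachingProvider):
    """Re-fetches on read when the cached credentials are close to expiry."""

    def __init__(self, fetch: Callable[[], Credentials], margin: float = 900.0):
        super().__init__(fetch)
        self._margin = margin  # assumed refresh margin in seconds

    def get(self, now: float) -> Credentials:
        if self._cached is None or self._cached.expires_at - now <= self._margin:
            self._cached = self._fetch()
        return self._cached


def simulate(provider_cls, hours=(0, 1, 18)) -> Tuple[List[bool], int]:
    """Return (was the credential expired at each read?, number of fetches)."""
    service = FakeMetadataService(lifetime=6 * 3600.0)
    provider = provider_cls(service.fetch)
    expired = []
    for hour in hours:
        service.now = hour * 3600.0
        creds = provider.get(service.now)
        expired.append(service.now >= creds.expires_at)
    return expired, service.calls


if __name__ == "__main__":
    print("caching:   ", simulate(CachingProvider))
    print("refreshing:", simulate(RefreshingProvider))
```

Under these assumptions the caching provider hands out an expired credential on the 18-hour read, while the refreshing provider performs a second fetch and stays valid, which matches the symptom of requests failing only after Jenkins has been up for a while.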

          Ximon Eighteen added a comment - - edited

          My investigation into NTP hasn't found any problems:

          ubuntu@ip-172-30-0-149:~$ ntpq -pn
               remote           refid      st t when poll reach   delay   offset  jitter
          ==============================================================================
          -87.232.1.40     62.231.32.35     4 u 1037 1024  377    4.120   -1.394   2.634
          +78.143.174.10   193.1.219.116    2 u  382 1024  377   71.766   -7.342  48.358
          -86.43.77.42     193.120.10.3     2 u  624 1024  337   40.723    8.783   0.787
          +85.91.1.180     195.66.241.2     2 u   94 1024  377    1.812   -3.254   0.835
          *91.189.89.199   192.93.2.20      2 u  741 1024  377   10.945   -1.322   0.847
          

          The * marks the NTP server currently in use. A reach value of 377 (octal) means the last eight polls of that server all succeeded, so the NTP daemon was repeatedly able to contact it. The stratum numbers are low, and the offset and jitter values (in milliseconds) are small, all of which is good. Executing 'date' on the Ubuntu host and inside the Docker container yields the same date and time.
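The reading of the ntpq table above can be checked mechanically. The following sketch parses the pasted output, decodes the octal reach register into a count of successful polls, and picks out the selected peer. The function names and the 100 ms health threshold are hypothetical, chosen only for illustration.

```python
NTPQ_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-87.232.1.40     62.231.32.35     4 u 1037 1024  377    4.120   -1.394   2.634
+78.143.174.10   193.1.219.116    2 u  382 1024  377   71.766   -7.342  48.358
-86.43.77.42     193.120.10.3     2 u  624 1024  337   40.723    8.783   0.787
+85.91.1.180     195.66.241.2     2 u   94 1024  377    1.812   -3.254   0.835
*91.189.89.199   192.93.2.20      2 u  741 1024  377   10.945   -1.322   0.847
"""


def parse_ntpq(text):
    """Parse `ntpq -pn` peer lines into dictionaries."""
    peers = []
    for line in text.splitlines():
        fields = line[1:].split()  # column 0 holds the tally character
        if len(fields) != 10 or fields[0] == "remote":
            continue  # skip the header, the separator and blank lines
        peers.append({
            "tally": line[0],
            "remote": fields[0],
            "stratum": int(fields[2]),
            # reach is an 8-bit shift register printed in octal;
            # each set bit is one successful poll out of the last eight
            "polls_ok": bin(int(fields[6], 8)).count("1"),
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


def selected_peer(peers):
    """Return the peer marked '*', i.e. the current time source."""
    return next(p for p in peers if p["tally"] == "*")


def clock_looks_healthy(peers, max_offset_ms=100.0):
    """Hypothetical check: selected peer fully reachable and offset small."""
    peer = selected_peer(peers)
    return peer["polls_ok"] == 8 and abs(peer["offset_ms"]) <= max_offset_ms


if __name__ == "__main__":
    peers = parse_ntpq(NTPQ_OUTPUT)
    for p in peers:
        print(p["tally"], p["remote"], p["polls_ok"], p["offset_ms"])
    print("healthy:", clock_looks_healthy(peers))
```

Note that one peer reports reach 337 rather than 377, which decodes to seven successful polls out of the last eight. The selected peer is fully reachable and every offset is below 10 ms.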

          Unrelated to the NTP investigation, but related to my point about restarting Jenkins in the initial post, if I invoke http://<jenkins>/safeRestart then the slave is started correctly after Jenkins restarts, without my fixing any clocks.


          Ximon Eighteen added a comment - - edited

          I have no reason to think this is the cause of my problem, but I just noticed that the plugin is built against v1.8.3 of the AWS Java SDK while the latest version is 1.9.17. In theory at least, a newer version could contain a relevant bug fix. I looked through the release notes of the intervening versions but didn't find an obvious fix related to this issue.

          Update: Actually, release 1.8.10 of the Java SDK added the InstanceProfileCredentialsProvider(true) behaviour I refer to above, but this was not directly mentioned in the release notes. It was not a happy release: two hot fixes, 1.8.10.1 and 1.8.10.2, followed within a week.

          Update: If I build the EC2 plugin with the latest 1.9.17 SDK version, it fails with an HTTP 401 Auth error which I haven't tracked down yet. I suspect this is because my AWS IAM InstanceProfile role does not include a permission which newer versions of the SDK require, but I haven't determined which permission is missing yet.


          Ximon Eighteen added a comment - - edited

          I'm going to put some logging into a subclass of InstanceProfileCredentialsProvider because I suspect that, for some reason, this class is not refreshing the credentials. I will let you know what I find out.

          Update: Indeed the EC2 credential refresh functionality of the AWS Java SDK is not invoked. See attached jenkins.log (look for lines containing "Ximon:") and gitdiff.txt. Rough highlights from the log (might not be entirely accurate, trying to remember what I did last night):

          1. 11:33:00 UTC: Jenkins finished responding to a /safeRestart request that I performed.
          2. 11:34:36 UTC: I instructed Jenkins to launch a new EC2 slave. The EC2 plugin fetched the EC2 credentials as part of launching the instance.
          3. 11:45:46 UTC: I instructed Jenkins to terminate the EC2 slave. The EC2 plugin did not refetch the credentials.
          4. 11:46:09 UTC: I instructed Jenkins to launch a new EC2 slave. The EC2 plugin did not refetch the credentials.
          5. 12:22:46 UTC: The EC2 plugin correctly stopped the EC2 slave instance after the idle timeout expired. The EC2 plugin did not refetch the credentials.
          6. 04:32:58 UTC: UNRELATED BUG: Jenkins logged "Making <NODE NAME> (i-a2a41545) offline because it’s not responding". Why is this logged over 4 hours after the EC2 plugin stopped the slave?
          7. 05:30:22 UTC: I refreshed the Jenkins log web page which seems to have caused the EC2 plugin to attempt to update its knowledge about the state of the slave. By this point the EC2 credentials have expired, but the EC2 plugin did not refetch the credentials. The call to the EC2 API failed with HTTP 400 AmazonServiceException "Request has expired".
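The intervals in the timeline above can be worked out as follows. This is a small arithmetic sketch that assumes the 04:32 and 05:30 entries fall on the day after the 11:33 restart (the log excerpt gives no dates); the helper name is hypothetical.

```python
from datetime import datetime, timedelta


def elapsed(start_hms: str, end_hms: str, end_next_day: bool = False) -> timedelta:
    """Time between two HH:MM:SS stamps, optionally crossing midnight once."""
    fmt = "%H:%M:%S"
    start = datetime.strptime(start_hms, fmt)
    end = datetime.strptime(end_hms, fmt)
    if end_next_day:
        end += timedelta(days=1)  # assumption: the end stamp is on the following day
    return end - start


if __name__ == "__main__":
    # Credentials fetched at 11:34:36, last successful API use at 12:22:46,
    # first failed call at 05:30:22 (assumed to be the next day).
    print("fetch -> last success:", elapsed("11:34:36", "12:22:46"))
    print("fetch -> failure:     ", elapsed("11:34:36", "05:30:22", end_next_day=True))
    print("last success -> fail: ", elapsed("12:22:46", "05:30:22", end_next_day=True))
```

Under that assumption the credentials still worked about 48 minutes after they were fetched and were rejected just under 18 hours after the fetch, so the expiry falls somewhere in the roughly 17-hour window during which the plugin made no successful EC2 call.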


          Ximon Eighteen added a comment - - edited

          I tried building the EC2 plugin with AWS Java SDK 1.8.11, the first stable release after 1.8.10 (the release that introduced the new InstanceProfileCredentialsProvider(true) functionality). This fixes the missing refetch of credentials: the SDK now checks them once a minute. However, it fails with the Auth error I referred to above:

          com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0ad2caa1-4f05-41e7-b168-3dc37940265b)
          	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1032)
          	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:687)
          	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
          	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:292)
          	at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:9225)
          	at com.amazonaws.services.ec2.AmazonEC2Client.describeKeyPairs(AmazonEC2Client.java:6321)
          	at com.amazonaws.services.ec2.AmazonEC2Client.describeKeyPairs(AmazonEC2Client.java:8879)
          	at hudson.plugins.ec2.EC2PrivateKey.find(EC2PrivateKey.java:135)
          	at hudson.plugins.ec2.SlaveTemplate.getKeyPair(SlaveTemplate.java:719)
          	at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:303)
          	at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:287)
          	at hudson.plugins.ec2.EC2Cloud.doProvision(EC2Cloud.java:283)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:606)
          	at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
          	at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
          	at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
          	at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:121)
          	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
          	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:745)
          	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:875)
          	at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:249)
          	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
          	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:745)
          	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:875)
          	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:648)
          	at org.kohsuke.stapler.Stapler.service(Stapler.java:237)
          	at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
          	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
          	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:96)
          	at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:88)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          	at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:48)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
          	at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:86)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
          	at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
          	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          	at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
          	at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          	at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          	at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
          	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
          	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
          	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
          	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
          	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
          	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
          	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
          	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
          	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
          	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
          	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
          	at org.eclipse.jetty.server.Server.handle(Server.java:370)
          	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
          	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:960)
          	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1021)
          	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
          	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
          	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
          	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
          	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
          	at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          

          I suspect this has something to do with my IAM configuration rather than with the EC2 plugin or the Java SDK. So I suspect that upgrading the SDK to 1.8.11 and invoking the InstanceProfileCredentialsProvider(true) constructor will solve this issue for other people.
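The once-a-minute check described in this comment can be pictured as a background thread that periodically inspects the cached credentials and refetches them when they are close to expiry. The sketch below illustrates that general pattern in Python; it is not the AWS SDK's implementation, and the class name, the 15-minute margin and the tuple representation of credentials are assumptions.

```python
import threading
import time
from typing import Callable, Optional, Tuple

Creds = Tuple[str, float]  # (access key, expiry as seconds since the epoch)


class BackgroundRefresher:
    """Keeps credentials fresh from a daemon thread instead of on each read."""

    def __init__(self, fetch: Callable[[], Creds],
                 margin_s: float = 900.0, interval_s: float = 60.0):
        self._fetch = fetch
        self._margin_s = margin_s      # refresh when this close to expiry
        self._interval_s = interval_s  # how often the background check runs
        self._lock = threading.Lock()
        self._creds: Optional[Creds] = None
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def check_once(self, now: float) -> bool:
        """Refresh if nothing is cached or expiry is near; return True if refreshed."""
        with self._lock:
            if self._creds is not None and self._creds[1] - now > self._margin_s:
                return False
        fresh = self._fetch()  # fetch outside the lock so readers are not blocked
        with self._lock:
            self._creds = fresh
        return True

    def current(self) -> Optional[Creds]:
        """Return whatever is cached; never blocks on a network fetch."""
        with self._lock:
            return self._creds

    def start(self) -> None:
        def loop() -> None:
            while not self._stop.is_set():
                self.check_once(time.time())
                self._stop.wait(self._interval_s)  # sleep, but wake early on stop()

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

With this design, callers read `current()` without ever triggering a fetch themselves, so an idle period with no API calls does not prevent the credentials from being renewed. `check_once` takes the clock value as a parameter so the refresh rule can be exercised without waiting for real time to pass.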

          Ximon Eighteen added a comment - - edited

          I tried building the EC2 plugin with AWS Java SDK 1.8.11, the first stable release after 1.8.10 that introduced the new InstanceProfileCredentialsProvider(true) functionality. This solves the missing refetch of credentials, causing the SDK to check them once a minute. However, it fails with the Auth error I referred to above:

          com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0ad2caa1-4f05-41e7-b168-3dc37940265b)
              at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1032)
              at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:687)
              at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
              at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:292)
              at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:9225)
              at com.amazonaws.services.ec2.AmazonEC2Client.describeKeyPairs(AmazonEC2Client.java:6321)
              at com.amazonaws.services.ec2.AmazonEC2Client.describeKeyPairs(AmazonEC2Client.java:8879)
              at hudson.plugins.ec2.EC2PrivateKey.find(EC2PrivateKey.java:135)
              at hudson.plugins.ec2.SlaveTemplate.getKeyPair(SlaveTemplate.java:719)
              at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:303)
              at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:287)
              at hudson.plugins.ec2.EC2Cloud.doProvision(EC2Cloud.java:283)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
              at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
              at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
              at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:121)
              at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
              at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:745)
              at org.kohsuke.stapler.Stapler.invoke(Stapler.java:875)
              at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:249)
              at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
              at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:745)
              at org.kohsuke.stapler.Stapler.invoke(Stapler.java:875)
              at org.kohsuke.stapler.Stapler.invoke(Stapler.java:648)
              at org.kohsuke.stapler.Stapler.service(Stapler.java:237)
              at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
              at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
              at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:96)
              at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:88)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
              at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:48)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
              at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:86)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
              at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
              at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
              at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
              at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
              at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
              at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
              at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
              at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
              at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
              at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
              at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
              at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
              at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
              at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
              at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
              at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
              at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
              at org.eclipse.jetty.server.Server.handle(Server.java:370)
              at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
              at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:960)
              at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1021)
              at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
              at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
              at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
              at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
              at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
              at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)

          I suspect this is something to do with my IAM configuration, and not with the EC2 plugin or the Java SDK. So I suspect that upgrading the SDK to 1.8.11 and invoking the InstanceProfileCredentialsProvider(true) constructor will solve this issue for other people.
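
          To illustrate why refetching matters, here is a rough, standard-library-only model of the behaviour discussed in this comment. It is not the AWS SDK's code and does not call the SDK. Every class name, the six-hour credential lifetime, and the 15-minute refresh margin are assumptions made up for the illustration. The real InstanceProfileCredentialsProvider(true) is described above as refreshing in the background about once a minute, whereas this sketch simply refreshes on access.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Function;

class RefreshingCredentialsSketch {

    /** Hypothetical stand-in for temporary instance-profile credentials. */
    static final class TempCredentials {
        final String token;
        final Instant expiresAt;

        TempCredentials(String token, Instant expiresAt) {
            this.token = token;
            this.expiresAt = expiresAt;
        }

        boolean isExpiredAt(Instant now) {
            return !now.isBefore(expiresAt);
        }
    }

    /** Fetches once and caches forever: models a client that never refetches. */
    static final class FetchOnceProvider {
        private final Function<Instant, TempCredentials> source;
        private TempCredentials cached;

        FetchOnceProvider(Function<Instant, TempCredentials> source) {
            this.source = source;
        }

        synchronized TempCredentials get(Instant now) {
            if (cached == null) {
                cached = source.apply(now);
            }
            return cached;
        }
    }

    /** Refetches whenever the cached credentials are within a margin of expiry. */
    static final class RefreshingProvider {
        private final Function<Instant, TempCredentials> source;
        private final Duration refreshMargin;
        private TempCredentials cached;

        RefreshingProvider(Function<Instant, TempCredentials> source, Duration refreshMargin) {
            this.source = source;
            this.refreshMargin = refreshMargin;
        }

        synchronized TempCredentials get(Instant now) {
            if (cached == null || !now.plus(refreshMargin).isBefore(cached.expiresAt)) {
                cached = source.apply(now);
            }
            return cached;
        }
    }

    /** Hypothetical credential source: each fetch yields a token valid for six hours (assumed). */
    static Function<Instant, TempCredentials> sixHourSource() {
        return now -> new TempCredentials("token-issued-at-" + now, now.plus(Duration.ofHours(6)));
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2015-02-09T00:00:00Z");
        Instant later = start.plus(Duration.ofHours(24));

        FetchOnceProvider once = new FetchOnceProvider(sixHourSource());
        RefreshingProvider refreshing = new RefreshingProvider(sixHourSource(), Duration.ofMinutes(15));

        // Both providers fetch their first credentials when Jenkins "starts".
        once.get(start);
        refreshing.get(start);

        // A day later only the refreshing provider still hands out usable credentials.
        System.out.println("fetch-once expired after 24h: " + once.get(later).isExpiredAt(later));
        System.out.println("refreshing expired after 24h: " + refreshing.get(later).isExpiredAt(later));
    }
}
```

          In this model the fetch-once provider works right after startup and then keeps handing out the same stale credentials, which would match the "works until Jenkins is restarted" symptom in this report.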

          Ximon Eighteen added a comment - I have created a minimal pull request. See: https://github.com/jenkinsci/ec2-plugin/pull/131

          Ximon Eighteen added a comment - - edited

          Ah, the AuthFailure may affect others too. See: https://forums.aws.amazon.com/thread.jspa?messageID=574914&tstart=0. I'm seeing this issue in the eu-west-1 region.

          Update: Forcing the previous signer through the client configuration solved this issue for me. This is the relevant part of the forum thread that I referred to:

          ClientConfiguration clientConfiguration = new ClientConfiguration();
          clientConfiguration.setSignerOverride("QueryStringSignerType");
          AmazonEC2 ec2 = new AmazonEC2Client(clientConfiguration);
          

            francisu Francis Upton
            ximon18 Ximon Eighteen