JENKINS-54315

Instances sometimes stay disconnected


Details

    Description

      Sometimes when ec2-plugin spins up an instance it will stay disconnected and will never be used. 

      The error in the Jenkins log:

      INFO: Started provisioning js-ubuntu (ami-d7bhbda9) from ec2-mng-jenkins-slave with 1 executors. Remaining excess workload: 0
      Oct 25, 2018 8:31:18 PM hudson.slaves.NodeProvisioner$2 run
      WARNING: Unexpected exception encountered while provisioning agent js-ubuntu (ami-d7bhbda9)
      com.amazonaws.services.ec2.model.AmazonEC2Exception: The instance ID 'i-05a000703b7084e53' does not exist (Service: AmazonEC2; Status Code: 400; Error Code: InvalidInstanceID.NotFound; Request ID: 2d9f3866-19d7-482c-a4cb-057c88880e6c)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1658)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1322)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1072)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:745)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:719)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:701)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:669)
          at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:651)
          at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:515)
          at com.amazonaws.services.ec2.AmazonEC2Client.doInvoke(AmazonEC2Client.java:17597)
          at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:17566)
          at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:17555)
          at com.amazonaws.services.ec2.AmazonEC2Client.executeDescribeInstances(AmazonEC2Client.java:8680)
          at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:8651)
          at hudson.plugins.ec2.CloudHelper.getInstance(CloudHelper.java:47)
          at hudson.plugins.ec2.EC2Cloud$1.call(EC2Cloud.java:603)
          at hudson.plugins.ec2.EC2Cloud$1.call(EC2Cloud.java:586)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      

      CloudTrail on AWS shows this error code:

      {
          "eventVersion": "1.05",
          "userIdentity": {
              "type": "IAMUser",
              "principalId": "",
              "arn": "arn:aws:iam:::user/_nsof_int_jenkins",
              "accountId": "",
              "accessKeyId": "",
              "userName": "_nsof_int_jenkins"
          },
          "eventTime": "2018-10-25T17:31:16Z",
          "eventSource": "ec2.amazonaws.com",
          "eventName": "DescribeInstances",
          "awsRegion": "us-east-1",
          "sourceIPAddress": "52.70.50.139",
          "userAgent": "aws-sdk-java/1.11.403 Linux/4.4.0-1069-aws OpenJDK_64-Bit_Server_VM/25.181-b13 java/1.8.0_181 groovy/2.4.11",
          "errorCode": "Client.InvalidInstanceID.NotFound",
          "errorMessage": "The instance ID 'i-05a000703b7084e53' does not exist",
          "requestParameters": {
              "instancesSet": {
                  "items": [
                      {
                          "instanceId": "i-05a000703b7084e53"
                      }
                  ]
              },
              "filterSet": {}
          },
          "responseElements": null,
          "requestID": "2d9f3866-19d7-482c-a4cb-057c88880e6c",
          "eventID": "cb5da0a7-15f7-4360-979e-fff11effa285",
          "eventType": "AwsApiCall",
          "recipientAccountId": ""
      }
      
      

       

      After chatting with AWS support, they say this is due to the eventual consistency of their API: the DescribeInstances calls were made before the instance had spawned on AWS.

      Is there a way to add a retry in these cases?

       

       


        Activity

          thoulen FABRIZIO MANFREDI added a comment - Can you test this snapshot, which contains a fix? https://repo.jenkins-ci.org/snapshots/org/jenkins-ci/plugins/ec2/1.42-SNAPSHOT/ec2-1.42-20181106.195515-1.hpi


          gavriel_meta Gavriel Fishel added a comment - Thanks for the update!

          thoulen FABRIZIO MANFREDI added a comment (edited) -

          Yes, that part of the code is still using the old function:

          Instance inst = CloudHelper.getInstance(getInstanceId(), getCloud());

          It should be:

          Instance inst = CloudHelper.getInstanceWithRetry(getInstanceId(), getCloud());

           

          It will be fixed in 1.42 (https://github.com/jenkinsci/ec2-plugin/pull/317)
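
The body of `CloudHelper.getInstanceWithRetry` is not shown in this thread, so the following is only a minimal sketch of the general retry-on-eventual-consistency idea, not the plugin's actual code. The class `RetrySketch`, the method `callWithRetry`, and all of its parameters are hypothetical names introduced for illustration. It retries a lookup with exponential backoff while the failure looks transient, and rethrows once the attempts are exhausted.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; NOT the ec2-plugin implementation. */
final class RetrySketch {

    private RetrySketch() {}

    /**
     * Runs {@code lookup} up to {@code maxAttempts} times. A failure is retried
     * only if {@code isRetryable} accepts it; the delay doubles after each retry.
     * The last failure is rethrown when attempts run out.
     */
    static <T> T callWithRetry(Callable<T> lookup,
                               Predicate<Exception> isRetryable,
                               int maxAttempts,
                               long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return lookup.call();
            } catch (Exception e) {
                // Give up on the final attempt or on a non-transient failure.
                if (attempt >= maxAttempts || !isRetryable.test(e)) {
                    throw e;
                }
            }
            Thread.sleep(delay); // back off before asking again
            delay *= 2;
        }
    }
}
```

In the situation described in this issue, the lookup would presumably wrap the `DescribeInstances` call, and the predicate would accept only failures whose error code is `InvalidInstanceID.NotFound`, so that other errors (for example, permission problems) still fail immediately. The attempt count and delays are assumptions to tune, since the thread does not say how long the instance takes to become visible.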


          People

            thoulen FABRIZIO MANFREDI
            gavriel_meta Gavriel Fishel