
new ec2 slave is terminated or terminating during launch

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • Component/s: ec2-plugin
    • Labels: None
    • Environment: Jenkins 2.8
      EC2-Plugin 1.33

      This happens when I start a new job and the plugin tries to create a new slave. However, the slaves never come online; instead, the plugin tries to create a new slave over and over. I have tried setting Launch Timeout in seconds to 0 (infinite), but it doesn't work.

      Jun 10, 2016 9:13:37 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:13:42 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:13:47 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:13:52 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:13:57 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:02 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:07 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:12 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:17 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:23 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:28 AM null
      FINEST: Node (i-5727dc86)(i-5727dc86) is still pending/launching, waiting 5s
      Jun 10, 2016 9:14:33 AM null
      INFO: Node (i-5727dc86)(i-5727dc86) is terminated or terminating, aborting launch

          [JENKINS-35524] new ec2 slave is terminated or terminating during launch

          Mika Karjalainen added a comment -

          This is hitting us really hard: at the moment we have over 400 EC2 instances that were started and terminated instantly. We are using Jenkins 2.19.4 and ec2-plugin 1.36.

          Agent logs show only this:
          INFO: Node ***** (i-0045d9d744706f494)(i-0045d9d744706f494) is terminated or terminating, aborting launch

          Neil Rhine added a comment -

          Any word on this? It hit us super hard today.


          Mika Karjalainen added a comment -

          I have now seen this happen multiple times. It seems to occur some time after we have set some static agents temporarily offline. I have no idea whether the events are actually related, but we have seen these two things happen at roughly the same time.

          Neil Rhine added a comment -

          For us, this ended up being on the AWS side: we were hitting a limit on EBS drives. I can imagine it could also be caused by limits on instance type, number of instances, etc. I have also seen something similar when the AMI has encrypted drives and the access/secret keys don't have permission to access them. I think this bug should really be about making the error more verbose.


          Mika Karjalainen added a comment -

          On a second look, I agree with Neil. This happens for us because we reach the EBS volume limit in the AWS region. Still, the plugin should recognize the situation and avoid flooding AWS with instances.

          Daniel Starling added a comment - - edited

          We ran into this issue today. I have not yet determined the cause (was not likely EC2 instance limits or EBS volume limits), but it was a bit alarming that in a few minutes it started 25 instances that were all terminated when I checked on them. If I understand EC2 billing correctly, each of those instances counts as 1 hour of EC2 time.

          After an hour or so, the issue has magically evaporated. We saw the same error ("is terminated or terminating, aborting launch") as in logs here.

          It seems like it would be good to have a low-level configurable throttle – rapidly cycling instances up/down at too great a rate smells like a condition you might want to halt everything for. Or at a minimum, set a limit on how many instances can be created per hour. This might be a good preventative measure to deal with instance-creation problems.
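The per-hour limit suggested above could be sketched as a sliding-window counter. This is only an illustration of the idea, not anything the ec2-plugin provides: the class name `LaunchThrottle`, its parameters, and the one-hour default window are all assumptions made for this sketch.

```python
import time
from collections import deque


class LaunchThrottle:
    """Sliding-window cap on instance launches.

    Hypothetical helper for illustration; it is not part of the ec2-plugin.
    """

    def __init__(self, max_launches, window_seconds=3600.0, clock=time.monotonic):
        self.max_launches = max_launches
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the window can be tested
        self._launches = deque()     # timestamps of recent launches, oldest first

    def try_acquire(self):
        """Record a launch and return True if the cap allows it, else return False."""
        now = self._clock()
        # Forget launches that have aged out of the window.
        while self._launches and now - self._launches[0] >= self.window_seconds:
            self._launches.popleft()
        if len(self._launches) >= self.max_launches:
            return False
        self._launches.append(now)
        return True
```

A provisioning loop would call `try_acquire()` before each launch request and skip or delay the launch when it returns `False`, which bounds how many instances a failing configuration can burn through in any one window.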

          EDIT: This is helpful: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_InstanceStraightToTerminated.html – turns out we did hit "State transition reason Client.VolumeLimitExceeded: Volume limit exceeded"
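For anyone wanting to check the state transition reason without the console, here is a minimal sketch that reads the `StateReason` field from an EC2 `DescribeInstances` response. The function names are made up for this example, the boto3 call assumes credentials and a region are already configured, and AWS only returns terminated instances for a limited time after termination.

```python
def terminated_reasons(response):
    """Map instance ID -> (StateReason code, message) for instances that are
    shutting down or terminated, given a DescribeInstances response dict."""
    reasons = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            state = instance.get("State", {}).get("Name")
            if state not in ("shutting-down", "terminated"):
                continue
            reason = instance.get("StateReason", {})
            reasons[instance["InstanceId"]] = (reason.get("Code"), reason.get("Message"))
    return reasons


def fetch_terminated_reasons(instance_ids, region_name):
    """Hypothetical wrapper: query EC2 through boto3 and extract the reasons."""
    import boto3  # third-party dependency, imported only when this is called

    ec2 = boto3.client("ec2", region_name=region_name)
    return terminated_reasons(ec2.describe_instances(InstanceIds=instance_ids))
```

For the instance in the original report this would be called as `fetch_terminated_reasons(["i-5727dc86"], region_name=...)`, with the region filled in for your account. A code such as `Client.VolumeLimitExceeded` points at an account limit rather than at Jenkins.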


          Maksim Skutin added a comment -

          danielstarling
          Also ran into this today. Root cause: a set of AMIs had been created without the EBS `DeleteOnTermination` flag set.
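A quick way to spot such AMIs is to scan the block device mappings in an EC2 `DescribeImages` response. The sketch below is an assumption-laden illustration: the function names are invented here, and a mapping with no `DeleteOnTermination` key is treated as "not deleted" so that it gets flagged for manual review.

```python
def volumes_kept_after_termination(response):
    """Map image ID -> device names whose EBS volume would outlive the instance,
    given a DescribeImages response dict."""
    kept = {}
    for image in response.get("Images", []):
        devices = [
            mapping.get("DeviceName")
            for mapping in image.get("BlockDeviceMappings", [])
            # Only EBS-backed mappings carry the flag; a missing flag is
            # flagged too (assumption made for this sketch).
            if "Ebs" in mapping and not mapping["Ebs"].get("DeleteOnTermination", False)
        ]
        if devices:
            kept[image["ImageId"]] = devices
    return kept


def check_images(image_ids, region_name):
    """Hypothetical wrapper: query EC2 through boto3 and scan the result."""
    import boto3  # third-party dependency, imported only when this is called

    ec2 = boto3.client("ec2", region_name=region_name)
    return volumes_kept_after_termination(ec2.describe_images(ImageIds=image_ids))
```

Volumes left behind this way presumably accumulate with every launched and terminated agent, which would explain how the volume limit mentioned earlier in the thread gets reached.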


          Allan BURDAJEWICZ added a comment - - edited

          The same issue happens if you use an AMI based on an encrypted snapshot but the AWS profile / credentials set up for the EC2 plugin do not have permission to use the KMS key. This issue is mainly caused by AWS-specific errors, and the real problem is that the plugin does not handle those errors in a way that gives useful feedback to the Jenkins administrator.


            francisu Francis Upton
            mangengkus Ken Kustian
            Votes: 3
            Watchers: 8
