
Plugin is not reusing stopped AWS EC2 instances

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: ec2-plugin
    • Environment: Plugin version 1.42

      Using 1.39 of the plugin, the behaviour I observed was that, with the instance cap set to 2 and 2 slaves already provisioned (i.e. in the stopped EC2 state rather than terminated), when a slave was required for a pending queued build the plugin would simply start one of the 2 stopped instances.

       

      Now, on 1.42, the plugin appears to leave the previous 2 EC2 instances in the stopped state on AWS and provision a brand new slave, exceeding the instance cap of 2 and leaving 3 EC2 instances (albeit 2 stopped and only 1 running).

      For now, my workaround was to go back to 1.39.

          [JENKINS-55492] Plugin is not reusing stopped AWS EC2 instances

          FABRIZIO MANFREDI added a comment -

          Were the nodes in the stopped state created before the update?

          Between 1.39 and 1.41 the tag labeling changed; if the nodes were created before the upgrade, the label associated with them is not recognized by the new version.

          What does the log report in terms of the number of stopped instances?

          Sascha Kettler added a comment -

          I might be experiencing the same issue. Do you also have multiple subnet IDs configured?

          At least in my case this is the culprit: the DescribeInstancesRequest that looks for reusable instances filters by ONE subnetId equal to chooseSubnetId() - which returns one of the defined subnets in a round-robin fashion.

          As I have a subnet per AZ, the first instance is started (and, on idle, stopped) in AZ a. Then, when we need an instance again, chooseSubnetId() has moved on to a different subnet in AZ b - no instances can be found there, so a new one is created.

          What should happen instead is that the DescribeInstancesRequest filters for ANY of the defined subnetIds, as sketched below.
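          For illustration, here is a minimal sketch of that lookup using the AWS SDK for Java v1, which the plugin builds on. The subnet IDs and the tag filter value are placeholders rather than the plugin's actual configuration; the point is that the DescribeInstancesRequest filters on all configured subnet IDs and on the stopped/stopping states, instead of on the single subnet returned by the round-robin chooser:

              import java.util.Arrays;
              import java.util.List;

              import com.amazonaws.services.ec2.AmazonEC2;
              import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
              import com.amazonaws.services.ec2.model.DescribeInstancesRequest;
              import com.amazonaws.services.ec2.model.DescribeInstancesResult;
              import com.amazonaws.services.ec2.model.Filter;
              import com.amazonaws.services.ec2.model.Instance;
              import com.amazonaws.services.ec2.model.Reservation;

              public class FindReusableAgents {
                  public static void main(String[] args) {
                      // All subnet IDs configured for the template (placeholder values).
                      List<String> configuredSubnetIds =
                              Arrays.asList("subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333");

                      AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();

                      // Look for reusable stopped agents across ALL configured subnets,
                      // not just the one picked by the round-robin subnet chooser.
                      DescribeInstancesRequest request = new DescribeInstancesRequest().withFilters(
                              new Filter("subnet-id").withValues(configuredSubnetIds),
                              new Filter("instance-state-name").withValues("stopped", "stopping"),
                              // Placeholder tag identifying instances launched from this template.
                              new Filter("tag:jenkins_slave_type").withValues("demand_my-template"));

                      DescribeInstancesResult result = ec2.describeInstances(request);
                      for (Reservation reservation : result.getReservations()) {
                          for (Instance instance : reservation.getInstances()) {
                              System.out.println("Reusable stopped instance: " + instance.getInstanceId()
                                      + " in subnet " + instance.getSubnetId());
                          }
                      }
                  }
              }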

          Dirk Heinrichs added a comment -

          Same here: fresh install, current plugin version. The AMI template is configured to stop instead of terminate, instances are stopped on idle, but when a new job is started a new instance is launched instead of starting a stopped one.

          Anton Goorin added a comment -

          I've been observing this behaviour for more than a year. I installed 4 fresh servers, and all of them leave stopped nodes behind as garbage.
          For now I just delete them manually once in a while.

          Evan added a comment -

          2.0.4 of this plugin still has this issue. Multi subnet/AZ config: I turned on "stop instance" instead of allowing instances to terminate on Friday, came back on Tuesday, and had 185 stopped instances plus 40 running.

          Sebastian Opel added a comment -

          evanrich408 Does it work for you when you have only 1 subnet configured in the plugin?
          I have only 1 subnet configured and it doesn't start the stopped instances. It tries to connect via SSH, but the instance remains stopped in AWS EC2.

          Or are we talking about just the presence of multiple subnets in the VPC?

          Sebastian Opel added a comment -

          evanrich408 Sorry, my problem was that I had set the wrong availability zone: I configured "eu-central-1" instead of "eu-central-1a".

          So with a single subnet configured it works.
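          As a side note for anyone hitting the same misconfiguration: "eu-central-1" names the region, while the plugin's availability zone field expects a zone name such as "eu-central-1a". A quick, illustrative way to list the valid zone names is to query them with the AWS SDK for Java v1 - the region choice here is an example and credentials setup is assumed:

              import com.amazonaws.regions.Regions;
              import com.amazonaws.services.ec2.AmazonEC2;
              import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
              import com.amazonaws.services.ec2.model.AvailabilityZone;

              public class ListZones {
                  public static void main(String[] args) {
                      // "eu-central-1" is the region; its zones are "eu-central-1a", "eu-central-1b", ...
                      AmazonEC2 ec2 = AmazonEC2ClientBuilder.standard()
                              .withRegion(Regions.EU_CENTRAL_1)
                              .build();
                      for (AvailabilityZone zone : ec2.describeAvailabilityZones().getAvailabilityZones()) {
                          System.out.println(zone.getZoneName() + " (" + zone.getState() + ")");
                      }
                  }
              }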

            Assignee: thoulen (FABRIZIO MANFREDI)
            Reporter: davidgoate (David Goate)
            Votes: 3
            Watchers: 10
