Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Description
We have a setup with multiple AMIs that are tag-based. Here is the scenario that is causing a problem.
Setup:
AMI_1 has tag ONE and only runs Jenkins jobs with that tag. This AMI has an instance cap of 1 and the executor count on the slave is 1 as well, meaning there can be at most one Jenkins job with tag ONE running at a time.
AMI_2 has no tags and is used as much as possible. It also has an instance cap of 6 and each slave has 4 executors.
The overall max cap is 10.
Jenkins currently has one running instance of AMI_1 (at cap), plus two running and two stopped instances of AMI_2 (below cap). All executors are busy. In the queue there is 1 job waiting for AMI_1 and 6 jobs waiting for AMI_2.
Problem:
What happens is that the plugin tries to provision a new slave for AMI_1 but cannot because that AMI is at its cap, and then DOES NOT attempt to provision (start) any AMI_2 slaves.
I looked through the code and saw the following in EC2Cloud.java:
private synchronized EC2AbstractSlave provisionSlaveIfPossible(SlaveTemplate template) {
    /*
     * Note this is synchronized between counting the instances and then allocating the node. Once the node is
     * allocated, we don't look at that instance as available for provisioning.
     */
    int possibleSlavesCount = getPossibleNewSlavesCount(template);
    if (possibleSlavesCount < 0) {
        LOGGER.log(Level.INFO, "Cannot provision - no capacity for instances: " + possibleSlavesCount);
        return null;
    }

    try {
        return template.provision(StreamTaskListener.fromStdout(), possibleSlavesCount > 0);
    } catch (IOException e) {
        LOGGER.log(Level.WARNING, "Exception during provisioning", e);
        return null;
    }
}
I think that
possibleSlavesCount < 0
should be
possibleSlavesCount <= 0
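In the snippet above, that one-character change would make the guard read as follows (just a sketch of the suggested fix, not a tested patch):

    if (possibleSlavesCount <= 0) {
        LOGGER.log(Level.INFO, "Cannot provision - no capacity for instances: " + possibleSlavesCount);
        return null;
    }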
This issue does resolve itself once the queue for AMI_1 is empty, but it can mean an unnecessarily long wait for jobs to complete when there should be available capacity.
Update: this code change did not solve the problem.
I tested a slightly different scenario than the one above, where both AMIs had slaves provisioned up to their cap amounts, but all of them were in the stopped state, i.e. no more slaves could be provisioned. When I started several builds spread across both AMIs, none of the slaves were restarted. I suspect this happened because provisionSlaveIfPossible returned early without trying to start any stopped slaves. With the old version of the check (possibleSlavesCount < 0), the method would still reach template.provision and therefore still try to start a stopped slave.
I believe there is an issue in the logic in that provisioning new slaves and starting stopped slaves are treated the same. Instead, they should be treated differently, and a stopped slave should be started whenever one exists and excess work is queued. Alternatively, countCurrentEC2Slaves could be changed to not count stopped slaves.
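As a rough sketch of the first option (based only on the snippet above, and assuming that passing false as the second argument of template.provision makes the template reuse or restart existing stopped instances rather than launch new ones), provisionSlaveIfPossible could stop returning early and instead distinguish the two cases:

    private synchronized EC2AbstractSlave provisionSlaveIfPossible(SlaveTemplate template) {
        int possibleSlavesCount = getPossibleNewSlavesCount(template);
        try {
            if (possibleSlavesCount > 0) {
                // Below the caps: a brand new instance may be launched.
                return template.provision(StreamTaskListener.fromStdout(), true);
            }
            // At or over the caps: do not launch anything new, but still give the
            // template a chance to restart one of its stopped instances.
            // (Assumption: provision(listener, false) only reuses existing instances.)
            return template.provision(StreamTaskListener.fromStdout(), false);
        } catch (IOException e) {
            LOGGER.log(Level.WARNING, "Exception during provisioning", e);
            return null;
        }
    }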