Most of our nodes hit their idle timeout overnight. We run a large multi-configuration job early in the morning. We know we need a certain number of nodes online, and we used to be able to provision instances proactively via the /computer UI. In recent versions of Jenkins and the EC2 plugin I observe the following:
1. Attempt to start a new slave from a given AMI.
2. The plugin finds an idle node of that particular type and returns its ID rather than creating a new one.
In the above, i-b0614034 was an existing instance.
This used to work just fine. Now it refuses to create a new node. The change in behavior seems to have been introduced in relation to commit 6b286b185ba41efc33ab8558a6c43969975e6238 for issues
JENKINS-23787 EC2 not spooling up stopped nodes
I blame this code and a subsequent commit that added the additional logging of " true - Node has capacity - can use it", where it returns the idle instance ID rather than provisioning a new one as I explicitly asked for.
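To make the difference concrete, here is a toy model of the two behaviors. It is only a sketch of what I am describing, not the plugin's actual code: `ToyCloud`, `provision`, `launch`, the `force_new` flag, the AMI names, and the `i-new...` IDs are all made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCloud:
    """Hypothetical stand-in for an EC2 cloud; not the plugin's real classes."""
    # Maps an AMI name to the IDs of idle instances already running from it.
    idle: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def launch(self, ami):
        # Pretend to start a brand-new instance and return a made-up ID.
        return f"i-new{next(self._ids):04d}"

    def provision(self, ami, force_new=False):
        idle_for_ami = self.idle.get(ami, [])
        if idle_for_ami and not force_new:
            # Observed behavior: an idle node "has capacity", so its ID
            # is returned and nothing new is created.
            return idle_for_ami[0]
        # Expected behavior for an explicit manual request: always launch.
        return self.launch(ami)


cloud = ToyCloud(idle={"ami-example": ["i-b0614034"]})
print(cloud.provision("ami-example"))                  # reuses the idle node
print(cloud.provision("ami-example", force_new=True))  # launches a new one
```

What I am asking for is the `force_new=True` path whenever provisioning is requested explicitly from the UI, with the reuse path kept for automatic provisioning.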
Exacerbating this is extremely slow instance provisioning in StandardStrategyImpl, which I intend to log as a separate story and cross reference.