Status: Closed
Dear colleagues, we found an issue while using the ec2-plugin. The problem appears when an AWS spot instance (Jenkins slave) is being terminated because of the "Idle termination timeout".
The EC2 plugin first tries to cancel and terminate the AWS spot worker (slave), and only after that removes the node from Jenkins.
The AWS cancel and termination process takes a while, and during this time Jenkins can still try to build a new job on this, let's say, "available" node. The job then fails because the node is already in the terminating state within AWS.
A better way to handle node termination would be to put the node offline first, and only after that cancel and remove it from AWS.
1. "Put the node offline or disconnect it" (I don't know the exact method)
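To make the suggested ordering concrete, here is a minimal toy sketch in Python. It is not the ec2-plugin code and does not call any Jenkins or AWS API; `ToyNode`, `cancel_and_terminate`, `make_probe` and the two `terminate_*` functions are hypothetical names invented for illustration. The probe stands in for the Jenkins scheduler looking for a free node while the slow AWS cancel and terminate calls are still running.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a Jenkins node backed by a spot instance."""
    name: str
    accepting_tasks: bool = True
    registered: bool = True
    events: list = field(default_factory=list)

    def can_take(self) -> bool:
        # A job can only land on a node that is still registered and online.
        return self.registered and self.accepting_tasks


def make_probe(assigned):
    """Return a callback that models the scheduler trying to assign a job."""
    def probe(node):
        if node.can_take():
            assigned.append(node.name)
    return probe


def cancel_and_terminate(node, scheduler_probe):
    # Stand-in for the slow AWS calls; the scheduler may run in between.
    node.events.append("cancel_spot_request")
    scheduler_probe(node)
    node.events.append("terminate_instance")


def terminate_current_order(node, scheduler_probe):
    # Order described in this report: AWS first, Jenkins removal last.
    cancel_and_terminate(node, scheduler_probe)
    node.registered = False
    node.events.append("remove_node")


def terminate_offline_first(node, scheduler_probe):
    # Suggested order: stop accepting jobs before touching AWS.
    node.accepting_tasks = False
    node.events.append("mark_offline")
    cancel_and_terminate(node, scheduler_probe)
    node.registered = False
    node.events.append("remove_node")


if __name__ == "__main__":
    for terminate in (terminate_current_order, terminate_offline_first):
        assigned = []
        terminate(ToyNode("spot-1"), make_probe(assigned))
        print(terminate.__name__, "jobs assigned during termination:", assigned)
```

In this toy model the first ordering lets the probe assign a job to the dying node, while the second one does not, because the node stops accepting tasks before the slow step begins.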
Please see the attached job.log file, which shows the end of the failed job.
The spot instance was terminated by the ec2 plugin (Event name: CancelSpotInstanceRequests). The node was still available for a moment between the spot instance termination and the node's removal from Jenkins, so a new job was assigned to it.
Hi, are you sure that the spot instance has not been retired by AWS?
I have to check the spot instance code, because it is a bit different from the on-demand code, but it should not be possible to assign any new jobs to a node that has reached the idle timeout.