Type: Bug
Resolution: Cannot Reproduce
Priority: Critical
SSHLauncher{host='10.50.10.252', port=22, credentialsId='aaf2ee5e-32bd-4675-9793-0570922f9c66', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=5, maxNumRetries=120, retryWaitTime=2, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.ManuallyTrustedKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
[11/16/18 20:19:40] [SSH] Opening SSH connection to 10.50.10.252:22.
Connection refused (Connection refused)
SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 120 more retries left.
Connection refused (Connection refused)
SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 119 more retries left.
Connection refused (Connection refused)
SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 118 more retries left.
ERROR: null
java.util.concurrent.CancellationException
{{ at java.util.concurrent.FutureTask.report(FutureTask.java:121)}}
{{ at java.util.concurrent.FutureTask.get(FutureTask.java:192)}}
{{ at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:904)}}
{{ at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)}}
{{ at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)}}
{{ at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)}}
{{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
{{ at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{ at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{ at java.lang.Thread.run(Thread.java:748)}}
[11/16/18 20:19:45] Launch failed - cleaning up connection
[11/16/18 20:19:45] [SSH] Connection closed.
This happens whenever a new EC2 fleet instance is brought online. During this time cloud-init is still working its magic, installing docker/openjdk and adding the new Jenkins user (and its key). However, after the "Launch failed" message there are no more retries and that slave is never contacted again, even though the slave comes online without issues if we manually press the button to reconnect.
The log shows 118 retries still remaining when the launch is abandoned, yet the slave is left completely dead in the water.
This used to work without issues on older versions of Jenkins; the problem only started recently.
We are running Jenkins ver. 2.138.3 from the jenkinsci/blueocean docker image.
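
A possible explanation, which we have not confirmed against the plugin source: the launch is abandoned exactly 5 seconds after it starts (20:19:40 to 20:19:45), which matches launchTimeoutSeconds=5, and the CancellationException is thrown from FutureTask.get inside SSHLauncher.launch. That pattern is what the standard java.util.concurrent API produces when a retry loop is submitted through ExecutorService.invokeAll with a timeout that covers the whole loop rather than a single attempt. The sketch below reproduces only that JDK behavior. The class name LaunchTimeoutSketch and the method runLaunch are hypothetical stand-ins, not the plugin's code, and the timings are scaled down from seconds to milliseconds.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LaunchTimeoutSketch {

    /**
     * Hypothetical stand-in for a connect-and-retry launch (NOT the plugin's code).
     * Assumption: the launch timeout bounds the whole retry loop, not one attempt.
     */
    static String runLaunch(long launchTimeoutMs, int maxNumRetries, long retryWaitMs)
            throws Exception {
        AtomicInteger attempts = new AtomicInteger();

        // Every attempt "fails" (as with "Connection refused"), then waits before retrying.
        Callable<Boolean> connectWithRetries = () -> {
            for (int i = 0; i <= maxNumRetries; i++) {
                attempts.incrementAndGet();
                Thread.sleep(retryWaitMs);
            }
            return false;
        };

        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // invokeAll with a timeout cancels any task still running when time is up.
            List<Future<Boolean>> results = executor.invokeAll(
                    Collections.singletonList(connectWithRetries),
                    launchTimeoutMs, TimeUnit.MILLISECONDS);
            try {
                return "finished, connected=" + results.get(0).get();
            } catch (CancellationException e) {
                // Same exception type as in the log: get() on a cancelled future.
                return "cancelled after " + attempts.get()
                        + " of " + (maxNumRetries + 1) + " attempts";
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Scaled-down analogue of launchTimeoutSeconds=5, maxNumRetries=120, retryWaitTime=2.
        System.out.println(runLaunch(500, 120, 200));
    }
}
```

If this is indeed what happens, the retries can never be used up with the settings above: 120 retries at 2 seconds each need roughly 4 minutes, far longer than a 5 second launch timeout. Raising launchTimeoutSeconds so that it exceeds the time cloud-init needs may be worth trying as a workaround.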