SSHLauncher doesn't continue retrying to connect to remote executor

      SSHLauncher{host='10.50.10.252', port=22, credentialsId='aaf2ee5e-32bd-4675-9793-0570922f9c66', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=5, maxNumRetries=120, retryWaitTime=2, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.ManuallyTrustedKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
      [11/16/18 20:19:40] [SSH] Opening SSH connection to 10.50.10.252:22.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 120 more retries left.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 119 more retries left.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 118 more retries left.
      ERROR: null
      java.util.concurrent.CancellationException
          at java.util.concurrent.FutureTask.report(FutureTask.java:121)
          at java.util.concurrent.FutureTask.get(FutureTask.java:192)
          at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:904)
          at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      [11/16/18 20:19:45] Launch failed - cleaning up connection
      [11/16/18 20:19:45] [SSH] Connection closed.

      This happens whenever a new EC2 fleet instance is brought online. During this time cloud-init is still working its magic to install docker/openjdk and add the new Jenkins user (and its key). However, after the "Launch failed" error message there are no more retries and that slave is never contacted again, even though the slave comes online without issues if we manually press the button to reconnect.

      Clearly there are more retries left, yet it is completely dead in the water.
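One possible reading of the log, offered as an assumption rather than a confirmed diagnosis: the launch starts at 20:19:40 and fails at 20:19:45, which matches launchTimeoutSeconds=5, and with retryWaitTime=2 only about three attempts fit into that window. The stack trace shows a CancellationException coming out of FutureTask.get, which is what a caller sees when a timed task is cancelled before its retry loop finishes. The sketch below reproduces that symptom using only the JDK. The class LaunchTimeoutSketch and the method simulate are hypothetical names for illustration and are not the plugin's code; the values are scaled down 100x from the report.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LaunchTimeoutSketch {

    /**
     * Hypothetical stand-in for a launcher (not the ssh-slaves implementation).
     * Runs a retry loop in which every attempt fails, under an overall timeout,
     * and reports whether the loop completed or was cancelled.
     */
    static String simulate(long launchTimeoutMs, int maxNumRetries, long retryWaitMs)
            throws Exception {
        AtomicInteger attempts = new AtomicInteger();
        Callable<Boolean> connectWithRetries = () -> {
            // One initial attempt plus maxNumRetries retries.
            for (int i = 0; i <= maxNumRetries; i++) {
                attempts.incrementAndGet();
                // Each attempt "fails", as with "Connection refused" while
                // sshd is not up yet; wait before retrying. The sleep is
                // interrupted when the task is cancelled.
                Thread.sleep(retryWaitMs);
            }
            return false;
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // invokeAll with a timeout cancels tasks that have not completed
            // when the timeout expires.
            List<Future<Boolean>> results = executor.invokeAll(
                    Collections.singletonList(connectWithRetries),
                    launchTimeoutMs, TimeUnit.MILLISECONDS);
            try {
                results.get(0).get();
                return "completed after " + attempts.get() + " attempts";
            } catch (CancellationException e) {
                // Same exception type as in the reported stack trace.
                return "cancelled after " + attempts.get() + " attempts";
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Scaled down from the report: 5 s timeout, 120 retries, 2 s wait.
        System.out.println(simulate(50, 120, 20));
    }
}
```

If this assumption holds, the retry budget only matters when the overall launch timeout is at least roughly maxNumRetries × retryWaitTime (about 240 seconds for the settings above), so raising launchTimeoutSeconds would be one thing to try.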

      This used to work without issues on older versions of Jenkins; the problem only started recently.

      We are running Jenkins ver. 2.138.3 from the jenkinsci/blueocean docker image.

            Assignee:
            Artem Stasiuk
            Reporter:
            Bert JW Regeer