Type: Bug
Resolution: Duplicate
Priority: Major
Labels: None
Environment:
Jenkins v2.1112
Docker Plugin 0.15
SSH Slaves Plugin 1.26
Curious if someone can help me unpack this. We recently upgraded Jenkins. We use the Docker plugin to dynamically provision slaves, and we're now running into a situation where slaves never finish provisioning (the SSH connection is never established). When taking a thread dump, there is a very large number of BLOCKED threads stuck, for some reason, in the SSHLauncher teardown and in fingerprinting. Here are the dumps:
"Computer.threadPoolForRemoting [#1020]" daemon prio=5 BLOCKED hudson.plugins.sshslaves.SSHLauncher.tearDownConnection(SSHLauncher.java:1407) hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1403) com.nirima.jenkins.plugins.docker.launcher.DockerComputerLauncher.afterDisconnect(DockerComputerLauncher.java:71) hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:665) jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) java.util.concurrent.FutureTask.run(FutureTask.java:266) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:748) "Computer.threadPoolForRemoting [#1019]" daemon prio=5 BLOCKED hudson.model.Fingerprint.save(Fingerprint.java:1238) hudson.BulkChange.commit(BulkChange.java:98) com.cloudbees.plugins.credentials.CredentialsProvider.trackAll(CredentialsProvider.java:1533) com.cloudbees.plugins.credentials.CredentialsProvider.track(CredentialsProvider.java:1478) hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:866) com.nirima.jenkins.plugins.docker.launcher.DockerComputerLauncher.launch(DockerComputerLauncher.java:66) hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:288) jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46) java.util.concurrent.FutureTask.run(FutureTask.java:266) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:748)
The thread dump has many of these (when things get bad, it climbs into the hundreds). We're currently planning a Docker plugin upgrade and a move away from the SSH launcher, but I'm looking for ideas as to why this may be happening.
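For anyone else digging through one of these, here's a quick throwaway tally I've been using over the dump file. This is my own sketch, not anything Jenkins ships: it assumes the dump format shown above (a quoted thread header ending in BLOCKED, then one frame per line) and just counts BLOCKED threads by the frame they're stuck in:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical helper, not part of Jenkins: tallies BLOCKED threads in a
    // dump like the one above, grouped by the frame they are blocked in.
    public class BlockedThreadCounter {
        public static void main(String[] args) throws IOException {
            List<String> lines = Files.readAllLines(Paths.get(args[0]));
            Map<String, Integer> blockedByFrame = new HashMap<>();
            boolean awaitingTopFrame = false;
            for (String line : lines) {
                String trimmed = line.trim();
                if (trimmed.startsWith("\"")) {
                    // Thread header, e.g. "Computer.threadPoolForRemoting [#1020]" daemon prio=5 BLOCKED
                    awaitingTopFrame = trimmed.endsWith("BLOCKED");
                } else if (awaitingTopFrame && !trimmed.isEmpty()) {
                    // The first frame after a BLOCKED header is where the thread is stuck.
                    blockedByFrame.merge(trimmed, 1, Integer::sum);
                    awaitingTopFrame = false;
                }
            }
            blockedByFrame.entrySet().stream()
                    .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                    .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
        }
    }

Run it as: java BlockedThreadCounter dump.txt. In our dumps, the two frames shown above dominate the output.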
Duplicates: JENKINS-49235 Fingerprinting added in ssh-slaves causes memory-leak and performance issue with dynamic slaves (Resolved)
I believe this is related to this issue: https://issues.jenkins-ci.org/browse/JENKINS-49235
The SSH Slaves plugin started storing fingerprints, and I think that's causing heavy lock contention (or possibly a race) when the containers shut down and start up; I'm not entirely sure. It might just be that the Docker plugin version I'm using (0.15) doesn't handle this shutdown/spin-up cycle well and the newer ones do.
(The newer ones have very different expectations of SSH slaves, which would require us to completely refactor our container environments, so we'd actually move off SSH connectors.)
I'm curious whether this problem goes away if I just revert the SSH Slaves plugin.
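To illustrate the kind of pileup I mean, here's a minimal, self-contained toy (all names and numbers are invented; this is not Jenkins internals): many launch/teardown tasks funneling through a single synchronized save step, which reproduces the same wall of BLOCKED threads as the dump above:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Toy model of the suspected contention: every slave launch/teardown
    // goes through one lock that is held across slow disk I/O, the way a
    // fingerprint save would be.
    public class FingerprintContentionDemo {
        private static final Object SAVE_LOCK = new Object();

        // Stands in for a Fingerprint.save()-style method: lock held across I/O.
        static void save() throws InterruptedException {
            synchronized (SAVE_LOCK) {
                Thread.sleep(200); // simulated XML write to disk
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newCachedThreadPool();
            // Each task stands in for one dynamic slave launching or tearing down.
            for (int i = 0; i < 300; i++) {
                pool.submit(() -> {
                    try {
                        save();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            Thread.sleep(1000);
            // Nearly all 300 workers end up parked BLOCKED on the shared lock,
            // which is exactly what our thread dumps look like.
            long blocked = Thread.getAllStackTraces().keySet().stream()
                    .filter(t -> t.getState() == Thread.State.BLOCKED)
                    .count();
            System.out.println("BLOCKED threads: " + blocked);
            pool.shutdownNow();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

If that model is right, the rate at which containers churn directly controls how fast Computer.threadPoolForRemoting backs up, which would explain why we only see this with dynamic slaves.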