Have encountered some reports of slow slave performance on a Unix master using many slaves where the thread dumps show all but one slave connection thread waiting for a single lock, which is held by a thread that looks like this:
From what I can tell, neither the Jenkins SSH Slaves plugin nor the Trilead SSH library is to blame, as they create a separate SecureRandom instance for each slave. Rather, it is NativePRNG (the default implementation on typical Linux installations, among others) that uses a global lock to synchronize access to /dev/random and /dev/urandom, and /dev/random can block while waiting for sufficient entropy to accumulate.
It might help for the SSH Slaves plugin to offer a java.security.SecureRandom based on sun.security.provider.SecureRandom, which does not acquire a global lock to process connection data. (It may take longer to set up a connection, since it needs to seed the random-number generator based on thread activity.)
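A minimal sketch of what requesting that implementation could look like, assuming an Oracle/OpenJDK-style JRE where the "SHA1PRNG" algorithm is backed by sun.security.provider.SecureRandom. The class and method names (Sha1PrngExample, newSha1Prng) are hypothetical and not taken from the plugin or from Trilead, and how the generator gets seeded still depends on the JRE's configuration:

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class Sha1PrngExample {

    // Hypothetical helper: asks for SHA1PRNG by name instead of relying on
    // the platform default (NativePRNG on typical Linux installations).
    static SecureRandom newSha1Prng() throws NoSuchAlgorithmException {
        return SecureRandom.getInstance("SHA1PRNG");
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        SecureRandom platformDefault = new SecureRandom();
        SecureRandom sha1 = newSha1Prng();

        // Report which algorithm each instance actually uses on this JVM.
        System.out.println("default:  " + platformDefault.getAlgorithm());
        System.out.println("explicit: " + sha1.getAlgorithm());

        // The first call self-seeds the generator; this is the step that
        // may make connection setup slower.
        byte[] buf = new byte[16];
        sha1.nextBytes(buf);
    }
}
```

Each connection would hold its own instance, so any synchronization inside nextBytes would be per instance rather than shared across all slave connection threads.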
Possible workarounds:
- Edit the JRE's $JAVA_HOME/lib/security/java.security to comment out the line securerandom.source=file:/dev/urandom (should switch back to the generic implementation)
- Run the JVM with -Djava.security.egd=file:/dev/./urandom (should force use of /dev/urandom, which is supposed to be nonblocking)
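To check whether either setting took effect, something like the following could be run with the same JRE and JVM options as the master. This is only a diagnostic sketch; the class name EntropySourceCheck is hypothetical, and the output will vary by JRE version and platform:

```java
import java.security.SecureRandom;
import java.security.Security;

public class EntropySourceCheck {

    // Summarizes the settings that influence which SecureRandom
    // implementation and entropy source the JVM picks.
    static String describe() {
        SecureRandom rng = new SecureRandom();
        return "securerandom.source=" + Security.getProperty("securerandom.source")
                + ", java.security.egd=" + System.getProperty("java.security.egd")
                + ", algorithm=" + rng.getAlgorithm()
                + ", provider=" + rng.getProvider().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported algorithm is still NativePRNG after a change, the default implementation (and its global lock) is presumably still in use.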