JENKINS-20108

SSH slaves can block for a long time in NativePRNG


      There have been several reports of slow slave performance on a Unix master with many slaves. In these cases the thread dumps show all but one of the slave connection threads waiting for a single lock, which is held by a thread that looks like this:

      "Pipe writer thread: ..." - Thread ...
         java.lang.Thread.State: RUNNABLE
          at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:255)
          - locked <598aec0c> (a java.lang.Object)
          at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
          at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
          at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
          - locked <329129da> (a java.security.SecureRandom)
          at java.security.SecureRandom.next(SecureRandom.java:455)
          at java.util.Random.nextInt(Random.java:189)
          at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:154)
      

      From what I can tell, neither the Jenkins SSH Slaves plugin nor the Trilead SSH library is to blame, as they create a separate SecureRandom instance for each slave. Rather, the culprit is NativePRNG (the default implementation on typical Linux installations, among others), which uses a global lock to synchronize access to /dev/random and /dev/urandom; and /dev/random can block while waiting for sufficient entropy to accumulate.
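      To check which implementation a given JVM actually hands out, a minimal sketch like the following can be run with the same JRE as the master. The class name DefaultPrngCheck and the describe helper are made up for illustration; only the java.security.SecureRandom calls are real API. Note that the algorithm name it prints may be identical for every instance even though the instances are distinct objects, which is consistent with contention happening inside the provider rather than in the calling code.

```java
import java.security.SecureRandom;

// Hypothetical helper class, not part of Jenkins or Trilead.
public class DefaultPrngCheck {
    // Describes the algorithm and provider backing a SecureRandom instance.
    static String describe(SecureRandom random) {
        return random.getAlgorithm() + " from " + random.getProvider().getName();
    }

    public static void main(String[] args) {
        // Two separate instances, mirroring one SecureRandom per slave connection.
        SecureRandom first = new SecureRandom();
        SecureRandom second = new SecureRandom();
        System.out.println(describe(first));
        System.out.println(describe(second));
    }
}
```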

      It might help for the SSH Slaves plugin to supply a java.security.SecureRandom backed by sun.security.provider.SecureRandom, so that processing connection data does not require a global lock. (Setting up a connection may take longer, since this implementation needs to seed the random-number generator based on thread activity.)
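      As a rough sketch of that idea, and not the plugin's actual code: on Oracle and OpenJDK runtimes the "SHA1PRNG" algorithm is implemented by sun.security.provider.SecureRandom, so it can be requested by name without referring to the internal class. The class name PerConnectionRandom and the create method are hypothetical, and how such an instance would be passed to Trilead is not shown here. How the generator is seeded may also depend on the JRE configuration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical factory, for illustration only.
public class PerConnectionRandom {
    // Returns a SHA1PRNG-based SecureRandom if the JRE provides one,
    // otherwise falls back to the platform default.
    static SecureRandom create() {
        try {
            return SecureRandom.getInstance("SHA1PRNG");
        } catch (NoSuchAlgorithmException e) {
            // Assumption: falling back is preferable to failing the connection.
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        SecureRandom random = create();
        // The first request for random data triggers self-seeding, which may be slow.
        int padding = random.nextInt();
        System.out.println(random.getAlgorithm() + " produced a value: " + (padding != 0 || true));
    }
}
```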

      Unconfirmed workarounds:

      • Edit the JRE's $JAVA_HOME/lib/security/java.security to comment out the line securerandom.source=file:/dev/urandom (this should switch back to the generic implementation)
      • Run the JVM with -Djava.security.egd=file:/dev/./urandom (this should force the use of /dev/urandom, which is supposed to be nonblocking)
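      To see whether either workaround has taken effect in a given JVM, the two settings involved can be read back at runtime. This is only a diagnostic sketch; the class name EntropySourceCheck and the report method are invented for the example, and it does not confirm that the workarounds resolve the contention.

```java
import java.security.Security;

// Hypothetical diagnostic class, for illustration only.
public class EntropySourceCheck {
    // Reports the system property set via -Djava.security.egd and the
    // securerandom.source entry from the java.security file (null if unset).
    static String report() {
        String egd = System.getProperty("java.security.egd");
        String source = Security.getProperty("securerandom.source");
        return "java.security.egd=" + egd + ", securerandom.source=" + source;
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```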

            kohsuke Kohsuke Kawaguchi
            jglick Jesse Glick