Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-32510

SSHD relying on NativePRNG

    XMLWordPrintable

    Details

    • Similar Issues:

      Description

      Similar to the well-known problem in ssh-slaves but affecting the Jenkins SSH server. I noticed that workflow-plugin functional tests were running very slowly with low CPU usage, and saw in thread dumps

      "Executing testDeleteSubFolder(org.jenkinsci.plugins.workflow.steps.DeleteDirStepTest)" ...
      java.lang.Thread.State: WAITING (on object monitor)
      	at java.lang.Object.wait(Native Method)
      	at java.lang.Object.wait(Object.java:502)
      	at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:267)
      	- locked <...> (a jenkins.model.Jenkins$7)
      	at jenkins.InitReactorRunner.run(InitReactorRunner.java:44)
      	at jenkins.model.Jenkins.executeReactor(Jenkins.java:915)
      	at jenkins.model.Jenkins.<init>(Jenkins.java:814)
      	at hudson.model.Hudson.<init>(Hudson.java:83)
      	at org.jvnet.hudson.test.JenkinsRule.newHudson(JenkinsRule.java:559)
      	at org.jvnet.hudson.test.JenkinsRule.before(JenkinsRule.java:346)
      	at ...
      "SSHD.init" ...
         java.lang.Thread.State: RUNNABLE
      	at java.io.FileInputStream.readBytes(Native Method)
      	at java.io.FileInputStream.read(FileInputStream.java:255)
      	at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:410)
      	at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:427)
      	- locked <...> (a java.lang.Object)
      	at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:329)
      	at sun.security.provider.NativePRNG.engineGenerateSeed(NativePRNG.java:224)
      	at java.security.SecureRandom.generateSeed(SecureRandom.java:533)
      	at org.apache.sshd.common.random.BouncyCastleRandom.<init>(BouncyCastleRandom.java:57)
      	at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:48)
      	at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:41)
      	at org.apache.sshd.common.random.SingletonRandomFactory.<init>(SingletonRandomFactory.java:37)
      	at org.apache.sshd.SshServer.setUpDefaultServer(SshServer.java:452)
      	at org.jenkinsci.main.modules.sshd.SSHD.start(SSHD.java:83)
      	- locked <...> (a org.jenkinsci.main.modules.sshd.SSHD)
      	at org.jenkinsci.main.modules.sshd.SSHD.init(SSHD.java:146)
      	at sun.reflect.GeneratedMethodAccessor149.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:106)
      	at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:176)
      	at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:282)
      	at jenkins.model.Jenkins$7.runTask(Jenkins.java:904)
      	at ...
      

      The bundled server should be switched to not rely on NativePRNG, which can be quite slow.

      sudo apt-get install haveged seems to have improved test speed in my case, but I should not have to do this.

        Attachments

          Issue Links

            Activity

            Hide
            reinholdfuereder Reinhold Füreder added a comment - - edited

            I guess this actually led to a range of failing (old/existing) tests of PRs (e.g. for https://github.com/jenkinsci/email-ext-plugin/pull/170 for email-ext plugin currently maintained by David van Laatum) due to running into timeout:

            • https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Femail-ext-plugin/detail/PR-170/5/tests
              • ScriptContentTest.testWhenTemplateNotFoundThrowFileNotFoundException
                Error Message
                
                test timed out after 180 seconds
                
                Stacktraceorg.junit.runners.model.TestTimedOutException: test timed out after 180 seconds
                	at java.lang.Object.wait(Native Method)
                	at java.lang.Object.wait(Object.java:502)
                	at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:267)
                	at jenkins.InitReactorRunner.run(InitReactorRunner.java:45)
                	at jenkins.model.Jenkins.executeReactor(Jenkins.java:1009)
                	at jenkins.model.Jenkins.<init>(Jenkins.java:877)
                	at hudson.model.Hudson.<init>(Hudson.java:85)
                	at org.jvnet.hudson.test.JenkinsRule.newHudson(JenkinsRule.java:617)
                	at org.jvnet.hudson.test.JenkinsRule.before(JenkinsRule.java:390)
                	at hudson.plugins.emailext.plugins.content.ScriptContentTest$1.before(ScriptContentTest.java:62)
                	at org.jvnet.hudson.test.JenkinsRule$1.evaluate(JenkinsRule.java:543)
                	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
                	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
                	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                	at java.lang.Thread.run(Thread.java:748)
                Standard Output
                === Starting testWhenTemplateNotFoundThrowFileNotFoundException(hudson.plugins.emailext.plugins.content.ScriptContentTest)
                
                Standard Error
                   0.009 [id=1157]	INFO	o.jvnet.hudson.test.JenkinsRule#createWebServer: Running on http://localhost:40893/jenkins/
                   0.047 [id=1163]	INFO	jenkins.InitReactorRunner$1#onAttained: Started initialization
                   0.049 [id=1175]	INFO	jenkins.InitReactorRunner$1#onAttained: Listed all plugins
                   0.087 [id=1163]	WARNING	o.j.h.test.TestExtensionLoader#isActive: ignoring class org.jenkinsci.plugins.workflow.support.actions.LogActionImplTest$ChattyStep$DescriptorImpl compiled against jenkins-test-harness older than 2.16
                   0.088 [id=1163]	WARNING	o.j.h.test.TestExtensionLoader#isActive: ignoring class org.jenkinsci.plugins.workflow.steps.AbstractStepImplRestartTest$BlockStep$DescriptorImpl compiled against jenkins-test-harness older than 2.16
                   0.120 [id=1163]	INFO	jenkins.InitReactorRunner$1#onAttained: Prepared all plugins
                   0.120 [id=1176]	INFO	jenkins.InitReactorRunner$1#onAttained: Started all plugins
                   0.121 [id=1168]	INFO	jenkins.InitReactorRunner$1#onAttained: Augmented all extensions
                   0.292 [id=1169]	INFO	jenkins.InitReactorRunner$1#onAttained: Loaded all jobs
                 180.002 [id=1]	WARNING	o.j.hudson.test.JenkinsRule$2#evaluate: Test timed out (after 180 seconds).
                
                ...
                
                "SSHD.init" Id=1172 Group=FailOnTimeoutGroup RUNNABLE (in native)
                	at java.io.FileInputStream.readBytes(Native Method)
                	at java.io.FileInputStream.read(FileInputStream.java:255)
                	at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:424)
                	at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:441)
                	-  locked java.lang.Object@6deee370
                	at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:331)
                	at sun.security.provider.NativePRNG.engineGenerateSeed(NativePRNG.java:226)
                	at java.security.SecureRandom.generateSeed(SecureRandom.java:533)
                	at org.apache.sshd.common.random.BouncyCastleRandom.<init>(BouncyCastleRandom.java:57)
                	at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:48)
                	at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:41)
                	at org.apache.sshd.common.random.SingletonRandomFactory.<init>(SingletonRandomFactory.java:37)
                	at org.apache.sshd.SshServer.setUpDefaultServer(SshServer.java:452)
                	at org.jenkinsci.main.modules.sshd.SSHD.start(SSHD.java:83)
                	-  locked org.jenkinsci.main.modules.sshd.SSHD@49c17ba4
                	at org.jenkinsci.main.modules.sshd.SSHD.init(SSHD.java:146)
                	at sun.reflect.GeneratedMethodAccessor284.invoke(Unknown Source)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
                	at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
                	at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:282)
                	at jenkins.model.Jenkins$7.runTask(Jenkins.java:998)
                	at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:210)
                	at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
                	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                	at java.lang.Thread.run(Thread.java:748)	Number of locked synchronizers = 1
                	- java.util.concurrent.ThreadPoolExecutor$Worker@5414b9f
            Show
            reinholdfuereder Reinhold Füreder added a comment - - edited I guess this actually led to a range of failing (old/existing) tests of PRs (e.g. for https://github.com/jenkinsci/email-ext-plugin/pull/170 for email-ext plugin currently maintained by David van Laatum ) due to running into timeout: https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Femail-ext-plugin/detail/PR-170/5/tests ScriptContentTest.testWhenTemplateNotFoundThrowFileNotFoundException Error Message test timed out after 180 seconds Stacktraceorg.junit.runners.model.TestTimedOutException: test timed out after 180 seconds at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:267) at jenkins.InitReactorRunner.run(InitReactorRunner.java:45) at jenkins.model.Jenkins.executeReactor(Jenkins.java:1009) at jenkins.model.Jenkins.<init>(Jenkins.java:877) at hudson.model.Hudson.<init>(Hudson.java:85) at org.jvnet.hudson.test.JenkinsRule.newHudson(JenkinsRule.java:617) at org.jvnet.hudson.test.JenkinsRule.before(JenkinsRule.java:390) at hudson.plugins.emailext.plugins.content.ScriptContentTest$1.before(ScriptContentTest.java:62) at org.jvnet.hudson.test.JenkinsRule$1.evaluate(JenkinsRule.java:543) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) Standard Output === Starting testWhenTemplateNotFoundThrowFileNotFoundException(hudson.plugins.emailext.plugins.content.ScriptContentTest) Standard Error 0.009 [id=1157] INFO o.jvnet.hudson.test.JenkinsRule#createWebServer: Running on http://localhost:40893/jenkins/ 0.047 [id=1163] INFO jenkins.InitReactorRunner$1#onAttained: Started initialization 0.049 [id=1175] INFO jenkins.InitReactorRunner$1#onAttained: Listed all plugins 0.087 [id=1163] WARNING o.j.h.test.TestExtensionLoader#isActive: ignoring class org.jenkinsci.plugins.workflow.support.actions.LogActionImplTest$ChattyStep$DescriptorImpl compiled against jenkins-test-harness older than 2.16 0.088 [id=1163] WARNING o.j.h.test.TestExtensionLoader#isActive: ignoring class org.jenkinsci.plugins.workflow.steps.AbstractStepImplRestartTest$BlockStep$DescriptorImpl compiled against jenkins-test-harness older than 2.16 0.120 [id=1163] INFO jenkins.InitReactorRunner$1#onAttained: Prepared all plugins 0.120 [id=1176] INFO jenkins.InitReactorRunner$1#onAttained: Started all plugins 0.121 [id=1168] INFO jenkins.InitReactorRunner$1#onAttained: Augmented all extensions 0.292 [id=1169] INFO jenkins.InitReactorRunner$1#onAttained: Loaded all jobs 180.002 [id=1] WARNING o.j.hudson.test.JenkinsRule$2#evaluate: Test timed out (after 180 seconds). ... "SSHD.init" Id=1172 Group=FailOnTimeoutGroup RUNNABLE (in native) at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:255) at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:424) at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:441) - locked java.lang.Object@6deee370 at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:331) at sun.security.provider.NativePRNG.engineGenerateSeed(NativePRNG.java:226) at java.security.SecureRandom.generateSeed(SecureRandom.java:533) at org.apache.sshd.common.random.BouncyCastleRandom.<init>(BouncyCastleRandom.java:57) at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:48) at org.apache.sshd.common.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:41) at org.apache.sshd.common.random.SingletonRandomFactory.<init>(SingletonRandomFactory.java:37) at org.apache.sshd.SshServer.setUpDefaultServer(SshServer.java:452) at org.jenkinsci.main.modules.sshd.SSHD.start(SSHD.java:83) - locked org.jenkinsci.main.modules.sshd.SSHD@49c17ba4 at org.jenkinsci.main.modules.sshd.SSHD.init(SSHD.java:146) at sun.reflect.GeneratedMethodAccessor284.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:282) at jenkins.model.Jenkins$7.runTask(Jenkins.java:998) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:210) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Number of locked synchronizers = 1 - java.util.concurrent.ThreadPoolExecutor$Worker@5414b9f https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Femail-ext-plugin/detail/PR-170/1/tests/ AbstractScriptTriggerTest.beforeBuildTriggerMigration ScriptContentSecureTest.testWhenTemplateNotFoundThrowFileNotFoundException https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Femail-ext-plugin/detail/PR-170/6/tests ScriptContentSecureTest.testShouldFindTemplateOnClassPath
            Hide
            svanoort Sam Van Oort added a comment -

            Devin Nusbaum Could I get you to take a look? Normally one wouldn't stress too much about test flakes, but this ties to a cluster of actual issues we've seen reported in the past and some of our colleagues have seen.

            Show
            svanoort Sam Van Oort added a comment - Devin Nusbaum Could I get you to take a look? Normally one wouldn't stress too much about test flakes, but this ties to a cluster of actual issues we've seen reported in the past and some of our colleagues have seen.
            Hide
            dnusbaum Devin Nusbaum added a comment - - edited

            SSHD's BouncyCastleRandom class only uses the SecureRandom instance to generate an 8-byte seed using SecureRandom#generateSeed for a pure-Java PRNG from BouncyCastle, and the constructed BouncyCastleRandom instance is used as a singleton in SSHD, so that call should only block once. The most likely explanation is that it is using /dev/random, which is blocking occasionally when running tests.

            We could change the SecureRandom instance to use the NativePRNGNonBlocking provider, which will use /dev/urandom for engineGenerateSeed.

            Notably, it looks like SHA1PRNG's generateSeed and nextBytes methods use either /dev/random or /dev/urandom as their source based on the values of java.security.egd and securerandom.source, which we don't want to modify, so SHA1PRNG seems like a bad choice here. We could still use SHA1PRNG's engineNextBytes method, since the seed that it uses to initialize the state is computed statically, and so may already be ready by the time we want to use if if someone else initialized it, but we can't guarantee that it won't block

            Since NativePRNGNonBlocking always uses /dev/urandom, and unlike in JENKINS-20108 we don't care about SHA1PRNG's increased throughput, I think switching to NativePRNGNonBlocking is the best option.

            Show
            dnusbaum Devin Nusbaum added a comment - - edited SSHD's BouncyCastleRandom class only uses the SecureRandom instance to generate an 8-byte seed using SecureRandom#generateSeed for a pure-Java PRNG from BouncyCastle, and the constructed BouncyCastleRandom instance is used as a singleton in SSHD, so that call should only block once. The most likely explanation is that it is using /dev/random , which is blocking occasionally when running tests. We could change the SecureRandom instance to use the NativePRNGNonBlocking provider, which will use /dev/urandom for engineGenerateSeed. Notably, it looks like SHA1PRNG's generateSeed and nextBytes methods use either /dev/random or /dev/urandom as their source based on the values of java.security.egd and securerandom.source , which we don't want to modify, so SHA1PRNG seems like a bad choice here. We could still use SHA1PRNG's engineNextBytes method, since the seed that it uses to initialize the state is computed statically , and so may already be ready by the time we want to use if if someone else initialized it, but we can't guarantee that it won't block Since NativePRNGNonBlocking always uses /dev/urandom , and unlike in JENKINS-20108 we don't care about SHA1PRNG's increased throughput, I think switching to NativePRNGNonBlocking is the best option.

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              jglick Jesse Glick
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated: