JENKINS-50642

intermittent JenkinsRule test timeout failure

Details

    Description

      Sometimes, the JenkinsRule-based tests in the git-plugin fail intermittently when the default test timeout of 180 seconds is in effect.

      This happens fairly often: if you just run "mvn test" in the git plugin source tree, it is common to see a few tests marked as flaky get re-run and pass on the second run.

      If you increase the test timeout to, say, 10 minutes, the tests pass. A few other tests that aren't marked as flaky also sometimes fail, and it is difficult to debug exactly what is causing them to fail.
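
As a rough illustration of the timeout adjustment described above: the following sketch assumes the test harness resolves its timeout (in seconds) from a system property named `jenkins.test.timeout` and falls back to 180 when it is unset. The property name is an assumption that should be checked against the harness version in use; the class and method names are hypothetical.

```java
// Sketch only: shows how a per-test timeout could be resolved from a system property.
// The property name is an assumption; the 180-second default comes from the report above.
final class TestTimeoutSketch {
    static final String PROPERTY = "jenkins.test.timeout"; // assumed property name
    static final int DEFAULT_SECONDS = 180;                 // default mentioned in the report

    // Returns the timeout in seconds, falling back to the default when the property is unset.
    static int resolveTimeoutSeconds() {
        return Integer.getInteger(PROPERTY, DEFAULT_SECONDS);
    }

    public static void main(String[] args) {
        System.out.println("Effective timeout: " + resolveTimeoutSeconds() + " s");
    }
}
```

Under that assumption, passing something like `-Djenkins.test.timeout=600` to the test JVM would correspond to the 10-minute timeout mentioned above.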

      [INFO]
      [INFO] Results:
      [INFO]
      [WARNING] Flakes:
      [WARNING] hudson.plugins.git.GitChangeSetBadArgsTest.testFindOrCreateEmptyCommitter(hudson.plugins.git.GitChangeSetBadArgsTest)
      [ERROR] Run 1: GitChangeSetBadArgsTest>Object.wait:502->Object.wait:-2 » TestTimedOut test ti...
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] hudson.plugins.git.GitPublisherTest.testMergeAndPushFF(hudson.plugins.git.GitPublisherTest)
      [ERROR] Run 1: GitPublisherTest>Object.wait:502->Object.wait:-2 » TestTimedOut test timed out...
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] hudson.plugins.git.GitPublisherTest.testMergeAndPushWithSystemEnvVar(hudson.plugins.git.GitPublisherTest)
      [ERROR] Run 1: GitPublisherTest>Object.wait:502->Object.wait:-2 » TestTimedOut test timed out...
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] hudson.plugins.git.GitSCMTest.testEmailCommitter(hudson.plugins.git.GitSCMTest)
      [ERROR] Run 1: GitSCMTest>Object.wait:502->Object.wait:-2 » TestTimedOut test timed out after...
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] hudson.plugins.git.GitStatusTest.testDoNotifyCommitWithTwoBranches(hudson.plugins.git.GitStatusTest)
      [ERROR] Run 1: GitStatusTest>Object.wait:502->Object.wait:-2 » TestTimedOut test timed out af...
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] jenkins.plugins.git.GitStepTest.multipleSCMs(jenkins.plugins.git.GitStepTest)
      [ERROR] Run 1: GitStepTest.multipleSCMs
      [INFO] Run 2: PASS
      [INFO]
      [WARNING] jenkins.plugins.git.GitStepTest.roundtrip(jenkins.plugins.git.GitStepTest)
      [ERROR] Run 1: GitStepTest.roundtrip:75 » TestTimedOut test timed out after 180 seconds
      [INFO] Run 2: PASS
      [INFO]
      [INFO]

       

      Interestingly, if you single out a test and run it manually, it does not time out, so it is difficult to debug or determine what causes the test to fail.

      I suspect that there is some sort of contention issue when running the tests, or that something within the test body sometimes takes too long, but I haven't been able to find the root cause.

      I'll try to update this with more information when I can catch specific tests failing.

        Activity

          markewaite Mark Waite added a comment -

          I've intentionally based it on 2.60.3 to bring it to Java 8 and still give as many users as possible the chance to update to the new plugin version when it releases, without requiring that they are on the latest Jenkins LTS.

          Once the JENKINS-48061 fix is complete, verified, and released, I'll merge the require-jdk-8 branch to master and deliver a beta release of git plugin 4.0.0. At that time, I'll re-evaluate the adoption numbers of various Jenkins releases so that I can follow the same pattern that Stephen Connolly is using with the credentials plugin.

          jekeller Jacob Keller added a comment -

          The test failures are occurring on the VMs; see: https://ci.jenkins.io/job/Plugins/job/git-plugin/job/PR-579/5/display/redirect

          How difficult would it be to get haveged installed on the Linux VM? That's what I got running on my local system to resolve the issue (that system has no local keyboard or mouse, and no hardware RNG source for rng-tools to use).

          I also verified that my original assumption that it wasn't fixed by 2.60.3 was wrong, so I think just upgrading to 2.60.3 and stopping tests on the old version would fix this as well.

          markewaite Mark Waite added a comment -

          jekeller, it appears that the JDK 8 changes that were included in the git plugin master branch on 14 May 2018 have resolved this issue, at least in the cases that I've checked. Can this bug be closed?

          jekeller Jacob Keller added a comment -

          Yes, depending on Jenkins 2.60.3 avoids the issue because the SSH server is no longer started. Will close.

          jekeller Jacob Keller added a comment -

          The root cause of this failure is that the Jenkins test harness launches the SSH server, which generates an SSH key. This key generation requires entropy, and Linux blocks when there isn't enough entropy available.

          In virtual machines, especially when running high test volumes, it is very easy to consume all of the available entropy, causing the tests to block when attempting to start the JenkinsRule. When the entropy took too long to replenish, the tests timed out, which showed up as random test failures.

          This was fixed/worked around by upgrading to a newer version of Jenkins which has the SSH server disabled by default, thus avoiding the root cause of key generation.
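
One way to check whether a build agent is affected by the entropy starvation described above is sketched below. It reads the Linux kernel's entropy estimate from `/proc/sys/kernel/random/entropy_avail` and times a key-pair generation. The class and method names are hypothetical, and RSA is an assumption chosen for illustration; the key type the harness's SSH server actually generates is not stated in this report.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.util.OptionalInt;

// Sketch only: a diagnostic probe, not part of the Jenkins test harness.
final class EntropyProbe {
    // Linux-specific file holding the kernel's current entropy estimate.
    static final Path ENTROPY_AVAIL = Paths.get("/proc/sys/kernel/random/entropy_avail");

    // Parses the file content; returns empty if it is not a plain integer.
    static OptionalInt parseEntropy(String raw) {
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Returns the kernel's entropy estimate, or empty on non-Linux systems or read errors.
    static OptionalInt availableEntropyBits() {
        try {
            if (!Files.isReadable(ENTROPY_AVAIL)) {
                return OptionalInt.empty();
            }
            byte[] bytes = Files.readAllBytes(ENTROPY_AVAIL);
            return parseEntropy(new String(bytes, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            return OptionalInt.empty();
        }
    }

    // Times one RSA key-pair generation (RSA is an assumption, see lead-in).
    static long timeKeyGenerationMillis(int bits) throws NoSuchAlgorithmException {
        long start = System.nanoTime();
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(bits);
        generator.generateKeyPair();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        OptionalInt entropy = availableEntropyBits();
        System.out.println("entropy_avail: "
                + (entropy.isPresent() ? String.valueOf(entropy.getAsInt()) : "unavailable"));
        System.out.println("RSA 2048 key generation took " + timeKeyGenerationMillis(2048) + " ms");
    }
}
```

Running the probe while the test suite is executing on the same machine might show whether a low entropy estimate coincides with slow key generation; whether the JVM actually blocks also depends on its configured random source.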


          People

            jekeller Jacob Keller