
Junit result archiver getting stuck for a long time in concurrent builds

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Component: junit-plugin
    • Environment: Jenkins 1.415, Hudson 1.393. Both on Fedora, Tomcat 6, x86_64.

      At the end of a build that records JUnit results, possibly only when the job is allowed to run concurrently, our system frequently gets stuck on "Recording test results".

      Looking at the thread list, I see the following:
      "Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241
      java.lang.Object.wait(Native Method)
      java.lang.Object.wait(Object.java:502)
      hudson.model.Run$Runner$CheckpointSet.waitForCheckPoint(Run.java:1266)
      hudson.model.Run.waitForCheckpoint(Run.java:1234)
      hudson.model.CheckPoint.block(CheckPoint.java:144)
      hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:159)
      hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:663)
      hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:638)
      hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:616)
      hudson.model.Build$RunnerImpl.post2(Build.java:161)
      hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:585)
      hudson.model.Run.run(Run.java:1399)
      hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      hudson.model.ResourceController.execute(ResourceController.java:88)
      hudson.model.Executor.run(Executor.java:145)
      Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241"

      All the stuck jobs are in the same place. They do eventually come unstuck, but can spend a long time (hours and sometimes a day or so) in this state.
      Machine load average is at 0.23 0.22 0.21.
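
The thread dump above shows build #242 parked in a wait until build #241 reaches its "JUnit result archiving" checkpoint. A minimal sketch of that pattern follows. It is not the Jenkins implementation: the class `CheckpointSketch`, the latch standing in for the checkpoint, and the millisecond sleeps standing in for test run times are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of two concurrent builds; NOT the Jenkins API.
class CheckpointSketch {

    /** Runs two overlapping "builds" and returns the order in which they archived results. */
    static List<Integer> run() {
        List<Integer> archiveOrder = Collections.synchronizedList(new ArrayList<>());
        // Stands in for build #241's "JUnit result archiving" checkpoint.
        CountDownLatch build241Archived = new CountDownLatch(1);

        // Build #241: long test phase, then archives and reports its checkpoint.
        Thread build241 = new Thread(() -> {
            pause(300);                       // stands in for a long soak test
            archiveOrder.add(241);
            build241Archived.countDown();     // analogous to reporting the checkpoint
        }, "build #241");

        // Build #242: short test phase, but it must wait for #241 before archiving.
        Thread build242 = new Thread(() -> {
            pause(20);                        // stands in for a short test run
            try {
                build241Archived.await();     // analogous to the blocked wait in the trace
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            archiveOrder.add(242);
        }, "build #242");

        build241.start();
        build242.start();
        try {
            build241.join();
            build242.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for builds", e);
        }
        return archiveOrder;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Archive order: " + run());
    }
}
```

Even though the second build finishes its tests first, it archives second, and it sits idle for the difference, which would match the near-zero load average reported above.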

          [JENKINS-10234] Junit result archiver getting stuck for a long time in concurrent builds

          Danny Staple created issue -

          Danny Staple added a comment -

          We now understand exactly what happens here - it may not be a bug but a "feature".
          When concurrent builds are started, a later build can finish before an earlier one, especially where parameterization affects the run time of a job. The later build then reaches the archive stage, where the JUnit analysis is done, while the earlier build is still running.
          Some of our tests take 20 minutes, and some 15 hours.

          JUnit then tries to work out regressions. The oldest build holds up the archiving of any newer ones, which wait until its results are available.
          We use junit as a convenient way to display results - but hadn't anticipated this behaviour.
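
To make the cost concrete, here is a rough worked example of the ordering rule described in this comment: each build can archive only once its own tests are done and the previous build has archived. The class name `ArchiveDelaySketch`, the start offsets, and the use of minutes are assumptions for illustration; only the 15-hour and 20-minute durations come from the comment.

```java
import java.util.Arrays;

// Hypothetical arithmetic model of the ordering constraint; not Jenkins code.
class ArchiveDelaySketch {

    /**
     * testsDone[i] is the minute at which build i finishes its tests,
     * with builds listed in build-number order.
     * Returns the minute at which each build can finish archiving, assuming
     * every build must wait for the previous build to archive first.
     */
    static long[] archiveTimes(long[] testsDone) {
        long[] archived = new long[testsDone.length];
        long previous = 0;
        for (int i = 0; i < testsDone.length; i++) {
            archived[i] = Math.max(testsDone[i], previous);
            previous = archived[i];
        }
        return archived;
    }

    public static void main(String[] args) {
        // Assumed schedule: build A starts at minute 0 and tests for 15 hours (900 min);
        // build B starts at minute 5 and tests for 20 minutes, so it is done at minute 25.
        long[] testsDone = {900, 25};
        long[] archived = archiveTimes(testsDone);
        System.out.println(Arrays.toString(archived));                               // [900, 900]
        System.out.println("B stuck for " + (archived[1] - testsDone[1]) + " min");  // B stuck for 875 min
    }
}
```

Under these assumptions the 20-minute build spends over 14 hours in "Recording test results", which is consistent with the hours-long stalls in the original report.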


          Danny Staple added a comment -

          The answer we are now considering is to find (or make) a way to disable the regression checking behaviour of junit - preferably as an option we can set per job, so that other jobs that are sequential and not concurrent, or that should consistently take the same time, can have it enabled.
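
One way to picture the per-job option proposed here is a flag that decides whether the archiving step waits for the previous build at all. The sketch below is purely hypothetical: `OptionalCheckpointSketch`, the `waitForPreviousBuild` flag, and the timeout parameter are invented for illustration and do not correspond to any existing Jenkins or junit-plugin setting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical archiver step with an opt-out flag; not part of Jenkins.
class OptionalCheckpointSketch {

    /**
     * @param waitForPreviousBuild invented per-job setting; false skips the wait
     *                             (and with it any comparison against the previous build)
     * @param previousArchived     stands in for the previous build's archiving checkpoint
     * @param timeoutMillis        upper bound on the wait, added so the sketch cannot hang
     * @return true if results were archived, false if the wait timed out or was interrupted
     */
    static boolean archive(boolean waitForPreviousBuild,
                           CountDownLatch previousArchived,
                           long timeoutMillis) {
        if (waitForPreviousBuild) {
            try {
                if (!previousArchived.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false; // previous build has not archived yet
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // Results would be recorded here.
        return true;
    }

    public static void main(String[] args) {
        CountDownLatch previous = new CountDownLatch(1); // previous build still running
        System.out.println(archive(false, previous, 50)); // concurrent job: does not wait
        System.out.println(archive(true, previous, 50));  // sequential job: waits, then times out
    }
}
```

Jobs that run sequentially, or that take a consistent time, would keep the flag enabled and retain regression checking; concurrent jobs with widely varying run times would turn it off.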


          Jonas Eriksson added a comment -

          We're also having this problem of different running times, as we sometimes skip a job in a chain of jobs.

          Danny, have you found any workaround for this?

          Temporarily disabling the checkpoint waiting on certain jobs would help us. If the JUnitResultArchiver were a plugin, it would simplify forking the feature.

          Danny Staple added a comment -

          We are in the process of removing everything that uses the concurrent build flag from our setup. It means more duplication and a violation of SPOT (single point of truth), but concurrent builds have led us to too many problems - including the far more serious JENKINS-10615. We are looking into the viability of the templatised job plugin to prevent us duplicating stuff, and have most of our control logic in an SCM, run within shell build steps.


          Jonas Eriksson added a comment -

          I've been playing around with Jenkins and found that even with JUnit reports disabled, another plugin I have (email ext) also waits for the earlier build to finish by using the checkpoint.

          I guess running concurrent parameterized builds in Jenkins does not fit how the model was implemented in the first place.

          Jonas Eriksson added a comment -

          If JUnit reporting is the only task holding you back from finishing a job when running concurrent builds, I've found that the xunit extension configured with Custom Tool avoids the checkpoint waiting problem.

          Inbar Rose added a comment -

          Same problem here. Total blocker. Task A starts, then task B starts. Task B reaches the 'Recording test results' stage and hangs until task A finishes. After testing with simple timed builds, with many plugins and options enabled and disabled, we concluded that JUnit is the problem.

          Inbar Rose made changes -
          Priority Original: Major [ 3 ] New: Blocker [ 1 ]

          kutzi added a comment -

          IMO this is definitely a feature and not a bug. If you don't like this behaviour, then use e.g. the xunit plugin which doesn't seem to behave in this way.


            jglick Jesse Glick
            dannystaple Danny Staple
            Votes: 15
            Watchers: 30