Type: Bug
Resolution: Fixed
Priority: Blocker
Jenkins 1.415, Hudson 1.393. Both on Fedora, Tomcat 6. x86_64.
At the end of a build with JUnit results, possibly only when the job is allowed to run concurrently, our system frequently gets stuck on "Recording test results".
Looking at the thread list, I see the following:
"Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
hudson.model.Run$Runner$CheckpointSet.waitForCheckPoint(Run.java:1266)
hudson.model.Run.waitForCheckpoint(Run.java:1234)
hudson.model.CheckPoint.block(CheckPoint.java:144)
hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:159)
hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:663)
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:638)
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:616)
hudson.model.Build$RunnerImpl.post2(Build.java:161)
hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:585)
hudson.model.Run.run(Run.java:1399)
hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
hudson.model.ResourceController.execute(ResourceController.java:88)
hudson.model.Executor.run(Executor.java:145)
Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241"
All the stuck jobs are in the same place. They do eventually come unstuck, but can spend a long time (hours and sometimes a day or so) in this state.
Machine load average is at 0.23 0.22 0.21.
is related to:
- JENKINS-9913 Not obvious why some post-build tasks enforce serial behavior even when builds are concurrent (Resolved)
- JENKINS-42727 Cppcheck plugin waiting for checkpoint erroneously when concurrent builds are enabled (Resolved)
- JENKINS-24450 JacocoPublisher serializes concurrent builds waiting for checkpoint (Closed)
[JENKINS-10234] Junit result archiver getting stuck for a long time in concurrent builds
Priority changed from Major [ 3 ] to Blocker [ 1 ].
We now understand exactly what happens here; it may not be a bug but a "feature".
When concurrent builds are started, a later build can finish before an earlier one, especially where parameterization affects the run time of a job. The later build then reaches the archive stage and starts the JUnit analysis while the earlier build is still running.
Some of our tests take 20 minutes, and some 15 hours.
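JUnit then tries to work out regressions, which presumably means comparing against the previous build's results. The oldest build therefore holds up archiving in any newer builds until its own results have been recorded, as the "waiting for Check point JUnit result archiving" thread above shows.
We use JUnit as a convenient way to display results, but hadn't anticipated this behaviour.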
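To make the size of the delay concrete, here is a toy model of the serialization described in this comment. It is only a sketch, not the Jenkins implementation: the class name `CheckpointModel` and the method `archiveTimes` are made up for illustration, and the 5-minute gap between the two build starts is an assumption. The 15-hour and 20-minute run times are the ones mentioned above. The only rule modelled is that a build cannot pass the archiving step before the previous build has.

```java
class CheckpointModel {
    /**
     * Toy model (hypothetical, not Jenkins code): build i starts at
     * startMin[i] and needs runMin[i] minutes to reach the archiving step.
     * It may pass that step only once build i-1 has passed it.
     * Returns the minute at which each build passes the archiving step.
     */
    public static long[] archiveTimes(long[] startMin, long[] runMin) {
        long[] done = new long[startMin.length];
        long previousDone = 0;
        for (int i = 0; i < startMin.length; i++) {
            long ready = startMin[i] + runMin[i];      // build reaches archiving
            done[i] = Math.max(ready, previousDone);   // wait for earlier build
            previousDone = done[i];
        }
        return done;
    }

    public static void main(String[] args) {
        // Earlier build: starts at minute 0, runs 15 hours (900 minutes).
        // Later build: starts at minute 5 (assumed), runs 20 minutes.
        long[] start = {0, 5};
        long[] run = {900, 20};
        long[] done = archiveTimes(start, run);

        long ready = start[1] + run[1];
        System.out.println("Later build ready at minute " + ready
                + ", archives at minute " + done[1]
                + ", waited " + (done[1] - ready) + " minutes");
    }
}
```

Under these assumptions the later build is ready after 25 minutes but waits 875 more minutes for the earlier one, which is consistent with the "hours and sometimes a day or so" we saw.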