Type: Bug
Resolution: Fixed
Priority: Blocker
Jenkins 1.415, Hudson 1.393. Both on Fedora, Tomcat 6. x86_64.
When a build with JUnit results reaches its end, possibly only when the job is allowed to run concurrently, our system frequently gets stuck on "Recording test results".
Looking at the thread list, I see the following:
"Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
hudson.model.Run$Runner$CheckpointSet.waitForCheckPoint(Run.java:1266)
hudson.model.Run.waitForCheckpoint(Run.java:1234)
hudson.model.CheckPoint.block(CheckPoint.java:144)
hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:159)
hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:663)
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:638)
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:616)
hudson.model.Build$RunnerImpl.post2(Build.java:161)
hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:585)
hudson.model.Run.run(Run.java:1399)
hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
hudson.model.ResourceController.execute(ResourceController.java:88)
hudson.model.Executor.run(Executor.java:145)
Executor #9 for master : executing Run_Manual_SOAK #242 : waiting for Check point JUnit result archiving on Run_Manual_SOAK #241"
All the stuck jobs are in the same place. They do eventually come unstuck, but can spend a long time (hours and sometimes a day or so) in this state.
Machine load average is at 0.23 0.22 0.21.
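For readers unfamiliar with the mechanism in the stack trace, the following is a minimal sketch of how a checkpoint wait of this kind behaves. It is a simplified model written for illustration, not the Jenkins implementation: the class `CheckpointSketch` and its methods `report` and `waitFor` are hypothetical names, and the timeout exists only so the demo terminates. The thread dump above shows the wait with no timeout.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of a per-job checkpoint (not Jenkins code). */
public class CheckpointSketch {
    // Build numbers that have already passed the checkpoint.
    private final Set<Integer> reported = new HashSet<>();

    /** Called by a build once it has passed the checkpoint. */
    public synchronized void report(int buildNumber) {
        reported.add(buildNumber);
        notifyAll(); // wake any later builds parked in waitFor()
    }

    /**
     * Blocks until the given earlier build has reported, or the timeout expires.
     * Returns true if the earlier build reported, false on timeout.
     */
    public synchronized boolean waitFor(int earlierBuild, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!reported.contains(earlierBuild)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            wait(remaining); // same primitive as Object.wait in the stack trace
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        CheckpointSketch checkpoint = new CheckpointSketch();
        // Build #242 reaches test archiving while #241 is still running.
        System.out.println(checkpoint.waitFor(241, 100)); // prints false
        // Build #241 passes the checkpoint; #242 can now proceed.
        checkpoint.report(241);
        System.out.println(checkpoint.waitFor(241, 100)); // prints true
    }
}
```

In this model a later build is held for as long as the earlier build takes to reach the same point. That would be consistent with executors sitting idle for hours while the machine load stays low.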
is related to:
- JENKINS-9913 Not obvious why some post-build tasks enforce serial behavior even when builds are concurrent (Resolved)
- JENKINS-42727 Cppcheck plugin waiting for checkpoint erroneously when concurrent builds are enabled (Resolved)
- JENKINS-24450 JacocoPublisher serializes concurrent builds waiting for checkpoint (Closed)
Since JENKINS-9913 covers only the reporting of checkpoints, this should be reopened: JUnitResultArchiver.CHECKPOINT still exists, and probably should not.

It needs to be determined whether anything must replace it. Without the checkpoint, a build with a higher number can finish before one with a lower number, so test regressions cannot be calculated accurately at the time the result is published (assuming anyone even cares about build-to-build diffs for a concurrent-capable job).

Until the earlier build finishes, will the later build’s test result display show any “regressions” (against the last completed build), never show regressions, or throw exceptions? After the earlier build finishes, will the later build’s display show regressions against the earlier build, against the last completed build at the time of its own completion, or do something else? In other words, are calls to getPreviousResult made on demand whenever a build-to-build diff is requested (great)? Made once when the build completes (not great but adequate)? Or does something really break?

My casual inspection of the code suggests that there is some improper caching (CaseResult.failedSince), but that the code generally defends against a prior build having no test result action, meaning that simply deleting CHECKPOINT would cause little harm.
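To make the "on demand" case concrete, here is a small sketch of what such a lookup could look like when builds finish out of order. It is an illustrative model only and does not answer what the JUnit code actually does. The class `RegressionSketch` and its methods are hypothetical, and it assumes a build with no published result is simply skipped when searching for a previous one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model of on-demand build-to-build test diffs (not Jenkins code). */
public class RegressionSketch {
    // Build number -> names of failed tests. A missing key means the build
    // has not published a test result yet.
    private final TreeMap<Integer, Set<String>> results = new TreeMap<>();

    public void publish(int build, Set<String> failedTests) {
        results.put(build, failedTests);
    }

    /**
     * Computed on every call: failures in this build that did not fail in the
     * nearest lower-numbered build that has a result right now.
     */
    public Set<String> regressions(int build) {
        Set<String> current = results.get(build);
        if (current == null) {
            return Collections.emptySet(); // defend against a missing result
        }
        Set<String> regressed = new TreeSet<>(current);
        Map.Entry<Integer, Set<String>> previous = results.lowerEntry(build);
        if (previous != null) {
            regressed.removeAll(previous.getValue());
        }
        return regressed;
    }

    public static void main(String[] args) {
        RegressionSketch job = new RegressionSketch();
        job.publish(240, new HashSet<>(Arrays.asList("testA")));
        // #242 finishes before #241 and is diffed against #240.
        job.publish(242, new HashSet<>(Arrays.asList("testA", "testB")));
        System.out.println(job.regressions(242)); // prints [testB]
        // #241 finishes later; the same query now diffs against #241.
        job.publish(241, new HashSet<>(Arrays.asList("testA", "testB")));
        System.out.println(job.regressions(242)); // prints []
    }
}
```

In this model the answer for the later build changes once the earlier build publishes its result. A value cached at completion time, as suspected of CaseResult.failedSince, would instead keep the first answer.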