Bug
Resolution: Fixed
Blocker
Jenkins 1.477.2
Master and Slaves Windows Server 2008 r2
(Also on Jenkins 1.488 Windows Server 2008)
We have recently noticed builds disappearing from the "Build History" listing on the project page. A developer was watching a build, waiting for it to complete, and said it disappeared after it finished. Nothing was noted in any of the logs concerning that build.
The data was still present on the disk, and doing a reload from disk brought the build back. We have other automated jobs that deploy these builds based on build number, so it is a pretty big issue in our environment.
We are not able to reproduce it at this point, but I still wanted to document what was happening.
I have seen other JIRA issues that look similar, but in those, jobs were disappearing after a restart or upgrade. That is not the case for us. The build disappears after completion, whether it succeeded or failed.
is duplicated by:
- JENKINS-16018 Not all builds show in the build history dashboard on job (Resolved)
- JENKINS-16735 All builds are gone since I copied an existing project (Resolved)
- JENKINS-15533 Envinject plugin incompatibility with Jenkins 1.485 (Resolved)
- JENKINS-15719 Builds and workspace disappear for jobs created after upgrade to 1.487 (Resolved)
- JENKINS-16117 sporadical unwanted suppression of build artifacts (Resolved)
- JENKINS-16175 Dashboard not showing job status (Resolved)
- JENKINS-15594 CopyArtifact plugin cannot always copy artifacts (Closed)

is related to:
- JENKINS-18678 Builds disappear some time after renaming job (Resolved)
- JENKINS-21268 No build.xml is created when warnings plugin is used in combination with deactivated maven plugin (Resolved)
- JENKINS-8754 ROADMAP: Improve Start-up Time (Closed)
- JENKINS-16845 NullPointer in getPreviousBuild (Resolved)
- JENKINS-17265 Builds disappearing from history (Resolved)
- JENKINS-23130 nextBuildNumber keeps being set to previous numbers (Resolved)
- JENKINS-23152 builds getting lost due to GerritTrigger (Resolved)
[JENKINS-15156] Builds disappear from build history after completion
Hi, it is still reproducible for me in Jenkins ver. 1.511. It happens for jobs that were copied from existing ones. I think JENKINS-8754 is not related.
Reopening.
Jesse, it still happens; see my separate comment in the thread. Do you insist that a new issue has to be filed here?
@vladichko yes, see previous comments about filing fresh issues with steps to reproduce or other analysis. There was a documented bug here, which was fixed; there may be other bugs with superficially similar symptoms, but it does no good to dump them all in the same issue.
@vladichko I meant that a problem was observed in a plugin due to changes in core, and the fix for this problem was made in that plugin.
Hello,
I'm using Jenkins v 1.517 and I'm experiencing a very similar (if not the same) problem. All builds are visible on the master machine, but older builds, from time to time, simply disappear from the list of builds on the slave machine. Is this problem fixed in the current or an upcoming version?
Please see the attached screenshot (disappeared history of builds on slave machine).
@odklizec your problem looks unrelated: matrix configuration builds getting deleted from disk (not just sporadically failing to appear in the web UI). As above, if you can figure out under which conditions this happens, file a separate bug report.
I'm currently facing this issue with version 1.526. A workaround is to go to "Manage Jenkins" and click the "Reload Configuration from Disk" link.
Also experiencing this issue since at least 1.505. Build history disappears for certain builds, seems to be most prevalent in cloned builds. "Reload configuration" does not work for me. Currently running 1.526 and issue remains. Cloned build plans are susceptible to this. "Fix" seems to be to avoid use of clone to create a new build plan, and to create a new build from scratch each time. This is not good, since our builds contain many common steps, and manual re-entry of information for a new build can create new errors in the plan. Interestingly, if a new build plan is created with the same information as the original cloned plan and the new build plan is given the same name as the original "lossy" cloned build plan, the all-new build plan inherits the history disappearance problem as well! There may be a problem with build plan attribute inheritance when a cloned build plan is created.
I really don't know why this issue is marked as resolved.
When I click on Build History in the left Jenkins menu bar, no built jobs are displayed.
@ntshako: there are numerous underlying problems that could produce this general visible symptom, and these would get tracked as separate bug reports. I cannot guess what the problem is in your case; we are incrementally fixing issues and adding diagnostics, but in general a Jenkins developer needs to analyze your system to diagnose.
Why not just keep this as a parent issue, with specific conditions as child issues?
Setting an issue to "Resolved" when it clearly is not is unhelpful for everyone, no matter what the rationale. They're called "issues" because they track issues.
I'm more than a little frustrated at running into many, many absolutely catastrophic, show-stopping bugs in every recent Jenkins release, only to find them already reported in Jira and marked "Resolved", when nothing could be further from the truth.
If you won't fix a bug, please at least have the courtesy to your users of flagging the bug Won't Fix rather than lie and call it Resolved.
Why not just create them and link them to this issue as a parent?
Please go ahead. It is better for the user encountering the issue to actually file the report though.
I'm wondering how we (users who don't know internals of Jenkins) are supposed to know if our "build history disappears" problem (symptom) is a result of #15156, #17265 or of any other issue with the similar title and description… This applies to other "symptoms" as well.
@binary: if you do not know any internals of Jenkins, you generally cannot know this. All you can know for sure is that if you are running a build newer than the one with a particular fix, such as JENKINS-17125 (1.509.3/1.519), then your symptoms are not a result of the known bug (error-prone FingerprintAction serial form in build.xml in that case).
If you know how to reproduce the problem from scratch, then it does not matter what you know about Jenkins internals; just file a bug report with those steps and let someone else figure out what is going on inside and what relation this might have to previously filed reports or recent changes. Otherwise, filing an issue report is of limited value, which is why paid support exists: there are many more users with unsolved problems than there are volunteers sifting through reports and trying to guess what happened and collate all the data. Sometimes it is possible to immediately diagnose a mistake in code based on seeing the symptom (this is often true of exceptions with stack traces), but sometimes diagnosis without reproducibility would require detailed investigation (as with missing builds).
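The version check described above can be sketched roughly as follows. This is a minimal Python illustration, not anything shipped with Jenkins: the function names are hypothetical, and it assumes the 1.x numbering used in this thread, where an LTS release adds a third component (such as 1.509.3) to the weekly release it was branched from.

```python
def parse_version(text):
    """Turn a version string such as '1.509.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(running, lts_fix, weekly_fix):
    """Return True if `running` is at least as new as the fix release.

    Assumption: a three-part version on the same line as `lts_fix`
    is compared against the LTS fix; everything else is compared
    against the weekly fix release.
    """
    r = parse_version(running)
    lts = parse_version(lts_fix)
    weekly = parse_version(weekly_fix)
    if len(r) > 2 and r[:2] == lts[:2]:
        # Same LTS line, e.g. 1.509.x compared against 1.509.3.
        return r >= lts
    return r >= weekly


if __name__ == "__main__":
    # Fix versions for JENKINS-17125 as quoted in the comment above.
    for version in ("1.509.2", "1.509.4", "1.517", "1.526"):
        print(version, has_fix(version, "1.509.3", "1.519"))
```

A True result only rules out that one known bug as the cause; it says nothing about other bugs with a similar symptom.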
I have the same problem on 1.523; reloading the configuration from disk makes the missing builds reappear. Is there an easy way to check against the remaining open issues to know which one in particular is affecting us? The job is a cloned job, but I don't have specific steps to reproduce the problem.
I have the same issue on 1.526 on Win2k12. Builds disappear from several jobs almost hourly. Reloading the configuration from disk usually brings some of them back, but not all. It's nearly becoming a blocking issue, as jobs are 'losing' their entire build history multiple times a day.
For all the recent posters, please note that the issue is considered as resolved by assignee. He insists on opening a new one if the issue is still relevant.
Code changed in jenkins
User: Jesse Glick
Path:
src/main/java/hudson/maven/MavenModuleSetBuild.java
http://jenkins-ci.org/commit/maven-plugin/dc2215b85d3c1a2e1a4f3ba7b542c2bbc6d41776
Log:
JENKINS-15156 Found a problem with uninitialized run maps in new Maven modules.
Not observed in actual usage, but reproducible (for me at least, though apparently not ci.jenkins-ci.org) in a test:
java.lang.AssertionError: null
at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:628)
at jenkins.model.lazy.AbstractLazyLoadRunMap.all(AbstractLazyLoadRunMap.java:581)
at jenkins.model.lazy.AbstractLazyLoadRunMap.entrySet(AbstractLazyLoadRunMap.java:243)
at java.util.AbstractMap$2$1.<init>(AbstractMap.java:378)
at java.util.AbstractMap$2.iterator(AbstractMap.java:377)
at hudson.util.RunList.iterator(RunList.java:103)
at hudson.util.RunList.size(RunList.java:114)
at hudson.maven.MavenProjectTest.testDeleteSetBuildDeletesModuleBuilds(MavenProjectTest.java:159)
Originally-Committed-As: 09c7cf6ad7cfb4d88d6d8936f29b13f3ca187875
Jesse, was this released? In what version?
I still observe it in 1.541
I still see it in 1.541
Builds just disappear about once a week. Reloading the configuration helps.
@vladichko then you are seeing some other bug with a similar symptom. If you know how to reproduce it, then it can be fixed. Otherwise, maybe, maybe not.
Code changed in jenkins
User: Jesse Glick
Path:
test/src/test/groovy/hudson/model/AbstractProjectTest.groovy
http://jenkins-ci.org/commit/jenkins/85e9e126773c0bb20a8529a2e6591dde17d7e209
Log:
JENKINS-10615 AbstractProjectTest.testWorkspaceLock frequently fails on jenkins.ci due to InterruptedException in HudsonTestCase.setUp.
Possibly because it is sorted after JENKINS-15156 testGetBuildAfterGC and the test suite times out.
Is this issue fixed? We are having the same issue on Jenkins ver. 1.509.4 LTS. Is there a fix for the issue?
@arvindramalingam you mean you are experiencing a different bug with a similar symptom. Without knowing how to reproduce from scratch we cannot help. See comments above.
Hi Jesse,
Thanks for the quick response. We have renamed the jobs and pointed them to a different release, and the build history keeps disappearing every day. If there is a fix for the issue, can you please send it to me so that I can build the war file with the change?
I see the builds on the server but not in the UI. The trend in the UI also does not show the history.
I'm having something similar; a build disappears at some point after completion, sometimes even during the build.
This is Jenkins 1.554.1 on Java:
java version "1.6.0_30"
OpenJDK Runtime Environment (IcedTea6 1.13.3) (rhel-5.1.13.3.el5_10-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
It affects both jobs that we generate with the Job DSL and jobs that are edited manually in the normal way.
We can "work around" the problem (without having to reload the whole config from disk, which has in the past caused problems with plugins like GerritTrigger) by doing one of the following:
1. Click 'configure' then immediately click 'save'. The missing builds sometimes reappear. (This worked in previous versions of Jenkins, but apparently does not in 1.554.1.)
2. If we're lucky, we can click 'trends' and the build will be there. We click on it and the missing builds magically reappear.
I've noticed that some jobs behave badly (e.g. fail with weird messages from Jenkins) after updating plugins until I go in and click 'configure' and then 'save' immediately (without changing anything). I've looked at the config.xml and it does change, sometimes new elements get added, or plugin versions are bumped. I'm unsure if the config.xml changing is part of the problem/work-around or if saving just causes Jenkins to rescan things.
It's not reproducible, but if someone can suggest things they'd like us to record/look-at when it happens again I'd be happy to do it. I'm not a Java debugging expert, but I can pull some in.
EDITED: Fixed my using 'job' when I meant 'build'.
job disappears at some point after completion
The jobs sometimes reappear
This issue is about builds disappearing. Are you confusing terminology, or are you experiencing completely unrelated symptoms?
Ugh.
I'm sorry danielbeck, I meant 'build'. I'll re-edit that comment to be correct.
I have written a Ruby tool for analyzing and fixing build history problems...
https://github.com/docwhat/jenkins-job-checker
If you run jobber.rb with --solve it'll try to fix problems; otherwise it just prints out how it would have solved the problem (in addition to a description of the problem(s)).
The cases that look the most interesting to me are builds that are :STOLEN (e.g. two date-directory's build.xml files have the same <number>) and :NEXT (e.g. the nextBuildNumber is a number that has already been used).
I suspect these are related to the missing builds, because the builds that disappear come after the duplicate builds. This includes jobs that the "Some projects have builds whose timestamps are inconsistent. These will confuse Jenkins when it tries to look up build records." thingy doesn't catch!
I also notice the "Some projects have builds whose timestamps are inconsistent." message doesn't re-check when the job and build history is re-read from disk or when jobs are pruned due to being old.
If this duplication isn't part of the problem, I apologize for clouding the issue. If it is part of the problem, then JENKINS-11853 is probably related to this bug too.
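To make the two cases concrete, here is a minimal Python sketch of the kind of check described above. It is not the jenkins-job-checker implementation; the function name and report keys are hypothetical. It assumes the layout mentioned in this thread (date-named directories under `builds/`, each with a `build.xml` containing a `<number>` element) and, as a further assumption, a `nextBuildNumber` file in the job directory.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: build.xml records the build number as <number>N</number>.
NUMBER_RE = re.compile(r"<number>(\d+)</number>")


def scan_job(job_dir):
    """Report duplicate build numbers and a reused nextBuildNumber."""
    job_dir = Path(job_dir)
    dirs_by_number = defaultdict(list)
    for build_xml in sorted(job_dir.glob("builds/*/build.xml")):
        if build_xml.parent.is_symlink():
            # Numbered symlinks point at date directories; skip them
            # so that each build is counted only once.
            continue
        text = build_xml.read_text(encoding="utf-8", errors="replace")
        match = NUMBER_RE.search(text)
        if match:
            dirs_by_number[int(match.group(1))].append(build_xml.parent.name)

    # STOLEN: more than one date directory claims the same number.
    stolen = {n: d for n, d in dirs_by_number.items() if len(d) > 1}

    # NEXT: nextBuildNumber is not above every number already on disk.
    next_file = job_dir / "nextBuildNumber"
    next_number = int(next_file.read_text().strip()) if next_file.is_file() else None
    highest = max(dirs_by_number, default=0)
    next_reused = next_number is not None and next_number <= highest

    return {
        "stolen": stolen,
        "next_number": next_number,
        "highest": highest,
        "next_reused": next_reused,
    }
```

The sketch only reads files and reports; it does not relink or archive anything, so it can be run against a copy of a job directory without risk.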
A nice feature for troubleshooting this would be an option to only load a specific job from disk instead of everything. Assuming that's possible...
Christian:
I also notice the "Some projects have builds whose timestamps are inconsistent." message doesn't re-check when the job and build history is re-read from disk or when jobs are pruned due to being old.
It's an independent deliberately slow process only loading one build per 10 seconds to not be too expensive in terms of disk IO.
The cases that look the most interesting to me are builds that are :STOLEN (e.g. two date-directory's build.xml files have the same <number>)
Detecting two builds with same number is covered by the OutOfOrderBuildMonitor (JENKINS-22631) in 1.561+ and 1.554.1+.
I had the same problem a few times over the last 8 months or so. It is definitely a different issue from the one reported here, as no two builds with the same number will be loaded by Jenkins – so the 'Reload from disk' cannot help here.
I suggest you open a new issue and link it to this one.
So we just got more builds missing from the history that are well formed (e.g. my checker shows no problem and I can't see any problems from examining things).
So I guess the duplicate builds (aka JENKINS-23130) are not causing this problem.
Actually JENKINS-23130 is likely a symptom of this problem.
We use GerritTrigger a lot, I'm wondering if it has something to do with this.
Anyway, we just watched this happen in the wild...
We had build 482 at 3:04pm.
We then kicked off builds 483, 484, 485 at 4:09am, 4:12am, and 4:28 all submitted by Gerrit Trigger.
483, 484, and 485 weren't showing up in the Jenkins Web UI but they were well formed on disk.
We then did a "Query and Trigger Gerrit Patches" and a new 483 was kicked off (stealing the build number from the previous 483).
Somehow the RunMap lost track of the previous 483, 484, and 485.
And something between 485 and "Query and Trigger Gerrit Patches" caused nextBuildNumber to be lost (going back to 483, matching the Jenkins UI only showing up to build 482).
So I just used the debugger to "fix" a nextBuildNumber inside assignBuildNumber on the fly...
on disk was ... 55, 56, 57 – However, the Jenkins Web UI showed only 55.
I changed the nextBuildNumber to 58, and now the Web UI is showing 55 and 58.
I'm beginning to think something is monkeying around with the list of builds (aka RunMap)...
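The situation above (builds present on disk but missing from the Web UI, with nextBuildNumber pointing at a number already used) can be expressed as a small comparison. This Python sketch is purely illustrative: the function name is hypothetical, and the two lists of build numbers would have to be collected by hand or by a separate script.

```python
def compare_history(on_disk, in_ui):
    """Compare build numbers found on disk with those shown in the UI.

    Returns the numbers missing from the UI and the lowest
    nextBuildNumber that would not collide with any known build.
    """
    disk = set(on_disk)
    ui = set(in_ui)
    missing_from_ui = sorted(disk - ui)
    safe_next = max(disk | ui, default=0) + 1
    return missing_from_ui, safe_next


if __name__ == "__main__":
    # Numbers from the comment above: 55, 56, 57 on disk, only 55 in the UI.
    missing, safe_next = compare_history([55, 56, 57], [55])
    print("missing from UI:", missing)
    print("safe nextBuildNumber:", safe_next)
```

For the numbers quoted above this reports 56 and 57 as missing and 58 as the next safe number, which matches the value set in the debugger.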
On IRC schristou said that he tracked this down to GerritTrigger...
Specifically, his steps to reproduce are to "Reload Configuration from Disk" and then kick off a Gerrit build.
The nextBuildNumber was new to GerritTrigger author rsandell; "It could have something to do with cancel previous patchsets, but I'm just guessing".
I'm also fairly certain it's happening to us even without "Reload Configuration from Disk" because we normally don't use that (we've been using Jenkins since the days when build info would be lost and are afraid of it).
rsandell mentioned that core not sending a start/stop signal to the triggers when a reload from disk is performed makes it very hard for him to make GerritTrigger behave better.
Guys, I also see it with Gerrit builds. There is no one assigned to this.
It appears this is a known issue with Gerrit Trigger. See yesterday's IRC log, 23:05-23:25.
Assigned to Robert Sandell (id: rsandell) since these seem to be Gerrit plugin issues.
Please open a new JIRA issue to track the Gerrit Trigger issue. This is definitely a different issue to the original bug although it has similar symptoms.
It gets too confusing if we overload one issue that has been resolved for over a year.
Okay, I have a case that isn't GerritTrigger related. The job uses just Git.
I get this output from my job checker:
myjob-release:
Problem: STOLEN: The date build myjob-release/builds/2014-05-23_09-21-18 had its number stolen by myjob-release/builds/4113 -> 2014-05-23_09-36-18
Problem: STOLEN: The date build myjob-release/builds/2014-05-23_09-21-18 had its number stolen by myjob-release/builds/4113 -> 2014-05-23_09-36-18
Proposal: Relink 4113 to myjob-release/builds/2014-05-23_09-21-18
Proposal: Archive newer build myjob-release/builds/2014-05-23_09-36-18
Proposal: Relink 4113 to myjob-release/builds/2014-05-23_09-21-18
Proposal: Archive newer build myjob-release/builds/2014-05-23_09-36-18
So something else is holding on to an old job reference someplace.
I suggest logging that as a new issue and linking it to this one. Irrespective of whether or not it is related the comments trail/information is just too big/complex on this issue.
Many issues could have similar symptoms. The root cause of the original issue has been fixed. Please log any further occurrences in a new issue.
This bug is really annoying. I don't know what the root cause was here or how many bugs have the same symptom. It also happens in Jenkins 1.570 and 1.571 on Windows Server 2008. Builds run, and after a random time they disappear; the symbolic links go away but the builds are still on disk...
Similar symptoms of "disappearing builds", but different root cause.
Code changed in jenkins
User: Jesse Glick
Path:
test/src/test/groovy/hudson/model/AbstractProjectTest.groovy
http://jenkins-ci.org/commit/jenkins/389a565de417170f586830ee9fa7a7ec9749fc68
Log:
JENKINS-10615 AbstractProjectTest.testWorkspaceLock frequently fails on jenkins.ci due to InterruptedException in HudsonTestCase.setUp.
Possibly because it is sorted after JENKINS-15156 testGetBuildAfterGC and the test suite times out.
(cherry picked from commit 85e9e126773c0bb20a8529a2e6591dde17d7e209)
I recently upgraded from Jenkins 1.540 to the latest release, 1.605. After the upgrade I noticed that the build history for all the Jenkins jobs was gone. I currently have close to 350 jobs, and losing history is a real pain. I had to downgrade to 1.540 to restore the environment.
Update: Tried the "Reload Configuration" option but no luck.
After upgrade noticed build history for all the Jenkins jobs were gone
Completely unrelated issue.
Hi,
I am facing the same problem on Jenkins 1.555 on Windows. Sometimes a build history item disappears while the build is running, leaving the job unfinished.
So I am wondering: in which version is this bug fixed? Can I avoid this problem by upgrading to a new version?
Code changed in jenkins
User: Johno Crawford
Path:
test/src/main/java/org/jvnet/hudson/test/MemoryAssert.java
http://jenkins-ci.org/commit/jenkins-test-harness/6856f5db18f4295043438e4018e95bd454bdd0ca
Log:
JENKINS-15156 Refactored fix.
Originally-Committed-As: fe95fc53f891542e63e24a64c0b7add60ac91c64
Code changed in jenkins
User: Jesse Glick
Path:
test/src/main/java/org/jvnet/hudson/test/MemoryAssert.java
http://jenkins-ci.org/commit/jenkins-test-harness/9933530be8674c316f6be0fafc71c9239c0755f7
Log:
JENKINS-15156 Class loading errors can result from OOME if you are not careful.
Originally-Committed-As: 74e6409a06531144316f2e36bdc0b069de52b94d
@pedroreis rather than reopening an issue with a known fix, which makes it very hard to track what version of Jenkins has which fix, please file separate bug reports (linked to JENKINS-8754) with details to reproduce the problem. Without a known way to reproduce, or at least some error messages in the log etc., it is impossible to diagnose anything.