Status: Resolved
Linux host; Linux, OSX, and Windows slaves. Jenkins version 1.602.
The problem is as described in JENKINS-4501. As requested in JENKINS-4501, I am creating a new issue as this problem still exists in 1.602.
In short, for matrix projects, Jenkins silently and erroneously deletes workspaces on slaves even though those workspaces are not old.
Over the time I've worked with Jenkins, this behavior has cost literally days of work and waiting on very long-running builds that rely on cached workspaces to be manageable. It cost me more hours again today after restoring Jenkins to a new server following a hardware failure. The setting was reset because it lives outside the normally recommended backup files, and I didn't think to add it when I "fixed" this last time.
Would it not be easier to have hudson.model.WorkspaceCleanupThread.disabled default to true? Having the default behavior be "destroy my data" seems bad, especially with how cheap disk is now. I'm sure this option made a lot of sense when it was implemented, but when I can get a 1TB disk for $50, it just seems wrong-headed. Let the fallow workspaces lie. I can clean them up if I need to.
If that's not an acceptable solution, could the setting be moved to a config location in the Jenkins home? That way we can be relatively sure that it will be propagated in backups and will not bite someone who thought they had solved this problem and then forgot about it.
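To illustrate why the cleanup runs unless someone opts out, here is a minimal sketch. It assumes the flag is read as an ordinary JVM boolean system property, passed as -Dhudson.model.WorkspaceCleanupThread.disabled=true on the Java command line that starts Jenkins. The class and method names (CleanupSwitch, isDisabled) are hypothetical and this is not the Jenkins source; only the property name comes from this report.

```java
import java.util.Properties;

/** Sketch: how a boolean "disabled" system property behaves when it is unset. */
final class CleanupSwitch {
    // Property name taken from the report above.
    static final String DISABLED_PROPERTY = "hudson.model.WorkspaceCleanupThread.disabled";

    /**
     * Same semantics as Boolean.getBoolean, but against an explicit Properties
     * object so the behaviour can be checked without touching the live JVM.
     * A missing property yields null, and Boolean.parseBoolean(null) is false.
     */
    static boolean isDisabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(DISABLED_PROPERTY));
    }

    public static void main(String[] args) {
        // Reads the live JVM property; false unless -D...disabled=true was passed.
        System.out.println("cleanup disabled: " + Boolean.getBoolean(DISABLED_PROPERTY));
    }
}
```

Because an unset boolean system property reads as false, a server restored without its original JVM arguments would fall back to cleanup being enabled, which matches the experience described above.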
- is duplicated by
JENKINS-30916 workspace being deleted mid-build on slave
- relates to
JENKINS-51724 WorkspaceCleanupThread may delete workspaces for parallel-running AbstractProject builds
- links to
Code changed in jenkins
User: Reinhold Füreder
JENKINS-27329 Less aggressive WorkspaceCleanupThread (#3444)
I dare to claim that the default behaviour of WorkspaceCleanupThread is too aggressive => this little change is by no means perfect (or admittedly even far from perfect), but IMHO a saner or slightly more defensive default behaviour.
Mind that according to https://github.com/jenkinsci/jenkins/blob/9e64bcdcb4a2cf12d59dfa334e09ffb448d361e9/core/src/main/java/hudson/model/Job.java#L301 this "only" checks whether or not the last build of a job is in progress, while the JavaDoc says "Returns true if a build of this project is in progress." (cf. http://javadoc.jenkins-ci.org/hudson/model/Job.html#isBuilding--)
- Fix compilation
- Dummy commit to trigger pipeline
Previous pipeline execution (https://ci.jenkins.io/blue/organizations/jenkins/Core%2Fjenkins/detail/PR-3444/2/tests) failed with one failing test that at first glance appears to be unrelated to my change(s) and looks like a flaky test?
- Add fine logging message
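The isBuilding() caveat in the commit message above can be illustrated with a small, self-contained sketch. The Build class and both helper methods are hypothetical stand-ins, not the Jenkins Job or Run API; they only contrast "the last build is running" with "any build is running".

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model; not the Jenkins Job/Run classes. */
final class BuildCheckSketch {
    /** A build with a number and a running flag (illustrative only). */
    static final class Build {
        final int number;
        final boolean running;

        Build(int number, boolean running) {
            this.number = number;
            this.running = running;
        }
    }

    /** Looks only at the newest build, as the commit message says isBuilding() does. */
    static boolean lastBuildRunning(List<Build> buildsOldestFirst) {
        if (buildsOldestFirst.isEmpty()) {
            return false;
        }
        return buildsOldestFirst.get(buildsOldestFirst.size() - 1).running;
    }

    /** Looks at every build, which is what the JavaDoc wording suggests. */
    static boolean anyBuildRunning(List<Build> builds) {
        return builds.stream().anyMatch(b -> b.running);
    }

    public static void main(String[] args) {
        // Build 41 is still running while the newer build 42 has already finished.
        List<Build> builds = Arrays.asList(new Build(41, true), new Build(42, false));
        System.out.println("last build running: " + lastBuildRunning(builds));
        System.out.println("any build running:  " + anyBuildRunning(builds));
    }
}
```

In this model, a cleanup guard based only on the last build would treat the job as idle while build 41 is still using its workspace. That is the kind of gap the commit message hints at for builds running in parallel.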
Fix has been applied in 2.125. IMHO the fix is not complete for parallel AbstractProject builds, but it is better than nothing. I will create a follow-up ticket.
danielbeck, this is marked as an RFE in the changelog, but I think it is a bug. Would you agree if I recategorize it?
Also seeing this happen recently, on version 2.60.3.