Type: Bug
Resolution: Fixed
Priority: Blocker
Platform: All, OS: All
Released As: 2.0.21
Removal of a job leaves the workspace intact.
causes:
- JENKINS-60451 "JENKINS-2111" warning gets logged for master with 0 executors (Fixed but Unreleased)

is duplicated by:
- JENKINS-44360 '%' in branch name causes GitHub multi-branch job failures (Open)
- JENKINS-34177 Delete workspaces when deleting a job (Resolved)
- JENKINS-11046 Project deletion does not wipe out project workspaces (Resolved)
- JENKINS-36264 Deleted builds do not delete workspace (Resolved)
- JENKINS-40930 WorkspaceCleanupThread doesn't work properly with Pipeline jobs (Resolved)
- JENKINS-51897 Delete work space when PR closed. (Resolved)

is related to:
- JENKINS-22240 Workspace folder not renamed/deleted when renaming job (Open)
- JENKINS-60969 @libs pseudo-workspace collision due to branch name truncation (In Progress)

relates to:
- JENKINS-54654 A recent update breaks builds by escaping slashes to percent signs in workspace paths (Resolved)
- JENKINS-26471 Split WorkspaceCleanupThread from core (Open)
- JENKINS-30148 Allocate shorter workspace if it will be too long for reasonable use inside build (Open)
- JENKINS-38706 Workspace directory names mangled in multibranch pipeline (Resolved)
- JENKINS-54640 Workspace folders are not unique (Closed)
- JENKINS-67836 Pipeline: Groovy Plugin [SECURITY-2463] excessive path length (Open)
- JENKINS-34564 Give the ability to choose how the multibranch subprojects will be named. (Resolved)

links to:
[JENKINS-2111] removing a job (including multibranch/org folder branches/repos) does not remove the workspace
Re: "feasible"
Would it also be feasible to have all (online) slaves (attempt to) rename their workspace when a job is renamed?
I think that having all (online) slaves (attempt to) delete/rename the workspace belonging to a job that is being deleted/renamed would be a good idea.
Jenkins already has a web-ui page asking "are you sure" when you ask it to delete/rename a job, so this page could also be the place where the question is asked about whether or not Jenkins should attempt to delete/rename the matching workspaces.
Note: Any code that's iterating over multiple slaves, asking them to delete/rename workspace folders, MUST NOT STOP if one slave declares a failure. We have (far too many) Windows-based slaves, and filesystem operations on Windows aren't reliable: the OS can decide (at any point, albeit briefly) that a file is locked, causing a deletion/rename to fail because a "file is in use". We've learned the hard way that if you want something to "just work", you need to cope with filesystem "failures" (because they're often not reporting a fatal problem).
i.e. the loop going over all slaves should ask them to try, but shouldn't stop asking the 3rd slave just because the 2nd failed - this "tidy up" operation should be on a "best effort" basis, not a "succeed or die trying" basis.
I would also suggest that any operation that's being done "on all slaves" should run in multiple threads, rather than asking each slave in turn.
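A script-console sketch of that best-effort, multi-threaded sweep (the job name is hypothetical, and this only covers default workspace locations, not custom ones):

import hudson.model.TopLevelItem
import jenkins.model.Jenkins

def item = Jenkins.instance.getItemByFullName('my-job', TopLevelItem) // hypothetical name
if (item != null) {
    def threads = Jenkins.instance.nodes.collect { node ->
        Thread.start {
            try {
                // getWorkspaceFor returns null when the agent is disconnected
                node.getWorkspaceFor(item)?.deleteRecursive()
            } catch (Exception e) {
                // Best effort: note the failure and keep going to the next slave.
                println "Could not clean ${node.nodeName}: ${e.message}"
            }
        }
    }
    threads*.join()
}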
From the user's side being asked to delete/rename workspace when deleting/renaming a job looks quite fine. The options on job deletion could be:
- delete altogether
- leave intact
And on renaming:
- rename to the job's new name
- use the old workspace (the "Use custom workspace" option)
- delete it and create a new workspace with the job's new name
- leave it intact and create a new workspace with the job's new name
Not every user is able to operate directly in the filesystem of a particular slave (either by ssh or directly). Maybe such users should not be able to leave a workspace intact, because otherwise they would waste storage space, which then will have to be cleaned by somebody else, like a sysadmin. (But there's a workaround with creating a new job with the old name, "catching" the workspace and wiping it.)
There is a major exception: workspaces shared by several jobs. They should be detected, and only options 2 and 4 should be available. Maybe the whole feature/bug of not auto-renaming exists to support reused workspaces?
If a node is offline, it may be useful to allow only "use custom" and "leave intact". The user will be able to explicitly rename, delete or set up the new workspace when the node is online again.
For me, the workdirs on the Jenkins master (no slaves) are never deleted, no matter whether I delete them via REST or from the GUI. Our Jenkins version is 1.596.2. What am I doing wrong? All workdirs are stored under /opt/.jenkins/workspace. Is that a non-default directory?
With the multi-branch/pipeline setup that is getting popular now, this really is a MUST in order to handle disk usage well
We are also experiencing this. It seems to cause issues at the most inconvenient time. (Of course, when would it ever be convenient?) Please give this more priority.
THANKS!
From a jglick comment over on JENKINS-34177:
I think this is actually a more general core issue: Job.delete (or some associated ItemListener.onDeleted) should proactively delete any associated workspaces it can find on any connected nodes. WorkspaceCleanupThread as currently implemented is not going to find them.
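For illustration only, a plugin-style Groovy sketch of that listener idea (the class name is invented, and this is not the change that eventually shipped):

import hudson.Extension
import hudson.model.Item
import hudson.model.TopLevelItem
import hudson.model.listeners.ItemListener
import jenkins.model.Jenkins

@Extension
class WorkspaceDeletingListener extends ItemListener {
    @Override
    void onDeleted(Item item) {
        if (!(item instanceof TopLevelItem)) {
            return
        }
        Jenkins.get().nodes.each { node ->
            try {
                // Null when the agent is offline; those workspaces are simply missed.
                node.getWorkspaceFor((TopLevelItem) item)?.deleteRecursive()
            } catch (Exception ignored) {
                // Best effort per node: one failure must not abort the sweep.
            }
        }
    }
}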
We have written a script and scheduled it with cron:
#TODO handle folders with spaces
IFS=$'\n'
# find empty job directories & form the new folder structure for workspace directories
emptydirs_jobs=$(find . -type d -empty | cut -d '/' -f2-6)
emptydirs_workspace=$(find . -type d -empty | cut -d '/' -f2,4,6)
# remove the corresponding directory from workspace
for i in $emptydirs_workspace; do
    rm -rf /var/jenkins_home/workspace/$i
done
# remove empty directories from jobs
for j in $emptydirs_jobs; do
    rm -rf /var/jenkins_home/jobs/$j
done
I hope this can help you until a fix is provided.
jglick Feels to me like Job.performDelete may be more of the right place to do this?
Also, blergh, finding all the workspaces for a Pipeline job is...hard. node.getWorkspaceFor isn't useful here. I think we'd need to look for all FlowNodes on a WorkflowRun for a given WorkflowJob to see if they've got a WorkspaceAction and then act on those WorkspaceActions...which is demented. Oy.
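A rough system-Groovy sketch of that FlowNode walk, assuming the workflow-api graph-analysis classes are available (the job name is hypothetical, and this only lists the recorded workspaces rather than deleting them):

import jenkins.model.Jenkins
import org.jenkinsci.plugins.workflow.actions.WorkspaceAction
import org.jenkinsci.plugins.workflow.graphanalysis.DepthFirstScanner
import org.jenkinsci.plugins.workflow.job.WorkflowJob

def job = Jenkins.instance.getItemByFullName('my-pipeline', WorkflowJob) // hypothetical
job?.builds?.each { run ->
    def execution = run.execution
    if (execution == null) {
        return // build never started or flow graph is unavailable
    }
    new DepthFirstScanner().allNodes(execution).each { flowNode ->
        def ws = flowNode.getAction(WorkspaceAction)
        if (ws != null) {
            println "${run} used workspace ${ws.path} on node '${ws.node}'"
        }
    }
}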
FWIW, I wrote this today:
Closure cleanMultiBranchWorkspaces
cleanMultiBranchWorkspaces = { item ->
    if (item instanceof com.cloudbees.hudson.plugins.folder.Folder) {
        if (item.name == 'archive') {
            println "Skipping $item"
        } else {
            println "Found folder $item, checking its items"
            item.items.each { cleanMultiBranchWorkspaces(it) }
        }
    } else if (item instanceof org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject) {
        println "Found a multi-branch workflow $item"
        workspaces = jenkins.model.Jenkins.instance.nodes.collect {
            it.getWorkspaceFor(item).listDirectories()
        }.flatten().findAll { it != null }
        def activeBranches = item.items.name
        println "Active branches = $activeBranches"
        if (workspaces) {
            workspaces.removeAll { workspace ->
                activeBranches.any { workspace.name.startsWith(it) }
            }
            workspaces.each {
                println "Removing workspace $it.name on ${it.toComputer().name} without active branch"
            }
        }
    }
}
jenkins.model.Jenkins.instance.items.each { cleanMultiBranchWorkspaces(it) }
Need to switch the startsWith to a regex for more exact matching.
Created a job with the Groovy plugin executing this as a system script.
I will try to solve this for branch projects as part of JENKINS-34564, since these are especially likely to be created and discarded rapidly.
I think a general implementation need not really be that difficult. Each node (master, agent) should just pay attention to when a workspace is used. (If in core, via WorkspaceList; otherwise, perhaps via WorkspaceListener.) Then record a workspaces.xml, a sibling of workspace/, with a list of records: relative workspace path, Item.fullName, timestamp. Periodically, or when an agent comes online, etc., iterate the list and check for jobs which no longer exist under that name (covers JENKINS-22240), or workspaces which have not been used in a long time. If in a plugin (JENKINS-26471) you could get fancy and modify behavior according to free disk space, etc.
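A hypothetical shape for that workspaces.xml ledger; the element names are invented for illustration, since the comment above only specifies the three fields (relative workspace path, Item.fullName, timestamp):

<workspaces>
  <record>
    <path>workspace/my-folder/my-job</path>
    <item>my-folder/my-job</item>
    <lastUsed>2016-06-01T12:34:56Z</lastUsed>
  </record>
</workspaces>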
So was it addressed in JENKINS-34564 ? I've had a look at the commit mentioned in Jira, but I couldn't easily see any code pertaining to the deletion of the workspaces.
> I couldn't easily see any code pertaining to the deletion of the workspaces.
jglick does that imply this can be closed as a newer branch-api-plugin has a fix for this?
I'm not sure if my issue is related.
We use multibranch to build a PHP site. Part of the script creates a database based on the branch name. It would be nice to have an onDelete hook where we can include code to clean up the site when we remove the branch. Something in the Groovy script would be nice.
onDelete {
// clean up DB and composer files.
}
I can make a new ticket if this is not the right thread for this.
jglick
Jesse, I'm new to Jenkins and this forum. I apologize for any newbie errors in advance.
Am I seeing the problem that you resolved?
I have directories for Multi-configuration projects sticking around after the Discard Old Builds / Max # of builds to keep (15) has been exceeded. Jobs at the bottom of the Build History (after the 15th job) disappear as new jobs complete. The directories that are kept that I'd expect to be deleted are named as follows.
/export/build/<slave node>/sub-build/<Jenkins project>/<CONFIGURATION>/build/999033
/export/build/<slave node>/sub-build/<Jenkins project>/<CONFIGURATION>/build/999035
where CONFIGURATION=${label}${target}${platform}_${type}
The 33 and 35 in the 999033 and 999035 at the end of the path match the Build History build numbers.
Do the above directories correspond to workspaces?
My Discard Old Build settings are:
Strategy: Log Rotation (note: this is the only option given)
Days to keep builds: 7
Max # of builds to keep: 15
I turned on some logging. Are the negative ones below a problem? Is there a way for me to determine if ANYTHING is being deleted?
Feb 02, 2017 4:42:37 PM FINE hudson.tasks.LogRotator
Running the log rotation for hudson.matrix.MatrixConfiguration@6ec64ed8[<Jenkins project>/label=e,platform=a,target=s,type=d] with numToKeep=-1 daysToKeep=-1 artifactNumToKeep=-1 artifactDaysToKeep=-1
Feb 02, 2017 4:44:30 PM FINE hudson.tasks.LogRotator
Running the log rotation for hudson.matrix.MatrixConfiguration@7b748f18[<Jenkins project>/label=e,platform=a,target=c,type=d] with numToKeep=-1 daysToKeep=-1 artifactNumToKeep=-1 artifactDaysToKeep=-1
I am running 1.609. I don't see your fix for this discarder issue listed in the change log: https://jenkins.io/changelog/
Thank you for your help.
> does that imply this can be closed as a newer branch-api-plugin has a fix for this?
No, it has a fix for the limited case that the Job is in fact a branch project, and the agent is online at the time.
michaelpporter see JENKINS-40606.
sepstein no those are build directories, not workspaces, so unrelated to this ticket.
Any update on this one? Is a fix in progress? The fact that PR branch job workspaces are not deleted when the PR job is deleted is a major drawback. Disk space on my slaves is being eaten up rapidly.
boon FYI, I ended up rolling my own. I have a webhook on branch delete which calls a PHP script and passes the repo and branch.
It would be nice to have a natural solution.
I was discussing with others how we could resolve this issue with a maintenance job. We are making use of multi-branch pipeline jobs and this causes our servers to run out of space on a regular basis. This is what we came up with:
- assuming that we can ask Jenkins to generate the name of the workspace folder for each active job, we can create a list of "workspaces that may still be in use"
- a job on each worker could enumerate the workspace folders, and any not on the list generated by the first step would be deleted
Jenkins appears to either use a deterministic algorithm for generating the workspace name or store the workspace name in the Job's data. The question is: is that workspace folder name accessible through APIs such that we can write this cleanup job? I haven't started looking at the Jenkins APIs to determine if the workspace folder name is exposed, but we're going to go down this road to see what we can do while we wait for the real solution (it's been 9 years already and we don't expect this to be fixed soon).
This did get me thinking: if this were a feature in Jenkins, then the worker process could easily do this periodically based on some system-wide configuration. No need for the main Jenkins instance to keep a running list of "workspaces to be deleted" or to worry about workers being offline at the time of the job being deleted. Just let the worker do its own cleanup IF the system is configured for it.
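A script-console sketch of that enumeration idea (print-only; as the next comment points out, it would misfire on custom workspaces, and @tmp/@2-style sibling directories would need extra handling):

import hudson.model.Slave
import hudson.model.TopLevelItem
import jenkins.model.Jenkins

Jenkins.instance.nodes.each { node ->
    def computer = node.toComputer()
    if (!(node instanceof Slave) || computer == null || computer.offline) {
        return // skip nodes we cannot inspect right now
    }
    // Workspace paths Jenkins would hand out today for every job it still knows about.
    def expected = Jenkins.instance.getAllItems(TopLevelItem).collect {
        node.getWorkspaceFor(it)?.remote
    }.findAll { it != null } as Set
    node.workspaceRoot?.listDirectories()?.each { dir ->
        if (!(dir.remote in expected)) {
            println "Orphan candidate on ${node.nodeName}: ${dir.remote}"
            // dir.deleteRecursive() // only after verifying the listing is trustworthy
        }
    }
}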
While Jenkins does have a deterministic algorithm for generating the workspace folder, it doesn't always get to use that algorithm: some jobs specify a custom workspace. Also, jobs can get renamed and this would mean that you'd get the wrong answer from the algorithm.
However I agree that making the slaves responsible for driving this clean-up process is the right way to do this: it's only ever a problem for slaves that still exist, and it'll scale a lot better if it's driven by the slaves instead of by the master.
What I would suggest is this:
- The Jenkins server be enhanced so that a slave can ask the master which workspace folders the master expects it to possess.
- This would basically be a summary of all the "workspace" links that point at the requesting slave.
- If computing this is particularly computationally intensive at scale then Jenkins could cache a map of slave to List<workspacePath> in memory.
- The Jenkins slave be enhanced so that it keeps a list of folders that it has used as the workspace for builds that it has run, and also any parent folders that it needed to create in order to do this.
- This list would be added to whenever a job starts using a workspace folder.
- It would have to be persisted to disk.
- Folders would be removed from the list once they'd been deleted.
- Access to this list would have to be thread-safe, and "watchable" by the clean-up process.
- The Jenkins slave be enhanced so that, periodically, it compares this list of known workspace folders with the list of folders that the Jenkins server knows about, and then goes about deleting any folders that are no longer on the list of known workspaces.
- It would have to only remove a folder from the list once it'd been successfully deleted. It's quite possible for deletion to fail (especially on Windows) if processes have things "locked open", so we need to keep things on the list of things to be deleted until they've been deleted.
- It would have to only delete folders that it had created itself.
- It would have to avoid recursively deleting content that belonged to another build, e.g. if one old build had a workspace /foo/... which is no longer known to the server but a new (current) build is known to use /foo/bar/... then the /foo/bar/ folder (and contents) should not be removed when removing /foo/.
- It's possible that something else might delete a folder (e.g. a user, a build process, or another cleanup plugin) and, if the folder doesn't exist anymore then it should immediately get removed from the list of stuff to be deleted.
I'm less certain of exactly when would be a "good time" to trigger the clean-up process. Ideally, we would do the deletion in the background on the slave when the slave isn't otherwise busy, but that'd mean a busy slave would be most likely to run out of space. Maybe we should do it periodically in the background "regardless" but in the foreground (blocking the running of a new job) when space gets tight?
Also, if we were to implement the deletion as a background job, the implementation would need to be able to halt the deletion the moment a build job started that used the same workspace as we were currently deleting (the method that adds a workspace to the list of workspaces would have to block until it was sure no further deletion was in progress).
I also think there's a fair amount of crossover between the resource cleanup functionality in the Jenkins master and what we require here, e.g. once this has been implemented, the resource cleanup functionality in the Jenkins master could be amended to simply delegate the removal of resources to the slave instead of having to take responsibility for it itself.
Overall, this doesn't sound like it's going to be trivial to implement in a non-disruptive way (garbage-collection is rarely simple), but at least it's a fairly well-defined problem.
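A minimal sketch of the agent-side ledger described in the proposal above (the class, file name, and persistence format are all invented for illustration):

import java.util.concurrent.ConcurrentHashMap

// Hypothetical record of every folder this agent has used as a workspace.
class WorkspaceLedger {
    private final File store // e.g. <agent-root>/used-workspaces.txt (invented name)
    private final Set<String> paths = ConcurrentHashMap.newKeySet()

    WorkspaceLedger(File store) {
        this.store = store
        if (store.exists()) {
            store.eachLine { paths << it }
        }
    }

    // Called whenever a build claims a workspace folder.
    synchronized void markUsed(String path) {
        if (paths.add(path)) persist()
    }

    // Called only after a deletion actually succeeded; Windows may refuse for a while,
    // so entries stay on the list until the delete is confirmed.
    synchronized void markDeleted(String path) {
        if (paths.remove(path)) persist()
    }

    Set<String> snapshot() { new HashSet<String>(paths) }

    private void persist() { store.text = paths.join('\n') }
}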
I would like the ability to call a cleanup script. We use Jenkins Multibranch Pipelines to build PHP websites. For me, cleanup is removing a database and removing files from a custom workspace. At this time we call a PHP file on branch delete, which then processes a bash script that is saved on project build.
michaelpporter I'd suggest that you take a look at the Post build task plugin and the PostBuildScript plugin. Either of those might satisfy your requirements. We have much the same requirements where I work and we use these (and we make our build setup code tolerant to a messy starting environment).
That said, I don't think that the ability to call a cleanup script is really a core requirement of this issue - this issue is all about managing the workspace folders created by jobs, not about dealing with other job-specific resources used by jobs.
Thank you for the input. The plugins you mentioned will not help, as we do not want to call the cleanup post-build; that is easy in Pipeline. We want a cleanup hook in Jenkins that fires when it detects a branch is removed. As you mentioned, there might be custom workspaces that Jenkins is not aware of. I imagine the solution for cleaning those assets could also clean up other items.
Have you considered building up and breaking down the database and other "external" dependencies as part of your build? We do this using Docker in our builds and the overhead of creating, initializing, and the subsequent teardown is negligible.
I will say that, at least for a multibranch pipeline, having the ability to kick off a job on child job deletion/creation would be an interesting feature, but I agree anything like that is outside the scope of the issue here.
As a workaround for this I had to write a cron job that scans the git branches, matches them up with the workspace folders, then removes them. An option that would just do this whenever the job is removed would be fantastic.
If anyone is interested in trying my fix for this (and other) issues, you can install this experimental build.
nice to see this happening!
Also, what is up with the date on this ticket: created in 2008? It foretold Pipeline-as-Code?
I tried your experimental build and it worked for the slave agents. When the (apparently scheduled) cleanup occurred, it removed <name> and <name>@tmp folders from all the slave agents. However, I still have a <name>@script folder on my Jenkins master. Is that the responsibility of another component to delete?
Edit:
Nevermind. I recently updated the subversion plugin which enabled lightweight checkouts and no longer creates <name>@script folders, so it is a non-issue.
> I still have a <name>@script folder on my Jenkins master. Is that the responsibility of another component to delete?
Yes, this is created by Pipeline builds under certain SCM configurations (“heavyweight checkouts”), and is not covered by the workspace infrastructure. Probably those should be cleaned up, too, but it would be a separate patch.
jglick Changes to Branch API have introduced strange behavior; please see the images below:
Jenkins job:
Workspace folder when building on the Jenkins master:
Looks better:
Looks the same as before (but why does one branch work while another falls back to the old behavior?)
When building on a slave: looks good, but the path is too long, and the variable -Djenkins.branch.WorkspaceLocatorImpl.PATH_MAX=2 doesn't help now. I was able to shorten the path before; it looks like the new plugin doesn't honor the variable.
Jenkins Version - 2.138.2
Branch API 2.0.21-rc632.1afb188ed43f
shankar4ever for compatibility, existing workspaces created under the old scheme continue to be used (but their names are now tracked). The new naming policy applies only when defining a workspace for a given job on a given node for the first time.
jglick I cleared up everything from my workspace (master/slave), and I still see the same issue with the folder name on the slave workspace (it still adds %2F; I created a new branch to test, so there was no history for this branch).
Also, the Jenkins master workspace has a file called `workspaces.txt`, but the slave doesn't have it.
Build on master looks good
And you are using a multibranch project in each case? For now, the feature is limited to branch projects by default; you can run with -Djenkins.branch.WorkspaceLocatorImpl.MODE=ENABLED to apply it to all projects.
Also check your system log for any warnings.
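For reference, that system property has to be set on the master's JVM at startup; a typical invocation (the exact mechanism varies per installation) might look like:

java -Djenkins.branch.WorkspaceLocatorImpl.MODE=ENABLED -jar jenkins.war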
rsandell just released. Arguably it should have been 2.1, but I have never been able to make sense of the version numbering schemes used in plugins historically maintained by stephenconnolly (since we certainly are not enforcing semver).
All the projects are using Declarative Pipeline with multibranch. I am running Jenkins with -Djenkins.branch.WorkspaceLocatorImpl.PATH_MAX=2. I will add -Djenkins.branch.WorkspaceLocatorImpl.MODE=ENABLED and see if that helps.
PATH_MAX is ignored by the new implementation except for purposes of identifying folders apparently created by the old implementation. In other words, it has no effect on a node without existing workspaces.
That didn't help; I still see %2F in folder names on slaves.
I don't see any related info in the system log.
shankar4ever no idea. Please install this update and create a logger (Manage Jenkins » System Log) on jenkins.branch.WorkspaceLocatorImpl recording at least FINER and check for messages there when you build a branch project for the first time on a given agent. Also check whether c:\Jenkins\workspace\workspaces.txt exists and, if so, what it contains.
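If the UI route is inconvenient, a minimal script-console sketch that raises the same logger to FINER (plain java.util.logging; the Manage Jenkins » System Log recorder remains the supported way to capture the output):

import java.util.logging.ConsoleHandler
import java.util.logging.Level
import java.util.logging.Logger

def logger = Logger.getLogger('jenkins.branch.WorkspaceLocatorImpl')
logger.level = Level.FINER
// A handler is needed for FINER records to actually be written anywhere.
def handler = new ConsoleHandler()
handler.level = Level.FINER
logger.addHandler(handler)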
jglick I tried to follow your instructions. Earlier there wasn't any workspaces.txt, but I see it now; however, it's empty and the folders still have %2F.
Also, no logs are being generated by the jenkins.branch.WorkspaceLocatorImpl logger (note: I have selected `All` to record).
jglick this commit (https://github.com/jenkinsci/branch-api-plugin/commit/481e4857c9a450c2904ff5188aa9076cf1db1f1c) resulted in a fairly consistent but slightly random and mind-boggling build failure.
We just upgraded to Jenkins ver. 2.151 today (and upgraded all of our plugins at the same time).
We reviewed our plugin update list and then sifted through commits. Eventually we settled on this one and fixed it by downgrading Branch API from 2.0.21 to 2.0.20, which effectively backed this out.
We have a declarative Jenkinsfile pipeline which uses parallel in a stage... It looks approximately like this. Note that we define an agent for the outer job and for Scala Full, but not for Frontend Prod. Frontend Prod builds happily (which means it can find the npm.sh file); Scala Full fails with an error saying that it can't find the sbt.sh file (these files all exist in a single git repository). If we pin all stages to a single node, it works. We're using the branch plugin to monitor pull requests from GitHub and it's automagically building our commits based on that...
pipeline {
    agent { label "pipeline" }
    environment {
        PIPELINE_SCRIPTS = "$WORKSPACE/pipeline/scripts"
        SBT_SCRIPT = "$PIPELINE_SCRIPTS/sbt.sh"
        NPM_SCRIPT = "$PIPELINE_SCRIPTS/npm.sh"
    }
    stages {
        stage("Checkout") {
            steps {
                script {
                    result = sh(script: "git log -1 --pretty=format:'%an' | grep 'Jenkins'", returnStatus: true)
                    if (result != 0) {
                        echo("Non jenkins user commit")
                    }
                }
            }
        }
        stage("Compile") {
            parallel {
                stage("Scala Full") {
                    agent { label "pipeline" }
                    steps {
                        sh "$SBT_SCRIPT compile"
                    }
                }
                stage("Frontend Prod") {
                    steps {
                        sh "$NPM_SCRIPT install"
                    }
                }
            }
        }
    }
}
shankar4ever sorry, I have no idea what is happening on your machine. You may want to install the Support Core plugin, which captures much richer diagnostics, and either attach a generated support bundle here or send it to me privately. (If attaching publicly, I recommend using the anonymization option, unless this is just a test server with no confidential projects.)
jsoref I cannot even guess what might be causing your build failure. If you manage to narrow it down to a minimal, reproducible test case, please file a bug report and Link to it from here.
jsoref: I would expect that if you specify a nested 'pipeline' agent then the inner stage gets a new workspace, which is why it does not have access to the git repo that had been cloned into the workspace of the outer agent.
You could get access to it if you `stash`ed the git repo after checkout and then `unstash`ed it in the nested pipeline stage.
However, the pipeline does not look clean to me. What are you trying to achieve with such a nested pipeline anyway?
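For concreteness, a hedged illustration of that stash/unstash workaround spliced into the pipeline quoted earlier (the stash name is illustrative):

stage("Checkout") {
    steps {
        checkout scm
        stash name: 'sources', includes: '**/*' // capture the clone for other agents
    }
}
stage("Scala Full") {
    agent { label "pipeline" }
    steps {
        unstash 'sources' // restore into this stage's fresh workspace
        sh "$SBT_SCRIPT compile"
    }
}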
The nested agents do get their own git clones. The ones without a specified agent run on the same node as the main pipeline. I only need to use stash to share build products (omitted because it isn't relevant to the failure).
I see this message in our console log:
Feb 18, 2019 8:29:06 AM WARNING jenkins.branch.WorkspaceLocatorImpl getWorkspaceRoot
JENKINS-2111 path sanitization ineffective when using legacy Workspace Root Directory ‘${ITEM_ROOTDIR}/workspace’; switch to ‘${JENKINS_HOME}/workspace/${ITEM_FULL_NAME}’ as in JENKINS-8446 / JENKINS-21942
It seems harmless, but how can I fix it?
davida2009 it is not harmless. See JENKINS-21942 esp. my last comment of 2018-04-24. Or just stop doing builds on the master.
jglick Thanks for your reply. I'm afraid I'm stuck on this one. I don't understand what the console message means nor where to look in order to fix it. Please will you explain what I need to do?
Please ask batmat in JENKINS-21942, who was the last to work on that area.
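For others hitting the same warning: the value in question is the global Workspace Root Directory, stored in $JENKINS_HOME/config.xml (a restart is required, and JENKINS-21942 is worth reading before changing a production instance):

<!-- legacy value that triggers the warning -->
<workspaceDir>${ITEM_ROOTDIR}/workspace</workspaceDir>

<!-- value recommended by the warning message -->
<workspaceDir>${JENKINS_HOME}/workspace/${ITEM_FULL_NAME}</workspaceDir>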
I am using Jenkins 2.204.3 and Pipeline Plugin 2.6 and this issue is clearly reproducible for me.
I delete the job with a "Delete Pipeline" button and I see all these workspaces present on every node:
- <workspace> itself
- <workspace>@1,2,3...
- <workspace>@tmp
Should I reopen this ticket or wait for the next version of the core or plugin?
alexander_samoylov neither. You should file a fresh issue with complete, self-contained, explicit steps to reproduce from scratch, and Link it to this one. (By the way, "Pipeline Plugin 2.6" is meaningless. This is just an aggregator with no code.)
I just came across the warning referring to this ticket, on a Jenkins deployment with no executors on the Jenkins node (where it should not appear).
> JENKINS-2111 path sanitization ineffective when using legacy Workspace Root Directory
There are actually two possible sources of this warning in the Branch API Plugin itself, checked on v2.6.2.
If the message is immediately followed (assuming you have FINE logging visible, otherwise you won't see it) by
> no available workspace root for hudson.model.Hudson@...
then it's come from a call to locate, which will happen when a job is deleted or renamed, as the Deleter runs a task on each node (jenkins.getNodes()), which will trigger a call to locate for the Jenkins node, which hits this warning, returns null, and is ignored.
If no such message followed the warning (and you have FINE logging visible), then it might have come from Collector reacting to the Jenkins node coming online and calling getWorkspaceRoot directly, which will again return null and be ignored.
This warning means it won't clean up orphaned/renamed workspaces on the Jenkins node, but if you have no executor there, then you shouldn't have any to clean up anyway.
Edit: I just noticed that JENKINS-60451 was logged for exactly this.
I think it's possible to reference environment variables in the 'Custom Workspace' option, and if those are only defined during a build or occasionally change values, this will result in the deletion of legitimate workspaces.
Jenkins is a complex system, so any magic here will break things horribly for quite a few users – see the frequent complaints about the existing WorkspaceCleanupThread.
What would likely be feasible: If there are workspaces, ask whether they should be deleted as well when the project gets deleted. If a slave is offline, it's your responsibility to clean up.