Status: Closed
Resolution: Not A Defect
Environment: master on Win10x64, slaves on Win10x64
When using the deleteDir() command in a pipeline, the command very often fails with the attached exception. It passes only about 1 time in 4, which is pretty frustrating.
The strange thing is that the same pipeline runs on 2 different nodes with the same hardware, software and Java VM setup. One of the nodes almost never has the problem, while the other passes only about 1 time in 4, as described.
Some searching led me to https://stackoverflow.com/questions/48311252/a-bit-strange-behaviour-of-files-delete-and-files-deleteifexists which might match the scenario in deleteContentsRecursive(...) in FilePath.java.
- relates to
JENKINS-52402 Job fails randomly when attempt to lock resources
JENKINS-52416 Random AccessDeniedExceptions in jenkins.err.log
JENKINS-52404 Job fails randomly when using copyArtifact
- Fixed but Unreleased
Thanks for your recommendation! Inspecting file access with Process Monitor has also shown the AV scanner to be the root cause of the issue.
We have now excluded the Jenkins directories from the AV scanner, and the exception occurs a lot less often than before. Still, it occurs in about 1 of 20 runs (maybe because of the Windows search indexer or something similar).
Since Jenkins had trouble with file-system access in such scenarios, we ran some tests with other CI/CD systems. They do not have problems in that area (though they have other issues and limitations), so there might be a way to handle files even though an AV scanner is active.
Jenkins already has configurable failover logic: https://github.com/jenkinsci/jenkins/blob/d71ac6ffe98ee62e0353af7a948a4ae1a69b67e9/core/src/main/java/hudson/Util.java#L1702-L1729 . In your case you can tune the number of retries and the interval between them to find a behavior that works for you.
It is an architectural limitation of the filesystem you use, and unfortunately we cannot do much except retry. It may be possible to add an extra feature flag to give up and proceed with warnings if the logic fails to delete the directory, but somebody would need to propose a PR for that.
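For readers unfamiliar with this kind of failover logic, the general idea of retrying a recursive delete can be sketched as follows. This is a minimal illustration only, not the code in hudson.Util: the class name RetryingDelete, the method names, and the parameters maxAttempts and waitMillis are all hypothetical, and the retry count and wait interval are passed as plain arguments rather than read from any Jenkins configuration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Illustrative sketch only; hypothetical names, not the Jenkins implementation. */
final class RetryingDelete {

    /** Deletes a directory tree depth first; throws on the first entry that cannot be removed. */
    static void deleteTreeOnce(Path root) throws IOException {
        if (Files.notExists(root)) {
            return; // nothing left to delete
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                // May fail (e.g. AccessDeniedException) while another process holds the file open.
                Files.deleteIfExists(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc; // listing the directory failed
                }
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /**
     * Tries the whole delete up to maxAttempts times, sleeping waitMillis between
     * attempts, and rethrows the last failure if every attempt fails.
     */
    static void deleteTreeWithRetries(Path root, int maxAttempts, long waitMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                deleteTreeOnce(root);
                return; // success
            } catch (IOException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    // Give a scanner or indexer time to release its handle.
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw lastFailure;
    }
}
```

A call such as RetryingDelete.deleteTreeWithRetries(workspaceDir, 5, 200) would make up to five attempts with a 200 ms pause between them. Retrying only helps when the lock is short-lived, as with an AV scanner; it does not help if a runaway build process keeps the file open.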
Thanks for the hint about the configurable retry/timeout limits, I did not know about that. In that case this issue can be closed.
I have reported some other issues which all seem to be related (JENKINS-52416, JENKINS-52404, JENKINS-52402). Disabling the AV scanner seems to resolve these issues as well, as far as I can tell so far.
Thanks for the support!!
You are welcome. Please feel free to create follow-up feature requests for advanced failover logic if needed.
You need to check which process locks the file. It may be an antivirus scanner, a runaway process from the build, or any other process.