Improvement
Resolution: Unresolved
Major
None
Solaris 10, remote NFS filesystem, Jenkins 1.417
The post-build action that cleans up logs (log rotation) holds on to the build until the deletion has completed. On systems where a remote, "safe" but slow filesystem must be used, as in many corporate environments, this can take a very long time. In my situation it has run for over 30 minutes and counting, blocking the next CI build.
After a brief discussion, it seems this activity could happen asynchronously, allowing the build job to complete and kick off the next build.
Here's the thread dump in question:
"Executor #1 for SLAVEWIN-D : executing JOB_CI #2824" Id=78 Group=main RUNNABLE (in native)
at java.io.UnixFileSystem.delete0(Native Method)
at java.io.UnixFileSystem.delete(UnixFileSystem.java:251)
at java.io.File.delete(File.java:904)
at hudson.Util.deleteFile(Util.java:233)
at hudson.Util.deleteRecursive(Util.java:305)
at hudson.Util.deleteContentsRecursive(Util.java:224)
at hudson.Util.deleteRecursive(Util.java:304)
at hudson.model.Run.delete(Run.java:1197)
- locked hudson.maven.MavenModuleSetBuild@e628d8
at hudson.model.AbstractBuild.delete(AbstractBuild.java:344)
- locked hudson.maven.MavenModuleSetBuild@e628d8
at hudson.maven.MavenModuleSetBuild.delete(MavenModuleSetBuild.java:389)
- locked hudson.maven.MavenModuleSetBuild@e628d8
at hudson.tasks.LogRotator.perform(LogRotator.java:131)
at hudson.model.Job.logRotate(Job.java:319)
at hudson.maven.MavenModuleSet.logRotate(MavenModuleSet.java:573)
at hudson.model.Run.run(Run.java:1440)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:465)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:146)
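The trace shows the recursive delete running on the executor thread itself, inside Run.run. As a rough illustration of the requested behaviour, here is a minimal sketch that hands the delete to a background thread so the caller returns immediately. This is not Jenkins code: the class BackgroundBuildCleaner and its methods are hypothetical names invented for this example, and a real change would also have to deal with the lock on the build object seen in the trace and with hiding the half-deleted build from the UI.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Jenkins): deletes old build directories off the executor thread. */
final class BackgroundBuildCleaner {

    // One daemon thread: deletions are serialized and never keep the JVM alive.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "background-build-cleaner");
        t.setDaemon(true);
        return t;
    });

    /** Queues the directory for deletion and returns at once; the Future completes when the delete ends. */
    Future<?> deleteLater(Path buildDir) {
        return worker.submit(() -> deleteRecursive(buildDir));
    }

    /** Deletes a directory tree, children before parents. Does nothing if the path is absent. */
    static void deleteRecursive(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order visits the deepest entries first, so directories are empty when deleted.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stops accepting new work; deletions already queued still run. */
    void shutdown() {
        worker.shutdown();
    }
}
```

With something along these lines, the build thread would call deleteLater(...) for each build that log rotation wants to discard and then finish, instead of waiting on the slow filesystem. The returned Future is only needed if the caller wants to log failures.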
And a snippet of the discussion so far [29/June/2011] (http://echelog.matzon.dk/logs/browse/jenkins/1309298400):
[18:04:45] <banoss> http://pastebin.com/Kk7i76xa this happens after every build and "hangs onto" the job. The problem i have is a very slow remote filesystem
[18:43:20] <kohsuke> banoss: Is it slower than "rm -rf"?
[18:57:41] <banoss> kohsuke: No its not. But it blocks the build for a very long time. Todays was over 30mins and counting.
[18:58:01] <kohsuke> any reason why you use remote file system for a build?
[18:58:11] <banoss> because im in a big corp
[18:58:29] <banoss> I've raised a service incident record and am asking for a local file system...
[18:58:50] <kohsuke> oh, you don't even have a local file system? not even tmps?
[18:58:51] <kohsuke> tmpfs
[18:59:08] <banoss> not big enough, everything is virtualised
[19:02:42] <kohsuke> log files are compressed at the end. Is it still too big?
[19:03:18] <banoss> yup. Without workspaces JENKINS_HOME was using about 100GB
[19:03:33] <banoss> Ive trimmed it massively now for performance
[19:04:49] <banoss> Jenkins is victim of his own success here. We use it for loads of historical and trend information
[19:05:00] <banoss> version tracking etc
[19:05:07] <banoss> so we keep loads of logs
[19:19:48] <kohsuke> banoss: I suppose there's no need for log rotation to be synchronous
[19:20:00] <kohsuke> If we push it to background, that should help, no?
[19:20:28] <banoss> massively
[19:21:09] <kohsuke> I think you should file a ticket for this.
[19:21:12] <banoss> it means i can keep JENKINS_HOME on whats considered a "safe" file system - which is a requirement for the build service around here
[19:21:20] <kohsuke> Would you be interested in working on a fix yourself? I can give you some pointers
[19:21:33] <banoss> I'll do so first thing in the morning
[19:21:36] <kohsuke> Jenkins Office Hours is just coming up in 30 mins. This could be our first topic.