Bug
Resolution: Fixed
Major
None
Apache Software Foundation, Hudson 1.376, Ubuntu server, Sun 1.6 JDK, 4 GB RAM
In the ASF Hudson installation, rebuilding the dependency graph is quite slow (it takes minutes). Because the rebuild seems to happen twice when a project configuration is saved (AbstractProject.doConfigSubmit()), working with the project configuration is very slow and frequently times out.
We have a large number of jobs (>300), and from a quick look at the code it seems that the dependency graph is rebuilt even if no dependencies have changed. The graph also seems to be built by iterating over all jobs, each of which in turn iterates over all other jobs. This doesn't seem to scale to our size.
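To put a rough number on the nested iteration described above, here is a minimal sketch. It assumes every job checks every other job exactly once per rebuild, which is only my reading of the code. The class and method names are hypothetical and are not taken from Hudson.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the suspected behaviour, not Hudson code:
// every job scans every other job during a rebuild.
final class PairwiseRebuildCost {

    /** Counts how many ordered (job, otherJob) pairs a nested scan visits. */
    static long pairVisits(List<String> jobNames) {
        long visits = 0;
        for (String job : jobNames) {
            for (String other : jobNames) {
                if (!job.equals(other)) {
                    visits++; // one dependency check per ordered pair
                }
            }
        }
        return visits;
    }

    /** Builds n distinct placeholder job names ("job-0", "job-1", ...). */
    static List<String> makeJobs(int n) {
        List<String> jobs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            jobs.add("job-" + i);
        }
        return jobs;
    }
}
```

With 300 jobs, `PairwiseRebuildCost.pairVisits(PairwiseRebuildCost.makeJobs(300))` returns 89700 (300 × 299), so two rebuilds per save would mean about 179,400 pairwise checks under this assumption, before any per-module Maven work is counted.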
Here's a typical thread dump from saving a project configuration:
"Handling POST /hudson/job/ftpserver-trunk-jdk1.6-osx/configSubmit : http-8090-27" daemon prio=10 tid=0x000000004340a000 nid=0x4576 runnable [0x00007f2fa8e48000]
java.lang.Thread.State: RUNNABLE
at java.lang.String.intern(Native Method)
at hudson.maven.ModuleDependency.<init>(ModuleDependency.java:48)
at hudson.maven.ModuleDependency.<init>(ModuleDependency.java:54)
at hudson.maven.MavenModule.asDependency(MavenModule.java:296)
at hudson.maven.MavenModule.buildDependencyGraph(MavenModule.java:385)
at hudson.maven.MavenModuleSet.buildDependencyGraph(MavenModuleSet.java:487)
at hudson.model.DependencyGraph.<init>(DependencyGraph.java:100)
at hudson.model.Hudson.rebuildDependencyGraph(Hudson.java:3346)
at hudson.model.AbstractProject.doConfigSubmit(AbstractProject.java:588)
at sun.reflect.GeneratedMethodAccessor1225.invoke(Unknown Source)
...
is blocking: JENKINS-9301 Improve dependency graph calculation (for maven jobs) (Resolved)
[JENKINS-7535] Rebuilding dependency graph slow on large installations
Attachment | New: JENKINS-7535-maven-plugin_evernat.patch [ 20285 ] |
It would be lovely if the dependency computation were to use a more clever algorithm so that it is very fast, but there are some quick wins available:
a) don't do it if nothing changed
b) don't do it if another thread is already doing a global recomputation. At the least, have a single thread that does this dependency computation, and have save mark the graph as needing an update. That way nobody is delayed, and the worst that happens is that one thread does the updates back-to-back forever.
c) only change as much as is needed by the recently changed dependencies. This is an extension of (a) to make the entire computation incremental. Basically, you can keep an update time on each node and on the graph in general. In doing the dependency computation, you can skip part of the computation as soon as you note that the node under consideration hasn't changed more recently than the last graph update.
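Quick wins (a) and (b) could look roughly like the sketch below. This is only an illustration: `CoalescingGraphRebuilder`, `onSave`, and the `dependenciesChanged` flag are hypothetical names, and how Hudson would actually detect a dependency change is left open. The expensive recomputation is passed in as a plain `Runnable`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names exist in Hudson.
final class CoalescingGraphRebuilder {
    private final Runnable rebuild; // the expensive full recomputation
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final AtomicInteger rebuildCount = new AtomicInteger();
    // (b) a single thread owns all recomputation, so rebuilds never overlap
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    CoalescingGraphRebuilder(Runnable rebuild) {
        this.rebuild = rebuild;
    }

    /** Called when a project configuration is saved; returns immediately. */
    void onSave(boolean dependenciesChanged) {
        if (!dependenciesChanged) {
            return; // (a) nothing changed, so skip the rebuild entirely
        }
        dirty.set(true);                      // mark the graph as needing an update
        worker.execute(this::rebuildIfDirty); // cheap task; the caller is not delayed
    }

    // Runs on the worker thread. Saves that arrive while a rebuild is queued
    // or running are folded into a later rebuild instead of each causing one.
    private void rebuildIfDirty() {
        if (dirty.compareAndSet(true, false)) {
            rebuild.run();
            rebuildCount.incrementAndGet();
        }
    }

    /** Number of full rebuilds performed so far. */
    int rebuildCount() {
        return rebuildCount.get();
    }

    /** Stops accepting saves and waits for queued work; false on timeout or interrupt. */
    boolean shutdownAndWait(long timeout, TimeUnit unit) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The flag is set before the task is queued, so every change is followed by a rebuild that starts after it, while a burst of saves results in at most one rebuild per save and often far fewer. Making the computation incremental as in (c) would need per-node update times on top of this.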