-
Task
-
Resolution: Fixed
-
Critical
-
None
Various people have reported over time that the Maven job type builds considerably more slowly than freestyle projects.
The feature does have some overhead, in that it definitely does more (for example, artifacts get archived while Maven runs, whereas freestyle projects do that after Maven has run), but it's also good to take a deep look into where the overhead is and see if anything appears out of place.
This issue tracks my investigation of this.
[JENKINS-22354] Maven job type performance improvement
Prior to the experiment, I ran the following command to introduce an artificial 400ms roundtrip delay (200ms in each direction on the loopback device) into the remoting communication between the master and the Maven process, to really stretch the problem:
exec sudo tc qdisc add dev lo root netem delay 200ms
I wrote a script to continuously capture thread dumps of the Maven process via jstack. Mostly what I see is classloader-related activity. I still have to verify whether the remote jar file cache is properly taking effect.
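The script itself is not attached to this issue. As an illustration only, a sampler along these lines could be sketched in Java as follows. The class name `JstackSampler`, the one-second interval, and the "frame mentions ClassLoader" heuristic are all assumptions made for this sketch, not details from the investigation; the only external tool it relies on is the JDK's `jstack <pid>`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class JstackSampler {
    /** Counts stack frames ("at ..." lines) whose class name mentions ClassLoader (heuristic). */
    static int countClassLoadingFrames(String dump) {
        int count = 0;
        for (String line : dump.split("\n")) {
            String frame = line.trim();
            if (frame.startsWith("at ") && frame.contains("ClassLoader")) {
                count++;
            }
        }
        return count;
    }

    /** Runs `jstack <pid>` once; returns its output, or null if jstack exits with an error. */
    static String dump(String pid) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("jstack", pid).redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        return p.waitFor() == 0 ? out.toString() : null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java JstackSampler <pid>");
            return;
        }
        String dump;
        // Sample once per second until jstack fails (for example, the target process has exited).
        while ((dump = dump(args[0])) != null) {
            System.out.println("classloader frames: " + countClassLoadingFrames(dump));
            Thread.sleep(1000);
        }
    }
}
```

A consistently high count across samples would suggest that the process spends most of its time loading classes rather than building.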
A noticeable amount of classloader activity took place trying to instantiate XStream2 like this:
at com.thoughtworks.xstream.XStream.buildMapper(XStream.java:474)
at com.thoughtworks.xstream.XStream.<init>(XStream.java:451)
at com.thoughtworks.xstream.XStream.<init>(XStream.java:381)
at com.thoughtworks.xstream.XStream.<init>(XStream.java:336)
at hudson.util.XStream2.<init>(XStream2.java:88)
at jenkins.model.Jenkins.<clinit>(Jenkins.java:3941)
at hudson.model.Computer.<clinit>(Computer.java:1358)
at hudson.FilePath.act(FilePath.java:914)
at hudson.FilePath.act(FilePath.java:887)
at hudson.FilePath.digest(FilePath.java:1726)
at hudson.maven.reporters.MavenFingerprinter.record(MavenFingerprinter.java:219)
Another instantiation of XStream was induced from MavenArtifact.<clinit> through Run.<clinit>. XStream instantiates a large number of converters, which causes a lot of classloading activities, which in turn requires multiple roundtrips to the master.
When I refactored the code so that it initializes neither the Jenkins nor the Run class, I was able to cut the execution time by more than 30% (3mins+ -> 2mins-).
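To illustrate why merely touching a class can be expensive, here is a small self-contained Java sketch of the static-initializer effect. It is not the actual refactoring from the commits on this issue: `HeavySerializer`, `EagerArtifact` and `LazyArtifact` are hypothetical stand-ins, and the nested holder class is just one common way to defer such initialization.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClinitDemo {
    /** Counts how many times the (hypothetical) expensive serializer was constructed. */
    static final AtomicInteger HEAVY_INITS = new AtomicInteger();

    /** Stand-in for an expensive object such as an XStream instance. */
    static final class HeavySerializer {
        HeavySerializer() {
            HEAVY_INITS.incrementAndGet();
        }
    }

    /** Eager: the first use of this class runs its static initializer and builds the serializer. */
    static final class EagerArtifact {
        static final HeavySerializer SERIALIZER = new HeavySerializer();

        static String describe(String name) {
            return "artifact:" + name;
        }
    }

    /** Lazy: the serializer lives in a nested holder that is initialized only on first access. */
    static final class LazyArtifact {
        private static final class Holder {
            static final HeavySerializer SERIALIZER = new HeavySerializer();
        }

        static HeavySerializer serializer() {
            return Holder.SERIALIZER;
        }

        static String describe(String name) {
            return "artifact:" + name;
        }
    }
}
```

Calling `EagerArtifact.describe(...)` constructs the serializer even though `describe` never uses it, whereas `LazyArtifact.describe(...)` does not. In a remoting setup, every class pulled in by such an initializer may cost additional roundtrips to the master.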
Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
src/main/java/hudson/maven/PluginImpl.java
src/main/java/hudson/maven/reporters/MavenArtifact.java
http://jenkins-ci.org/commit/maven-plugin/a0d0100183b46294d229336d2f91bfdcadc2e318
Log:
JENKINS-22354
MavenArtifact class is loaded into Maven process, so don't drag too many classes dependencies into it.
Run class refers to a large number of classes, and in particular this code forces XStream instantiation which drags in quite a few number of classes
Compare: https://github.com/jenkinsci/maven-plugin/compare/d62796891bd6...a0d0100183b4
Archiving a 16MB artifact took an immeasurably small amount of time when built on the master (29ms reported). Built on a slave, it took 13sec (scp took 7secs to copy).
I hadn't realized this, but starting with Maven plugin 2.0, artifact archiving is queued up until the end of the module, and the copy is done between master and slave, not between master and Maven, which I think helps considerably.
I found that one of the inefficiencies is around using RemoteInputStream when launching Maven on a slave.
When a channel is built to a Maven on slave, it'll look like this:
Master                     slave                        Maven
========================================================================
Channel                                                 Channel
 +- RemoteInputStream  -->  SocketInputStream   -->
                       <--  SocketOutputStream  -+
So each time the master's Channel reads something, it has to wait for a full roundtrip between master and slave. This is very bad if the slave is trying to send a large amount of data over the channel.
I used a simple Callable from the master to Maven that returns 16MB of data over this latency-induced network, and verified that it took a whopping 15mins.
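A back-of-the-envelope estimate shows why a pull-based read is so slow here. The sketch below assumes that each read fetches a fixed-size chunk and costs exactly one roundtrip; the 8KB chunk size is an assumption chosen for illustration, not a value taken from the remoting library.

```java
class PullLatencyEstimate {
    /** Estimated seconds to transfer totalBytes when every read of chunkBytes costs one roundtrip. */
    static double pullSeconds(long totalBytes, int chunkBytes, double roundtripSeconds) {
        long reads = (totalBytes + chunkBytes - 1) / chunkBytes; // ceiling division
        return reads * roundtripSeconds;
    }

    public static void main(String[] args) {
        long size = 16L * 1024 * 1024; // 16MB payload
        double seconds = pullSeconds(size, 8 * 1024, 0.4); // assumed 8KB per read, 400ms roundtrip
        System.out.printf("%.1f seconds (about %.1f minutes)%n", seconds, seconds / 60);
    }
}
```

Under these assumptions the transfer needs 2048 reads at 0.4 seconds each, roughly 819 seconds or a little under 14 minutes, which is in the same range as the observed 15mins. The cost is dominated by the number of roundtrips, not by bandwidth.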
A better way to do this is to have the slave pump SocketInputStream and feed data into master, then have the master buffer it.
Master                            slave                       Maven
===============================================================================
Channel
 +- FastPipedInputStream          pump thread
     +- FastPipedOutputStream <-- RemoteOutputStream
                                | SocketInputStream -> ...
This hides latency better. With this change, the artificial 16MB callable completes in just 11secs.
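The push approach can be sketched with plain JDK classes. This is only an illustration of the pump-thread idea: it uses `java.io.PipedInputStream` and `PipedOutputStream` in a single process instead of the remoting library's FastPipedInputStream and RemoteOutputStream, and the names `PumpSketch`, `pump` and `roundTrip` are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class PumpSketch {
    /** Starts a daemon thread that copies everything from source into sink, then closes sink. */
    static Thread pump(InputStream source, OutputStream sink) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            try (OutputStream out = sink) {
                int n;
                while ((n = source.read(buf)) != -1) {
                    out.write(buf, 0, n); // push data as soon as it is available
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "pump");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Pushes data through a buffered pipe via a pump thread and returns what the reader saw. */
    static byte[] roundTrip(byte[] data) {
        try {
            PipedInputStream readerSide = new PipedInputStream(64 * 1024); // reader-side buffer
            PipedOutputStream writerSide = new PipedOutputStream(readerSide);
            Thread pumpThread = pump(new ByteArrayInputStream(data), writerSide);

            ByteArrayOutputStream seen = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = readerSide.read(buf)) != -1) {
                seen.write(buf, 0, n); // reads are served from the local buffer
            }
            pumpThread.join();
            return seen.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The reader never asks the producer for data; it only drains a local buffer that the pump thread keeps filling. Across a network, that means the sender can keep streaming without waiting for one roundtrip per read.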
Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
src/main/java/hudson/maven/AbstractMavenProcessFactory.java
http://jenkins-ci.org/commit/maven-plugin/cea15ea5cb11dc9cdafb2caa44d18c4c350017fe
Log:
JENKINS-22354
Avoid using RemoteInputStream that's inherently unsuitable for large
"read till EOF" read workload.
Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
src/main/java/hudson/maven/SplittableBuildListener.java
http://jenkins-ci.org/commit/maven-plugin/606dd38942c2fdea9f74c8bdf03230e652abdcff
Log:
JENKINS-22354
With cea15ea5cb11dc9cdafb2caa44d18c4c350017fe, a previously hidden dead
lock has surfaced. This change solves that problem.
The dead lock is caused if the written mark is propagated back to master before the SendMark method returns.
This is because the channel from Maven process piggy backs on I/O pipe
writer of master<->slave channel as a tunnel.
In the dead lock below, thread #10 holds the 'markCountLock' and waiting
for the response from SendMark task, which is supposed to come back from
master<->maven channel.
But this response is clogged waiting to be read by IO writer thread
(thread #11 below), which in turn blocks trying to acquire the
markCountLock.
"Computer.threadPoolForRemoting 11 : IO ID=116 : seq#=115" daemon prio=10 tid=0x00007f29f0035800 nid=0x69d2 waiting for monitor entry [0x00007f29f68c5000]
java.lang.Thread.State: BLOCKED (on object monitor)
at hudson.maven.SplittableBuildListener$2.onMarkFound(SplittableBuildListener.java:118)
- waiting to lock <0x000000078b996388> (a java.lang.Object)
at jenkins.util.MarkFindingOutputStream.write(MarkFindingOutputStream.java:59)
at java.io.PrintStream.write(PrintStream.java:447)
- locked <0x000000078b995e58> (a java.io.PrintStream)
at hudson.util.DelegatingOutputStream.write(DelegatingOutputStream.java:56)
at hudson.tasks._maven.MavenConsoleAnnotator.eol(MavenConsoleAnnotator.java:75)
at hudson.console.LineTransformationOutputStream.eol(LineTransformationOutputStream.java:60)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:74)
at java.io.OutputStream.write(OutputStream.java:75)
at hudson.util.DelegatingOutputStream.write(DelegatingOutputStream.java:51)
at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:250)
at hudson.remoting.PipeWriter$1.run(PipeWriter.java:158)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
"Computer.threadPoolForRemoting 10 for Channel to Maven [java, -cp, /tmp/slave1/maven3-agent.jar:/usr/maven3/boot/plexus-classworlds-2.4.jar, org.jvnet.hudson.maven3.agent.Maven3Main, /usr/maven3, /tmp/slave1/slave.jar, /tmp/slave1/maven3-interceptor.jar, /tmp/slave1/maven3-interceptor-commons.jar, 54343] / waiting for hudson.slaves.Channels$1@5cb62a81:Channel to Maven [java, -cp, /tmp/slave1/maven3-agent.jar:/usr/maven3/boot/plexus-classworlds-2.4.jar, org.jvnet.hudson.maven3.agent.Maven3Main, /usr/maven3, /tmp/slave1/slave.jar, /tmp/slave1/maven3-interceptor.jar, /tmp/slave1/maven3-interceptor-commons.jar, 54343]" daemon prio=10 tid=0x00007f29f0033000 nid=0x6171 in Object.wait() [0x00007f29f62be000]
at hudson.remoting.Request.call(Request.java:146)
- locked <0x00000007da297040> (a hudson.remoting.UserRequest)
at hudson.remoting.Channel.call(Channel.java:722)
at hudson.maven.SplittableBuildListener.synchronizeOnMark(SplittableBuildListener.java:145)
- locked <0x000000078b996388> (a java.lang.Object)
at hudson.maven.MavenBuild$ProxyImpl2.sync(MavenBuild.java:606)
at hudson.maven.MavenBuild$ProxyImpl2.start(MavenBuild.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at hudson.model.Executor$1.call(Executor.java:559)
at hudson.util.InterceptingProxy$1.invoke(InterceptingProxy.java:23)
at com.sun.proxy.$Proxy44.start(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:299)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:280)
Compare: https://github.com/jenkinsci/maven-plugin/compare/a0d0100183b4...606dd38942c2
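The shape of this deadlock is: one thread holds a lock while blocking on a response, and the only thread able to deliver that response needs the same lock first. One general way to avoid that shape is to make the blocking call without holding the lock and only then wait on the lock's monitor, since `Object.wait` releases the monitor while waiting. The sketch below illustrates that idea; it is not necessarily what commit 606dd389 does, and `MarkSyncSketch`, `sendMark` and the single-threaded `ioLane` are hypothetical stand-ins for the real classes.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class MarkSyncSketch {
    private final Object markCountLock = new Object();
    private int markCount = 0;

    /** Single daemon thread standing in for the I/O writer lane that delivers marks and responses. */
    private final ExecutorService ioLane = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "io-lane");
        t.setDaemon(true);
        return t;
    });

    /** Runs on the I/O lane when the mark shows up in the output stream. */
    private void onMarkFound() {
        synchronized (markCountLock) {
            markCount++;
            markCountLock.notifyAll();
        }
    }

    /** Hypothetical stand-in for the remote call: the mark is processed before the response completes. */
    private Future<?> sendMark() {
        Runnable deliverMark = this::onMarkFound;
        return ioLane.submit(deliverMark);
    }

    /** Returns true if the mark was observed within the timeout. */
    boolean synchronizeOnMark(long timeoutMillis) {
        try {
            int start;
            synchronized (markCountLock) {
                start = markCount;
            }
            // Block on the response WITHOUT holding markCountLock, so the I/O lane can take it.
            sendMark().get(timeoutMillis, TimeUnit.MILLISECONDS);

            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            synchronized (markCountLock) {
                while (markCount == start) {
                    long leftNanos = deadline - System.nanoTime();
                    if (leftNanos <= 0) {
                        return false;
                    }
                    // timedWait releases the monitor while waiting for notifyAll()
                    TimeUnit.NANOSECONDS.timedWait(markCountLock, leftNanos);
                }
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }
}
```

If `sendMark().get(...)` were moved inside a `synchronized (markCountLock)` block, the I/O lane would block in `onMarkFound` and the call would time out, which mirrors the two threads in the dump above.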
This pattern of switching from pull to push is something the remoting library should support. This change was added in https://github.com/jenkinsci/remoting/commit/42ed097929beac8b588d90e7df092847e10c0a67 and several following commits.
Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
core/src/main/java/hudson/FilePath.java
core/src/main/java/hudson/Launcher.java
core/src/main/java/hudson/cli/ClientAuthenticationCache.java
core/src/main/java/hudson/os/SU.java
core/src/main/java/hudson/slaves/SlaveComputer.java
core/src/main/java/hudson/util/ProcessTree.java
core/src/main/java/jenkins/model/Jenkins.java
core/src/test/java/hudson/LauncherTest.java
http://jenkins-ci.org/commit/jenkins/0b27f364f99ec98cb616d4f0cfc0858ecf339852
Log:
JENKINS-22354
Avoid having FilePath to depend on Jenkins, which increases the amount of classloading that has to happen during remoting
Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
core/src/main/java/hudson/model/Result.java
http://jenkins-ci.org/commit/jenkins/8a530bb14b4325ff1a75ce89b63b53b7271a6702
Log:
JENKINS-22354
Reduce classloader activities.
This brings in Stapler and commons bean-utils that we don't need on the
slaves.
Compare: https://github.com/jenkinsci/jenkins/compare/dcfeaf3c2a12...8a530bb14b43
Integrated in jenkins_main_trunk #3280
JENKINS-22354 (Revision 0b27f364f99ec98cb616d4f0cfc0858ecf339852)
JENKINS-22354 (Revision 8a530bb14b4325ff1a75ce89b63b53b7271a6702)
Result = UNSTABLE
kohsuke : 0b27f364f99ec98cb616d4f0cfc0858ecf339852
Files :
- core/src/main/java/jenkins/model/Jenkins.java
- core/src/main/java/hudson/FilePath.java
- core/src/main/java/hudson/cli/ClientAuthenticationCache.java
- core/src/main/java/hudson/slaves/SlaveComputer.java
- core/src/test/java/hudson/LauncherTest.java
- core/src/main/java/hudson/util/ProcessTree.java
- core/src/main/java/hudson/os/SU.java
- core/src/main/java/hudson/Launcher.java
kohsuke : 8a530bb14b4325ff1a75ce89b63b53b7271a6702
Files :
- core/src/main/java/hudson/model/Result.java
Hi Kohsuke,
any updates on this issue? We have a large project that we want to build on all 24 cores of our machine, but it always gets stuck on that race condition in SplittableBuildListener. The only "solution" is to disable parallel Maven builds.
Thanks in advance!
Max
maxzilla: It's not resolved in Maven plugin 2.2? The commit seems to indicate that the issue is fixed.
No, unfortunately it doesn't seem to be fixed. We use Jenkins 1.562 and Maven plugin 2.3, so the fix should be deployed. Is there anything I have to keep in mind for the slave's configuration? We restarted the master and the slaves, but the issue still exists.
Yeah, this is still the case in Maven plugin 2.4-SNAPSHOT as of today - I opened JENKINS-23340 but then found this.
I've created a dummy single-module Maven project that produces a 16MB artifact consisting of random bytes, in an attempt to reproduce the artifact archiving overhead reported by some.