Status: Closed
When artifacts are archived at the end of a build (hudson.tasks.ArtifactArchiver), they get transferred from the slave to the master through a stream of gzipped tar data (jenkins.model.StandardArtifactManager calling hudson.FilePath.copyRecursiveTo). While I understand the intent of this compression (spending some CPU to reduce bandwidth usage), I don't think it is always the right thing to do:
- in some (many?) cases, the biggest artifacts are already compressed (.jar/.deb/.rpm/...), and gzipping the stream won't really reduce its length
- in some cases, bandwidth is much less of an issue than CPU (think co-located master and slave VMs on a very loaded host)
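To illustrate the first point, here is a minimal sketch using only java.util.zip (not Jenkins code). It gzips 1 MiB of pseudo-random bytes, which stand in for an already-compressed artifact, and 1 MiB of zeros, which stand in for highly compressible data. The class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

class GzipGainDemo {
    /** Returns the number of bytes produced by gzipping the given data in memory. */
    static int gzippedSize(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream flushes the trailer into the sink.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(data);
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        int size = 1 << 20; // 1 MiB per sample

        // Assumption: pseudo-random bytes approximate an already-compressed
        // artifact such as a .jar, .deb or .rpm.
        byte[] random = new byte[size];
        new Random(42).nextBytes(random);

        // An all-zero buffer is the opposite extreme: highly compressible.
        byte[] zeros = new byte[size];

        System.out.println("random: " + size + " -> " + gzippedSize(random) + " bytes");
        System.out.println("zeros:  " + size + " -> " + gzippedSize(zeros) + " bytes");
    }
}
```

The random buffer should come out at roughly its original size, while the zero buffer shrinks to a small fraction, so in the first case the CPU spent on compression buys essentially no bandwidth.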
For what it's worth (results would differ a lot on a different platform), I've done some benchmarks of archiving 1GB of random data from a slave to a master:
- "scp" (with aes128-cbc cipher) takes ~25 seconds
- Jenkins 1.596.3 (with a custom slave connector, also using aes128-cbc for its SSH connection) takes ~105 seconds. See flamegraph-1.svg for 30 secs of CPU sampling.
- same Jenkins, plus an ugly hack to disable GZip compression, takes ~65 seconds. See flamegraph-2.svg for 30 secs of CPU sampling.
Unsurprisingly, flamegraph-1.svg shows that a lot of CPU time is spent in jzlib (~38% overall, which amounts to ~76% average usage of one of the two vCPUs). That's what I would like to (optionally) avoid.
My experiment for disabling GZip compression is implemented in a custom ArtifactManager, inheriting from StandardArtifactManager. In the archive method, in the case of a remote->local copy, I've replaced the direct call to FilePath.copyRecursiveTo with a similar implementation, which calls FilePath.writeToTar/FilePath.readFromTar using an uncompressed stream (I've replaced TarCompression.GZIP with TarCompression.NONE). Because these FilePath methods are private, they are invoked by reflection. Really, this was just an experiment...
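For readers unfamiliar with the reflection trick, here is a rough sketch of the general pattern. It does not touch the real hudson.FilePath: PrivateHelperHolder and writeUncompressed are hypothetical stand-ins with an invented signature, used only to show how a private static method can be reached from outside its class.

```java
import java.lang.reflect.Method;

class ReflectionSketch {
    /** Invokes the private static helper on the stand-in class via reflection. */
    static int callPrivate(String baseDir) throws ReflectiveOperationException {
        // Look the method up by name and parameter types, then lift the access check.
        Method m = PrivateHelperHolder.class.getDeclaredMethod("writeUncompressed", String.class);
        m.setAccessible(true);
        // Static method, so the receiver argument is null.
        return (Integer) m.invoke(null, baseDir);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(callPrivate("workspace"));
    }
}

/** Hypothetical stand-in for a class with a private helper; NOT the real hudson.FilePath. */
class PrivateHelperHolder {
    private static int writeUncompressed(String baseDir) {
        // Placeholder result: the real method would stream files and return a count.
        return baseDir.length();
    }
}
```

This works, but it depends on private names and signatures that can change between releases without notice, which is why it is only suitable for an experiment.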
Now, here are the two simplest options I can think of for a "clean" implementation of a GZip-disabling option:
- a boolean system property. I would be more than happy with that. But I might also end up being the only user of such a hidden option.
- a new public variation of the FilePath.copyRecursiveTo method, with an additional boolean parameter. If we have that, then we can implement a gzip-disabling ArtifactManager without copy/pasting code from FilePath and making reflection calls to private methods.
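Both options boil down to the same switch: wrap the transfer stream in GZip or leave it raw. A minimal sketch of that idea in plain Java follows. The property name is hypothetical, invented for this example (it is not an existing Jenkins option), and the in-memory round trip only stands in for the real slave-to-master channel.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class TransferCompressionToggle {
    // Hypothetical property name, made up for this sketch; not an existing Jenkins option.
    static final String DISABLE_GZIP_PROPERTY = "example.artifactTransfer.disableGzip";

    /** Option 1: GZip stays on unless the (hypothetical) property is set to "true". */
    static boolean gzipEnabled() {
        return !Boolean.getBoolean(DISABLE_GZIP_PROPERTY);
    }

    /** Option 2: the caller passes an explicit boolean, as a new method overload would. */
    static OutputStream wrapForSending(OutputStream raw, boolean gzip) throws IOException {
        return gzip ? new GZIPOutputStream(raw) : raw;
    }

    static InputStream wrapForReceiving(InputStream raw, boolean gzip) throws IOException {
        return gzip ? new GZIPInputStream(raw) : raw;
    }

    /** Sends the payload through an in-memory "wire" and reads it back. */
    static byte[] roundTrip(byte[] payload, boolean gzip) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (OutputStream out = wrapForSending(wire, gzip)) {
            out.write(payload);
        }
        try (InputStream in = wrapForReceiving(new ByteArrayInputStream(wire.toByteArray()), gzip)) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "artifact bytes".getBytes(StandardCharsets.UTF_8);
        byte[] received = roundTrip(payload, gzipEnabled());
        System.out.println(new String(received, StandardCharsets.UTF_8));
    }
}
```

Note that the sender and the receiver must agree on the flag. In this sketch one value is passed to both sides. In a real master/slave setup the decision would presumably have to be made on one side and carried along with the request.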
If one of these solutions sounds acceptable, I can submit a PR.
This is the same issue as JENKINS-26008, "Use of compression during copying of artifacts cut throughput by a factor of 3".
Yes, thanks, you're absolutely right, my issue is a duplicate. Closing it.