A user working with files in the 100+ MB range asked whether fingerprinting them was expensive. It is probably not that expensive, but the implementation could likely be optimized and should at least be measured.

      As far as the I/O goes, FilePath.digest uses FileInputStream rather than NIO—hmm, maybe OK. It does not use buffering, which may or may not be an issue here. Then it calls Util.getDigestOf.

      That uses the JRE’s MD5 implementation, which AFAIK takes advantage of any cryptographic hardware acceleration when available. It does read into a 1024-byte buffer, however, which seems excessively small. (And it uses a proxy stream rather than simply calling the update method, which is odd.) http://stackoverflow.com/questions/9321912/very-slow-when-generaing-md5-using-java-with-large-file suggests using a custom library, though this may be overkill; others suggest DigestUtils from Commons Codec.
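      For illustration, here is a minimal sketch of what "simply calling the update method" with a larger buffer could look like, using only the JRE's MessageDigest. The class name Md5DirectUpdate, the md5Hex helper, and the 64 KiB buffer size are assumptions made for this example; this is not the actual Util.getDigestOf code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of Jenkins.
class Md5DirectUpdate {

    /**
     * Returns the lowercase hex MD5 of everything readable from the stream.
     * Each chunk is passed straight to MessageDigest.update, with no
     * DigestInputStream in between. bufferSize must be positive.
     */
    static String md5Hex(InputStream in, int bufferSize) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            md5.update(buffer, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("md5-sketch", ".bin");
        try {
            Files.write(file, "abc".getBytes(StandardCharsets.US_ASCII));
            try (InputStream in = Files.newInputStream(file)) {
                // 64 KiB is an arbitrary choice for this sketch.
                System.out.println(md5Hex(in, 64 * 1024));
                // prints 900150983cd24fb0d6963f7d28e17f72 (RFC 1321 test vector for "abc")
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

      Whether the larger buffer or the direct update actually helps is something that would need to be measured.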

          [JENKINS-16301] Fingerprint performance

          Jesse Glick created issue -
          kutzi made changes -
          Link New: This issue is related to JENKINS-11814 [ JENKINS-11814 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-7813 [ JENKINS-7813 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-13154 [ JENKINS-13154 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-17412 [ JENKINS-17412 ]

          Jesse Glick added a comment -

          Fingerprint storage seems to spend most of its time in XStream overhead.


          Jesse Glick added a comment -

          Baseline average time (in 1.517-SNAPSHOT) of Util.getDigestOf(new FileInputStream("…/jenkins-war-1.509.1.war")): 220msec.

          Using a BufferedInputStream with default buffer size: 193msec.

          Using direct buffer update rather than DigestInputStream saves nothing, nor does using MappedByteBuffer on FileChannel, nor does using DigestUtils, nor does using a larger temporary buffer.

          Fast MD5 Implementation in Java takes 183msec, which is barely faster, but it requires JNI and is LGPL.

          So I think I will switch to DigestUtils just to reduce custom code, and add buffering.
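          A comparison like the one above could be reproduced with a small timing harness along these lines. This is only a sketch: the class DigestTiming, its method names, the 1024-byte scratch buffer, the run count, and the 16 MiB generated test file are all assumptions for illustration, and it is not the harness used for the numbers quoted here. Pass a real file path as the first argument to measure that file instead.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical micro-benchmark, not part of Jenkins.
class DigestTiming {

    /** Drains the stream through a DigestInputStream and returns the raw MD5 bytes. */
    static byte[] digestVia(InputStream source) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        byte[] scratch = new byte[1024]; // small read buffer, as described in the issue
        try (DigestInputStream in = new DigestInputStream(source, md5)) {
            while (in.read(scratch) != -1) {
                // reading is enough; DigestInputStream updates the digest as a side effect
            }
        }
        return md5.digest();
    }

    static byte[] unbuffered(Path file) throws IOException {
        return digestVia(new FileInputStream(file.toFile()));
    }

    static byte[] buffered(Path file) throws IOException {
        // BufferedInputStream with its default buffer size
        return digestVia(new BufferedInputStream(new FileInputStream(file.toFile())));
    }

    interface Variant {
        byte[] run(Path file) throws IOException;
    }

    /** Average wall-clock milliseconds per run, after one untimed warm-up run. */
    static double averageMillis(Variant variant, Path file, int runs) throws IOException {
        variant.run(file);
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            variant.run(file);
        }
        return (System.nanoTime() - start) / 1e6 / runs;
    }

    public static void main(String[] args) throws IOException {
        boolean generated = args.length == 0;
        Path file = generated ? Files.createTempFile("digest-timing", ".bin") : Paths.get(args[0]);
        try {
            if (generated) {
                Files.write(file, new byte[16 * 1024 * 1024]); // 16 MiB of zeros
            }
            System.out.printf("unbuffered: %.1f msec%n", averageMillis(DigestTiming::unbuffered, file, 10));
            System.out.printf("buffered:   %.1f msec%n", averageMillis(DigestTiming::buffered, file, 10));
        } finally {
            if (generated) {
                Files.delete(file);
            }
        }
    }
}
```

          Timings from a harness this simple depend heavily on the OS file cache and JIT warm-up, so they should be read as rough indications only.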

          Jesse Glick made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

          Jesse Glick added a comment -

          Since the main problem from reports I have seen is Fingerprint.save calling complex code in XStream, with other threads waiting for the monitor, I am looking into whether a hand-coded “fast path” can perform better.


          Code changed in jenkins
          User: Jesse Glick
          Path:
          core/src/main/java/hudson/Util.java
          http://jenkins-ci.org/commit/jenkins/51fbd2d8675fb3703ce38d935e661abf03e1b83b
          Log:
          JENKINS-16301 Replace impl of getDigestOf with standard (Commons Codec) DigestUtils.md5Hex, for simplicity.

