[JENKINS-67829] JarLoaderImpl is per connection in a controller, causing repeated checksum calculation for jars

      When using ephemeral agents, it was noticed that even after Jenkins had been running for a day there were still threads calculating the digest for jars, with stack traces like the following in the controller:

      "Computer.threadPoolForRemoting [#17077] for JNLP4-connect connection from x.x.x.x/x.x.x.x:49778 id=1547" Id=164040 Group=main RUNNABLE
      	at sun.security.provider.DigestBase.implCompressMultiBlock0(DigestBase.java:147)
      	at sun.security.provider.DigestBase.implCompressMultiBlock(DigestBase.java:142)
      	at sun.security.provider.DigestBase.engineUpdate(DigestBase.java:129)
      	at java.security.MessageDigest$Delegate.engineUpdate(MessageDigest.java:601)
      	at java.security.MessageDigest.update(MessageDigest.java:328)
      	at java.security.DigestOutputStream.write(DigestOutputStream.java:147)
      	at hudson.remoting.Util.copy(Util.java:58)
      	at hudson.remoting.Checksum.forURL(Checksum.java:83)
      	at hudson.remoting.JarLoaderImpl.calcChecksum(JarLoaderImpl.java:92)
      	at hudson.remoting.JarLoaderImpl.calcChecksum(JarLoaderImpl.java:64)
      	at hudson.remoting.RemoteClassLoader$ClassLoaderProxy.fetch4(RemoteClassLoader.java:1014)
      	at hudson.remoting.RemoteClassLoader$ClassLoaderProxy.fetch3(RemoteClassLoader.java:1055)
      

      By that point it is highly unlikely that any jar Jenkins knew about had not already been requested by at least one agent.

      calcChecksum has a cache: https://github.com/jenkinsci/remoting/blob/efb2c4fbe2e3a848f39f96cb99408ea2350a3aa2/src/main/java/hudson/remoting/JarLoaderImpl.java#L88-L97

      However, every remoting channel has its own loader, so in effect the cache is per agent, not per Jenkins. Because class loading only works one way (an agent can load classes from a controller), the cache could be shared. (Jenkins cannot change the jars at runtime without a restart, and all agents see exactly the same jars as all other agents.)

      With many agents connecting, this imposes a scaling limit on Jenkins, as it involves both I/O to load the file and CPU to compute the digest.
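
      To illustrate the idea of a shared cache, here is a minimal sketch of a process-wide checksum cache. It is not the actual remoting code: the class name SharedChecksumCache, its methods, the computation counter, and the choice of SHA-256 are all assumptions made for illustration. The point is only that one static map serves every connection, so each jar is read and digested once per controller JVM rather than once per channel.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical JVM-wide checksum cache; a sketch, not the remoting API. */
final class SharedChecksumCache {
    // A single static map shared by all channels, instead of one map per loader.
    // Keys are URL strings: java.net.URL#hashCode/equals may resolve host names.
    private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

    // Counts real digest computations, only to make the caching observable.
    private static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    private SharedChecksumCache() {}

    /** Returns the hex digest for the jar, computing it only on the first request. */
    static String checksumFor(URL jar) {
        return CACHE.computeIfAbsent(jar.toExternalForm(), key -> digest(jar));
    }

    static int computations() {
        return COMPUTATIONS.get();
    }

    // Reads the whole resource and digests it (SHA-256 is an assumption here).
    private static String digest(URL jar) {
        COMPUTATIONS.incrementAndGet();
        try (InputStream in = jar.openStream()) {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      In this sketch, any number of per-connection loaders could call SharedChecksumCache.checksumFor(url) and only the first call for a given jar would pay the I/O and CPU cost. One trade-off to be aware of: ConcurrentHashMap.computeIfAbsent can block other updates to the same part of the map while the digest is being computed, so a real fix might prefer a different concurrency strategy.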

          Vincent Latombe added a comment -

          Jenkins controller should even cache checksums on disk to avoid recomputing them every time it restarts.

          James Nord added a comment -

          For libraries coming from Jenkins core, the digest could be calculated at build time and stored in the war.

          It could eventually be possible to do the same for plugins, so that the computation is not done at startup or on request, but is already there for all libraries.

          Jesse Glick added a comment -

          Build-time digests seem like overkill. Even caching across controller restarts may be overkill. Seems like the basic fix of making the in-memory cache global rather than per connection should solve 99% of the problem.

            Assignee: Unassigned
            Reporter: James Nord (teilo)