Currently each attempt to load a class in a remote JVM makes a round-trip request to the master, which over a laggy network can make class loading quite slow, thus adding considerable overhead to the first build on a new slave.

      Two possible solutions have been put forward.

      Optimistic prefetch

      The idea: when sending a class file to be loaded, scan its bytecode for other statically linked classes which have not yet been loaded, and send those along as well. In the common case that a network of classes is loaded around the same time, this would avoid some round trips.

      Details

      The first part is for one side to keep track of which classes the other side has already loaded (and into which class loader), which basically means remembering the responses from IClassLoader.fetch2.
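The bookkeeping described above might look roughly like the following. This is only a sketch: `RemoteLoadTracker` and the string loader identifier are hypothetical names chosen for illustration, not part of the remoting API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical bookkeeping: which classes has the other side already received, per class loader? */
final class RemoteLoadTracker {
    // Key: an opaque identifier for a class loader on the other side (an assumption of this sketch).
    private final Map<String, Set<String>> sentByLoader = new HashMap<>();

    /** Records that className was sent to the given loader; returns true only the first time. */
    synchronized boolean markSent(String loaderId, String className) {
        return sentByLoader.computeIfAbsent(loaderId, k -> new HashSet<>()).add(className);
    }

    /** True if className was previously recorded for the given loader. */
    synchronized boolean alreadySent(String loaderId, String className) {
        Set<String> sent = sentByLoader.get(loaderId);
        return sent != null && sent.contains(className);
    }
}
```

The same class name is tracked separately per loader, since a class loaded into one class loader says nothing about another.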

      The second part is to add Collection<ClassFile> IClassLoader.fetch3 that works like fetch2, except it will also parse the class, figure out some of the referenced classes that are not yet loaded by the other side, then send them along.
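To make "parse the class, figure out some of the referenced classes" concrete, here is a minimal sketch that reads only the constant pool of a class file and collects the class names it mentions. It uses the JDK alone rather than a bytecode library; `ClassRefScanner` is a hypothetical name, and a real fetch3 would still have to filter the result against what the other side has already loaded.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: lists classes named in a class file's constant pool. */
final class ClassRefScanner {
    static Set<String> referencedClasses(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) throw new IllegalArgumentException("not a class file");
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort();
            String[] utf8 = new String[count];
            List<Integer> classNameIndexes = new ArrayList<>();
            for (int i = 1; i < count; i++) { // constant pool indexes start at 1
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;                        // Utf8
                    case 7: classNameIndexes.add(in.readUnsignedShort()); break;  // Class
                    case 5: case 6: in.skipBytes(8); i++; break;                  // Long, Double use two slots
                    case 8: case 16: case 19: case 20: in.skipBytes(2); break;    // String, MethodType, Module, Package
                    case 15: in.skipBytes(3); break;                              // MethodHandle
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4); break;                                   // 4-byte entries
                    default: throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
            Set<String> names = new TreeSet<>();
            for (int idx : classNameIndexes) {
                String n = utf8[idx];
                // Array types such as "[Ljava/lang/String;" are skipped in this sketch.
                if (n != null && !n.startsWith("[")) names.add(n.replace('/', '.'));
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The result includes the scanned class itself, because its own name is a constant pool entry too. Classes that appear only inside field or method descriptors are not reported, which fits the "some of the referenced classes" wording.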

      Those prefetched class files would need to be remembered by RemoteClassLoader so that when those are actually requested it can load a class in the right classloader without calling back RemoteClassLoader.proxy. (Assuming there are no side effects, it could also eagerly call RemoteClassLoader.loadClassFile on the prefetched classes.)
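A rough sketch of that receiving-side cache, under simplifying assumptions: `PrefetchCache` is a hypothetical class, the `remoteFetch` function stands in for the round trip to the other side, and the sketch hands back raw class file bytes rather than defining classes in a particular class loader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical receiving-side cache of class files that arrived before they were requested. */
final class PrefetchCache {
    private final Map<String, byte[]> prefetched = new HashMap<>();
    private final Function<String, byte[]> remoteFetch; // stand-in for the network round trip
    private int roundTrips;

    PrefetchCache(Function<String, byte[]> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Stores a class file that was sent along speculatively. */
    synchronized void remember(String className, byte[] classFile) {
        prefetched.put(className, classFile);
    }

    /** Returns the class file, using the cache first and a round trip only on a miss. */
    synchronized byte[] classFileFor(String className) {
        byte[] cached = prefetched.remove(className);
        if (cached != null) return cached;
        roundTrips++;
        return remoteFetch.apply(className);
    }

    synchronized int roundTrips() {
        return roundTrips;
    }
}
```

Entries are removed on first use because a class is only defined once per loader; after that the bytes are no longer needed.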

      The remoting layer supports talking to an earlier version of the remoting layer. We do this by a bitmask in Capability, so this needs one more bit defined there. There is no point in tracking the classes the other side has loaded if the other side will never call fetch3.
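The bitmask check could be sketched as below. The class name `CapabilitySketch` and both bit positions are made up for illustration; they are not the actual values or members of Capability.

```java
/** Hypothetical stand-in for a capability bitmask exchanged when a channel is set up. */
final class CapabilitySketch {
    // Bit positions here are assumptions for illustration only.
    static final long MASK_EXISTING_FEATURE = 1L << 0;
    static final long MASK_PREFETCH = 1L << 1;

    private final long mask;

    CapabilitySketch(long mask) {
        this.mask = mask;
    }

    /** True if the peer advertising this mask would ever call fetch3. */
    boolean supportsPrefetch() {
        return (mask & MASK_PREFETCH) != 0;
    }
}
```

A side that sees `supportsPrefetch()` return false for its peer can skip tracking loaded classes entirely, as noted above.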

      Bulk transfer

      Send entire JAR files at a time, rather than individual classes; can wind up transferring more than is needed, but the reduction in latency is probably worth it. Since arbitrary class loader graphs might be in use, not just a flat classpath, some custom code needs to be run remotely which will implement the class loader delegation model without hitting the network for each class.
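As an illustration of resolving delegation locally, the sketch below covers only the simplest case, a parent-first chain of loaders whose contents are already known on the remote side. `LoaderNode` is a hypothetical type; an arbitrary class loader graph would need custom logic in place of the parent walk.

```java
import java.util.Set;

/** Hypothetical remote-side description of one class loader and the resources its JARs contain. */
final class LoaderNode {
    final String name;
    final LoaderNode parent;     // null for the root of the chain
    final Set<String> resources; // resource names available in this loader's own JARs

    LoaderNode(String name, LoaderNode parent, Set<String> resources) {
        this.name = name;
        this.parent = parent;
        this.resources = resources;
    }

    /** Parent-first lookup of the defining loader, done without any network traffic. */
    static LoaderNode definingLoader(LoaderNode origin, String resourceName) {
        if (origin == null) return null;
        LoaderNode fromParent = definingLoader(origin.parent, resourceName);
        if (fromParent != null) return fromParent;
        return origin.resources.contains(resourceName) ? origin : null;
    }
}
```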

      Details

      An API sketch:

      class Channel {
        void setClassLoaderTrafficCop(TrafficCop cop);
      }
      interface TrafficCop {
        /** do I know/control/own this classloader? */
        boolean controls(ClassLoader cl);
        Set<JarFile> getJarFilesOf(ClassLoader cl);
        RemotePartOfTrafficCop getRemotePart();
      }
      class JarFile {
        String checksum();
        InputStream data();
      }
      /** runs in remote agent */
      interface RemotePartOfTrafficCop extends Serializable {
        /** given a class/resource and an originating loader, what is the defining loader? */
        RemoteClassLoader trafficControl(RemoteClassLoader origin, String resourceName);
      }
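One plausible use of `JarFile.checksum()` in the sketch above is to avoid resending a JAR the other side already holds; the issue does not spell this out, so treat it as an assumption. The following illustrates that idea with a hypothetical `JarChecksumCache` and SHA-256 as an arbitrarily chosen digest.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical record of which JAR contents the other side already has, keyed by checksum. */
final class JarChecksumCache {
    private final Set<String> present = new HashSet<>();

    /** Hex-encoded SHA-256 of the JAR bytes (the digest choice is an assumption of this sketch). */
    static String checksum(byte[] jarData) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(jarData);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) hex.append(String.format("%02x", b & 0xff));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Returns true if the JAR still has to be transferred, and records it as present. */
    synchronized boolean needsTransfer(byte[] jarData) {
        return present.add(checksum(jarData));
    }
}
```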
      

          [JENKINS-15120] Minimize round trips for slave class loading

          Jesse Glick created issue -

          Jesse Glick added a comment -

          Can add a dependency from remoting on ASM via http://kohsuke.org/2012/03/03/potd-package-renamed-asm/ if needed for prefetch.


          Code changed in jenkins
          User: Jesse Glick
          Path:
          src/main/java/hudson/remoting/Channel.java
          http://jenkins-ci.org/commit/remoting/2b1ec8ab152805f01b4063dabc4dcdef64421fed
          Log:
          JENKINS-15120 Kohsuke’s explanation of why preloadJar does not really help.

          Jesse Glick made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

          Jesse Glick added a comment - https://github.com/jenkinsci/remoting/pull/10

          Jesse Glick added a comment -

          https://github.com/jenkinsci/mock-slave-plugin useful for testing impact on performance more controllably than simply connecting to some node in the cloud.


          Jesse Glick added a comment -

          remoting #37248f1 also suggests using

          sudo tc qdisc add dev lo root netem delay 100ms
          

          to simulate a laggy network.

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-16261 [ JENKINS-16261 ]

          Jesse Glick added a comment -

          Using mock-slave with 10ms latency and building a multimodule Maven project in current trunk I get

          Loading Type | Time (s) | Count
          Classes      | 223.8    | 2993
          Resources    | 2.7      | 23

          With prefetch-JENKINS-15120 from Jenkins core checked out (which pulls in a branch of the same name from remoting), this was

          Loading Type | Time (s) | Count
          Classes      | 165.5    | 3361 (prefetch cache: 1651)
          Resources    | 2.3      | 26

          though timing is not exactly comparable since the branch currently enables rather verbose logging which slows down the connection.


          Jesse Glick added a comment -

          Retesting with svn up for both builds (originally the first used co), and with verbose logging turned off in the branch. The trunk build takes 6:02 min (of which the Maven build itself was 1:26):

          Loading Type | Time (s) | Count
          Classes      | 243.2    | 3320
          Resources    | 2.6      | 26

          I tried to run the branch build again but this time it failed with a java.lang.OutOfMemoryError: PermGen space (on the master) which is disconcerting; did the branch introduce some kind of class loader leak?

          I retried it, this time succeeding in 4:10 (Maven build 1:06):

          Loading Type | Time (s) | Count
          Classes      | 141.0    | 3323 (prefetch cache: 1633)
          Resources    | 2.2      | 26

          So that is a 31% reduction in build time, which I would say is pretty good.
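For reference, the 31% figure follows from the two total build times quoted in this comment (6:02 and 4:10). A small sketch of the arithmetic, with `BuildTimeReduction` as a hypothetical helper name:

```java
/** Hypothetical helper reproducing the percentage arithmetic from the timings in this comment. */
final class BuildTimeReduction {
    /** Converts a min:sec duration to seconds. */
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    /** Relative reduction from before to after, as a percentage of before. */
    static double reductionPercent(double before, double after) {
        return 100.0 * (before - after) / before;
    }
}
```

With 6:02 = 362 s and 4:10 = 250 s, `reductionPercent(362, 250)` is about 30.9, which rounds to the quoted 31%. Applied to the class loading times alone (243.2 s versus 141.0 s), the same formula gives about 42%.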

          BTW using https://svn.codehaus.org/mojo/trunk/mojo/nbm-maven as the test project, more or less arbitrarily. Intentionally using a native Maven project since that puts far more load on the remoting layer than a freestyle project.


            kohsuke Kohsuke Kawaguchi
            jglick Jesse Glick