Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-33273

Optimize Jenkinsfile loading and branch detection


      Currently if use the Git branch source, your repository gets cloned three times:

      1. Once to determine the head revision of the branch.
      2. Once to load Jenkinsfile.
      3. Once when you checkout scm as part of your build. (Normally once. Could be multiple times, or even zero.)

      This could be a performance issue for large repositories. What we would rather do in the second step is use SCMFileSystem to locate Jenkinsfile. Unfortunately this is not currently implemented in any of the current scm-api clients (git, subversion, mercurial, github-branch-source).

      One of the first two clones would still be necessary for Git since the remote protocol functionality here is limited to git-ls-remote, which could yield the head revision needed for step one (which is currently implemented using a local cache of the repository), but not the contents of Jenkinsfile (which, unlike for Literate, we need to have before the build starts). The Git plugin could be enhanced to avoid needing a cache for the first step, but this would not help workflow-multibranch at all since it would wind up needing it for an implementation of SCMFileSystem in the second step anyway.

      For Mercurial, the situation is somewhat similar in that the wire protocol does not support remote file access. It might in principle support remote head revision calculation, but not using the standard hg binary, which this plugin is based on. So the Mercurial plugin when used as a branch source just requests that you enabling the "caching" feature, which existed long before this, and which maintains a local clone of the repository. That could serve as the implementation of SCMFileSystem as well as providing the tip revision. Unlike with the Git plugin, the Mercurial plugin will actually reuse this cache during a regular workspace checkout, which has some advantages—for example, your slave need not be able to make a direct connection to the Mercurial server, so long as the master can—but may also be undesirable for performance, since it uses the Jenkins remoting channel. Probably the Mercurial plugin needs an intermediate mode, whereby caches are maintained on both master and slave, yet the slave cache is synchronized directly with the remote server rather than with the master. Or whereby the master maintains a cache, but the slave does not use caching at all.

      For Subversion the situation is simpler since the wire protocol (and, as far as I know, SVNKit) supports both remote head revision calculation and remote file retrieval. Therefore the only actual checkout would be in the user workspace.

      Note that in principle the first two steps could be collapsed: check out the SCM including Jenkinsfile to the master's jobname@script workspace, as with pre-multibranch CpsScmFlowDefinition, and then inspect the checkout after the fact to find its revision somehow for SCMBinder: for example, using git rev-parse HEAD, or hg log -r . --template '{node}\n', or svnversion .. Unfortunately scm-api offers no generic way of doing this; you need to call build with an SCMRevision but that revision can only be gotten by the repository inspection. So if this approach is to be taken, a new API would need to be introduced and implemented in the major SCM plugins. Anyway this approach is less desirable for the case of a massive working copy.

            jglick Jesse Glick
            jglick Jesse Glick
            59 Vote for this issue
            91 Start watching this issue