
Optimize Jenkinsfile loading and branch detection

      Currently, if you use the Git branch source, your repository gets cloned three times:

      1. Once to determine the head revision of the branch.
      2. Once to load Jenkinsfile.
      3. Once when you checkout scm as part of your build. (Normally once. Could be multiple times, or even zero.)

      This could be a performance issue for large repositories. What we would rather do in the second step is use SCMFileSystem to locate Jenkinsfile. Unfortunately, SCMFileSystem is not yet implemented in any of the current scm-api clients (git, subversion, mercurial, github-branch-source).

      One of the first two clones would still be necessary for Git since the remote protocol functionality here is limited to git-ls-remote, which could yield the head revision needed for step one (which is currently implemented using a local cache of the repository), but not the contents of Jenkinsfile (which, unlike for Literate, we need to have before the build starts). The Git plugin could be enhanced to avoid needing a cache for the first step, but this would not help workflow-multibranch at all since it would wind up needing it for an implementation of SCMFileSystem in the second step anyway.
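To illustrate the first step without a clone, here is a minimal Python sketch that shells out to git ls-remote and picks the head revision of one branch from its output. The function names parse_ls_remote and remote_head_revision are hypothetical and are not part of the Git plugin or scm-api. As the paragraph above says, this yields only the revision, not the contents of Jenkinsfile.

```python
import subprocess
from typing import Dict, Optional


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map each ref name to its commit hash.

    `git ls-remote` prints one "<sha>\\t<ref>" pair per line.
    """
    refs: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        refs[ref.strip()] = sha.strip()
    return refs


def remote_head_revision(url: str, branch: str) -> Optional[str]:
    """Ask the remote for the head of `branch` without cloning (hypothetical helper)."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", url, branch],
        check=True,
        capture_output=True,
        text=True,
    )
    # The pattern can also match refs such as refs/heads/foo/<branch>,
    # so look up the exact ref name rather than taking the first line.
    return parse_ls_remote(result.stdout).get(f"refs/heads/{branch}")
```

Calling remote_head_revision with a repository URL and a branch name returns the commit hash, or None if the branch does not exist on the remote.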

      For Mercurial, the situation is somewhat similar in that the wire protocol does not support remote file access. It might in principle support remote head revision calculation, but not using the standard hg binary, which this plugin is based on. So the Mercurial plugin, when used as a branch source, just requests that you enable the "caching" feature, which existed long before this and which maintains a local clone of the repository. That could serve as the implementation of SCMFileSystem as well as providing the tip revision. Unlike the Git plugin, the Mercurial plugin will actually reuse this cache during a regular workspace checkout. This has some advantages (for example, your slave need not be able to make a direct connection to the Mercurial server, so long as the master can) but may also be undesirable for performance, since it uses the Jenkins remoting channel. Probably the Mercurial plugin needs an intermediate mode, whereby caches are maintained on both master and slave, yet the slave cache is synchronized directly with the remote server rather than with the master; or one whereby the master maintains a cache, but the slave does not use caching at all.
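The caching arrangements discussed above can be summarized in a small Python model. This is only a sketch for comparison: CacheMode and slave_plan are hypothetical names, and none of these modes is claimed to be an existing Mercurial plugin option beyond the current "caching" behavior described in the text.

```python
from enum import Enum
from typing import Tuple


class CacheMode(Enum):
    # Current behavior: the slave cache is synchronized with the master cache
    # over the Jenkins remoting channel.
    SHARED_VIA_MASTER = "shared-via-master"
    # Proposed: caches on both sides, slave cache syncs with the remote server.
    BOTH_CACHED_DIRECT = "both-cached-direct"
    # Proposed: only the master keeps a cache; the slave clones normally.
    MASTER_ONLY = "master-only"


def slave_plan(mode: CacheMode) -> Tuple[bool, str]:
    """Return (slave_keeps_cache, where_the_slave_fetches_from)."""
    if mode is CacheMode.SHARED_VIA_MASTER:
        return (True, "master")
    if mode is CacheMode.BOTH_CACHED_DIRECT:
        return (True, "remote")
    if mode is CacheMode.MASTER_ONLY:
        return (False, "remote")
    raise ValueError(f"unknown mode: {mode!r}")
```

In all three modes the master keeps a cache, which is what would back SCMFileSystem and the tip revision lookup; the modes differ only in what the slave does.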

      For Subversion the situation is simpler since the wire protocol (and, as far as I know, SVNKit) supports both remote head revision calculation and remote file retrieval. Therefore the only actual checkout would be in the user workspace.
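As a rough illustration of the two remote operations, the following Python sketch builds the corresponding svn command lines. It uses the svn command-line client in place of SVNKit purely for illustration, and the helper names are hypothetical. The --show-item option assumes Subversion 1.9 or later.

```python
import subprocess
from typing import List


def svn_head_revision_cmd(repo_url: str) -> List[str]:
    """Command that prints the revision at a remote URL (no working copy needed)."""
    return ["svn", "info", "--show-item", "revision", repo_url]


def svn_cat_cmd(repo_url: str, path: str, revision: str) -> List[str]:
    """Command that prints one remote file, pinned to a peg revision (URL@REV)."""
    return ["svn", "cat", f"{repo_url.rstrip('/')}/{path}@{revision}"]


def run(cmd: List[str]) -> str:
    """Run a command and return its standard output with whitespace trimmed."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def fetch_jenkinsfile(repo_url: str) -> str:
    """Resolve the revision first, then read Jenkinsfile at exactly that revision."""
    revision = run(svn_head_revision_cmd(repo_url))
    return run(svn_cat_cmd(repo_url, "Jenkinsfile", revision))
```

Pinning the file read to the revision obtained in the first call keeps the two steps consistent even if a commit lands in between.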

      Note that in principle the first two steps could be collapsed: check out the SCM including Jenkinsfile to the master's jobname@script workspace, as with pre-multibranch CpsScmFlowDefinition, and then inspect the checkout after the fact to find its revision somehow for SCMBinder: for example, using git rev-parse HEAD, or hg log -r . --template '{node}\n', or svnversion on the checkout directory. Unfortunately scm-api offers no generic way of doing this; you need to call build with an SCMRevision, but that revision can only be obtained by inspecting the repository. So if this approach is to be taken, a new API would need to be introduced and implemented in the major SCM plugins. In any case, this approach is less desirable for the case of a massive working copy.
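The after-the-fact inspection could look roughly like the Python sketch below, which runs the per-SCM commands mentioned above inside an existing checkout. REVISION_COMMANDS and checked_out_revision are hypothetical names; this is not the scm-api interface, which as noted has no such generic call.

```python
import subprocess
from typing import Dict, List

# One inspection command per SCM, taken from the commands named in the text.
REVISION_COMMANDS: Dict[str, List[str]] = {
    "git": ["git", "rev-parse", "HEAD"],
    # hg expands the two-character sequence \n in the template itself.
    "hg": ["hg", "log", "-r", ".", "--template", "{node}\\n"],
    "svn": ["svnversion", "."],
}


def checked_out_revision(kind: str, workspace: str) -> str:
    """Return the revision of an existing checkout located at `workspace`."""
    try:
        cmd = REVISION_COMMANDS[kind]
    except KeyError:
        raise ValueError(f"unsupported SCM kind: {kind!r}") from None
    result = subprocess.run(
        cmd, cwd=workspace, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Each SCM needs its own command and returns its own revision format, which is why a generic version of this would need a new API implemented by each plugin.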

          [JENKINS-33273] Optimize Jenkinsfile loading and branch detection

          Viacheslav Dubrovskyi added a comment -

          That is the case. It runs git in parallel for all detected PRs, and that is the problem.

          Yes, we use a monolithic repository. Don't ask why. Yes, we need to keep some third-party libraries.

          I still need a solution. How can I remove PR-* without running indexing?

          Roman Bäriswyl added a comment -

          We had a similar problem. What I ended up doing was cleaning all the third-party libraries out of the repository with BFG Repo-Cleaner and then adding a submodule containing those libraries. This has the benefit that the original repo is small and, when checking out, only the latest version of the submodule needs to be pulled rather than the entire history.

          Jesse Glick added a comment -

          Please stick to the user list (or stackoverflow.com, #jenkins IRC, etc.) when discussing tips and usage questions.


          trejkaz added a comment -

          If we're saying this is fixed, can I assume that the 30 second wait time to check each Jenkinsfile is some other performance issue which is tracked in another ticket?

           


          Steve Berube added a comment -

          Would love to see this fixed too. One of our repos is gigantic and the prep-phase takes forever just to get a single jenkinsfile.

           


          Steve Berube added a comment - - edited

          The issue appears to be fixed on main branches, however pull requests still appear to be checking out the entire repo vs just getting the Jenkinsfile.

           

          e.g. Pull Request.

          Pull request #14924 opened
          05:15:28 Connecting to https://github.houston.softwaregrp.net/api/v3 using steve-berube/****** (GITHUB Service Account (Using Steve Berube))
          Checking out git https://github.houston.softwaregrp.net/CSA/csa.git into /var/jenkins_home/workspace/A_CSA-PIPELINE_csa_PR-14924-Z4QJL7VGNYZFORO2QDHYKCLF6JHNT2KX6T7GPPU2XP5ECHCOF7EQ@script to read Jenkinsfile
          Cloning the remote Git repository
          

           

          E.g. Non-pull request.

          originally caused by:
          Push event to branch v04.93.000
          22:58:31 Connecting to https://github.houston.softwaregrp.net/api/v3 using steve-berube/****** (GITHUB Service Account (Using Steve Berube))
          Obtained Jenkinsfile from 2bfe852c7aad46b7dc90ffb6e53c2b177f07ae00
          Running in Durability level: MAX_SURVIVABILITY
          

           

          Is this a limitation or a defect?

           


          Steve Berube added a comment -

          Seems a limitation: https://support.cloudbees.com/hc/en-us/articles/115002991272-Why-is-my-multibranch-project-cloning-the-whole-repository-on-the-master-

          Steve Berube added a comment -

          One more update. If you configure your PR strategy to be "Build Pull Request Revision", this works around the issue, and it can read the Jenkinsfile via the API.

           


          Adam Bialas added a comment -

          I configured my PR strategy to Build Pull Request Revision. However, it throws an exception:
          ERROR: Could not do lightweight checkout, falling back to heavyweight
          java.io.FileNotFoundException: URL: /rest/api/1.0/projects/sample/repos/sample-repo/browse/Jenkinsfile?at=PR-62&start=0&limit=500
          and switches to heavy checkout.


          Jesse Glick added a comment -

          This issue is closed. Please file separate issues with complete steps to reproduce from scratch if you observe any issues using the latest releases of all applicable software that are not already tracked in JIRA.


            jglick Jesse Glick