
Throttle automatic git checkout to prevent timeouts/overload on first indexing

      Problem

      I have a huge repository (~500MB) with many branches (~30), each containing a Jenkinsfile. The first indexing is quite painful.

      Jenkins detects all branches and triggers a build for each one. Each build starts by cloning the git repository on the master to fetch the Jenkinsfile. But these tasks do not use or require a free executor on the master. So the 30 git clones are launched at the same time, each trying to download the same 500MB repository. The result is a really slow Jenkins and a timeout.

      I then have to manually relaunch every branch, one or two at a time, to work around the problem.

      Possible solution

      Is it possible to use an executor on the master during the checkout, in order to throttle the git clones?
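To make the throttling idea concrete, here is a minimal sketch in Python of capping the number of simultaneous checkouts with a semaphore. It is not Jenkins code and does not use any Jenkins API: `simulate`, `fake_checkout` and the `limit` parameter are hypothetical names, and the checkout is replaced by a short sleep.

```python
import threading
import time


def simulate(branches, limit):
    """Start one thread per branch, but let at most `limit` checkouts run at once.

    Returns the highest number of checkouts observed running at the same time.
    """
    slots = threading.BoundedSemaphore(limit)  # plays the role of "free executors"
    counter_lock = threading.Lock()
    running = 0
    peak = 0

    def fake_checkout(branch):
        nonlocal running, peak
        with slots:  # blocks until a slot is free
            with counter_lock:
                running += 1
                peak = max(peak, running)
            time.sleep(0.01)  # stand-in for the real git clone of `branch`
            with counter_lock:
                running -= 1

    threads = [threading.Thread(target=fake_checkout, args=(b,)) for b in branches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    branches = [f"branch-{i}" for i in range(30)]
    print("peak concurrent checkouts:", simulate(branches, limit=2))
```

All 30 requests are still accepted immediately; they simply wait for a slot, which is roughly the manual "one or two at a time" workaround done automatically.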

      How to reproduce it

      • Create a git repository
      • Create a Jenkinsfile with something in it (echo 'hello world')
      • Add a huge file (an ISO of your favorite Linux distribution or something else)
      • Commit it
      • Create 30 branches
      • Add this project in your Jenkins as a Multibranch Pipeline.
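The repository-side steps above could be scripted roughly as follows. This is only a sketch: the file name `big.bin`, the branch names `branch-N`, the commit identity and the helper names are assumptions, and random bytes stand in for the ISO. The last step (adding the project as a Multibranch Pipeline) still has to be done in Jenkins itself.

```python
import os
import subprocess


def repro_commands(branch_count=30):
    """Return the git commands for the reproduction steps as argument lists."""
    commands = [
        ["git", "init"],
        ["git", "add", "Jenkinsfile", "big.bin"],
        # Inline identity so the commit works without any global git config.
        ["git", "-c", "user.name=repro", "-c", "user.email=repro@example.invalid",
         "commit", "-m", "Add Jenkinsfile and large file"],
    ]
    commands += [["git", "branch", f"branch-{i}"] for i in range(1, branch_count + 1)]
    return commands


def create_repo(path, size_mb=500, branch_count=30):
    """Create the test repository at `path` (requires git on the PATH)."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "Jenkinsfile"), "w") as f:
        f.write("echo 'hello world'\n")
    # Random data does not compress, so the clone stays close to size_mb.
    with open(os.path.join(path, "big.bin"), "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))
    for cmd in repro_commands(branch_count):
        subprocess.run(cmd, cwd=path, check=True)


if __name__ == "__main__":
    create_repo("repro-repo", size_mb=500, branch_count=30)
```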

          [JENKINS-37345] Throttle automatic git checkout to prevent timeouts/overload on first indexing

          Ing. Christoph Obexer added a comment

          Actually, IMO it's wrong to clone the same git repository 30 times, and it is wrong to trigger any builds at all.

          IMO, for git a single bare clone per repo URL (across all jobs) should suffice.
          And triggering a build only after the next change following the initial indexing should be a must.

          I definitely vote for improved behavior on first branch indexing.
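The "single bare clone per repo URL" suggestion could look something like the sketch below. It is an illustration only, not how the Jenkins git plugins work: `BareCloneCache`, its methods and the cache directory layout are hypothetical. The git commands themselves (`git clone --bare`, `git fetch` with an explicit refspec, `git show <branch>:<file>`) are standard.

```python
import hashlib
import os
import subprocess
import threading


class BareCloneCache:
    """Keep one bare clone per repository URL and read files from it."""

    def __init__(self, cache_root, run=subprocess.run):
        self.cache_root = cache_root
        self._run = run  # injectable so the logic can be exercised without git
        self._locks = {}
        self._locks_guard = threading.Lock()

    def path_for(self, url):
        # Hash the URL so any URL maps to a safe directory name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
        return os.path.join(self.cache_root, digest + ".git")

    def _lock_for(self, url):
        with self._locks_guard:
            return self._locks.setdefault(url, threading.Lock())

    def ensure(self, url):
        """Clone the URL once; on later calls only fetch new branch heads."""
        path = self.path_for(url)
        with self._lock_for(url):  # one git operation per URL at a time
            if os.path.isdir(path):
                self._run(["git", "--git-dir", path, "fetch", "origin",
                           "+refs/heads/*:refs/heads/*"], check=True)
            else:
                self._run(["git", "clone", "--bare", url, path], check=True)
        return path

    def read_file(self, url, branch, filename="Jenkinsfile"):
        """Return the contents of `filename` on `branch` without a checkout."""
        path = self.ensure(url)
        result = self._run(["git", "--git-dir", path, "show", f"{branch}:{filename}"],
                           check=True, capture_output=True, text=True)
        return result.stdout
```

With this shape, reading the Jenkinsfile of 30 branches costs one full clone plus 29 incremental fetches instead of 30 full clones.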

          Will Freeman added a comment - edited

          I'm also seeing this problem at scale. Even with small (<100MB) git repositories and thousands of PRs over time, this is an issue.

          We need a way to throttle the total number of git processes spawned on the master, as it is currently unbounded and can bring a system down.

          I'm on the latest version of every plugin and it is still a problem. The other issue seems to allude to reducing the number of times a checkout is done, rather than to the fact that hundreds or even thousands of checkouts can start at the same time and that this is not configurable.

          I suggest JENKINS-33273 is NOT a duplicate.


            Assignee: Unassigned
            Reporter: Quentin Dufour (superboum)
            Votes: 4
            Watchers: 7