
Git checkout is slower than the command line execution

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component: git-plugin

      The git checkout done by the plugin is much slower than the command line, even when using reference repositories, shallow clones, etc.

      Running the same commands via the command line is much faster. I am using the latest version of all the plugins.

      As the output below shows, the fetch runs more than once.

      09:53:13 Cloning the remote Git repository
      09:53:13 Using shallow clone
      09:53:13 Avoid fetching tags
      09:53:13 Cloning repository git@xxxxxxx:xxxx/xxxxxxx.git
      09:53:13  > git init /srv/jenkins/workspace/shared-buck-2-master # timeout=10
      09:53:14 Using reference repository: /var/lib/jenkins/reference-repositories/xxxxxx.git
      09:53:14 Fetching upstream changes from git@xxxxx:xxxx/xxxxxxx.git
      09:53:14  > git --version # timeout=10
      09:53:14  > git fetch --no-tags --progress git@xxxxxx:xxxx/xxxxxxx.git +refs/heads/*:refs/remotes/xxxxxxx/* --depth=1
      09:54:15  > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxxx.git # timeout=10
      09:54:15  > git config --add remote.xxxxxxxx.fetch +refs/heads/*:refs/remotes/xxxxxxx/* # timeout=10
      09:54:15  > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxx.git # timeout=10
      09:54:15 Fetching upstream changes from git@xxxxxxxx:xxxx/xxxxxxx.git
      09:54:15  > git fetch --no-tags --progress git@xxxxxxxx:xxxxx/xxxxxxxx.git +refs/heads/*:refs/remotes/xxxxxx/* --depth=1
      09:54:18  > git rev-parse 87dc72cf506dcf684775c7e3be56184e09c44701^{commit} # timeout=10
      09:54:18 Checking out Revision 87dc72cf506dcf684775c7e3be56184e09c44701 (detached)
      09:54:18  > git config core.sparsecheckout # timeout=10
      09:54:18  > git checkout -f 87dc72cf506dcf684775c7e3be56184e09c44701
      09:54:46 Commit message: "@MS-123 - Increase the jvm memory size for the bat tests"

          [JENKINS-51542] Git checkout is slower than the command line execution

          Oliver Pereira created issue -
          Mark Waite made changes -
          Assignee Original: Mark Waite [ markewaite ]

          Mark Waite added a comment -

          And yet the second call to `git fetch` completes in 3 seconds (09:54:15 -> 09:54:18) while the first call to `git fetch` requires 60 seconds. The second call to `git fetch` does not seem to be a dramatic contributor to slower performance. Can you provide more details that compare the same commands from a command line?

          Are the same commands also using the same file system?

          Are there other differences which affect performance?
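The timing comparison in the comment above can be reproduced from the console log itself. The following is a minimal Python sketch, not part of the plugin. It assumes every log line starts with an `HH:MM:SS` timestamp and approximates a command's duration as the gap until the next log line, so the numbers are only accurate to about a second. The function name `command_durations` is hypothetical.

```python
from datetime import datetime

# Console log lines quoted in the issue description (hosts redacted as in the original).
LOG = """\
09:53:13  > git init /srv/jenkins/workspace/shared-buck-2-master # timeout=10
09:53:14 Using reference repository: /var/lib/jenkins/reference-repositories/xxxxxx.git
09:53:14 Fetching upstream changes from git@xxxxx:xxxx/xxxxxxx.git
09:53:14  > git --version # timeout=10
09:53:14  > git fetch --no-tags --progress git@xxxxxx:xxxx/xxxxxxx.git +refs/heads/*:refs/remotes/xxxxxxx/* --depth=1
09:54:15  > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxxx.git # timeout=10
09:54:15  > git config --add remote.xxxxxxxx.fetch +refs/heads/*:refs/remotes/xxxxxxx/* # timeout=10
09:54:15  > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxx.git # timeout=10
09:54:15 Fetching upstream changes from git@xxxxxxxx:xxxx/xxxxxxx.git
09:54:15  > git fetch --no-tags --progress git@xxxxxxxx:xxxxx/xxxxxxxx.git +refs/heads/*:refs/remotes/xxxxxx/* --depth=1
09:54:18  > git rev-parse 87dc72cf506dcf684775c7e3be56184e09c44701^{commit} # timeout=10
09:54:18 Checking out Revision 87dc72cf506dcf684775c7e3be56184e09c44701 (detached)
09:54:18  > git config core.sparsecheckout # timeout=10
09:54:18  > git checkout -f 87dc72cf506dcf684775c7e3be56184e09c44701
09:54:46 Commit message: "@MS-123 - Increase the jvm memory size for the bat tests"
"""


def command_durations(log: str) -> list[tuple[int, str]]:
    """Return (seconds, command) for each ' > git ...' line in the log.

    The duration is estimated as the time until the next log line appears.
    """
    entries = []
    for line in log.splitlines():
        line = line.strip()
        if not line:
            continue
        stamp, _, rest = line.partition(" ")
        entries.append((datetime.strptime(stamp, "%H:%M:%S"), rest.strip()))

    durations = []
    # Pair every entry with its successor; the last line has no successor.
    for (start, text), (end, _) in zip(entries, entries[1:]):
        if text.startswith("> git"):
            seconds = int((end - start).total_seconds())
            durations.append((seconds, text[2:]))
    return durations


if __name__ == "__main__":
    for seconds, command in command_durations(LOG):
        if seconds > 0:
            print(f"{seconds:3d}s  {command}")
```

On the log quoted in the issue, this attributes about 61 seconds to the first `git fetch`, 3 seconds to the second, and 28 seconds to `git checkout -f`.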


          Jonathan Rogers added a comment -

          I am also able to get much better performance from a custom "git clone" command than what the Jenkins git plugin does. Specifically, I use "git clone --depth 1 --reference /reference/path git://git.example.com/thing.git", which takes 20 to 30s for a repository of about 900MiB.

          The Jenkins configuration which attempts to duplicate this behavior usually takes over 2.5 minutes to check out the code. The strange thing is that sometimes it only takes 30s. I haven't been able to determine why the time varies so much, but it seems like it may be quicker immediately after the reference repository was updated.

          It seems the Jenkins git plugin never emits a "git clone" command, but rather "git fetch" followed by "git checkout". Perhaps there's no way to exactly duplicate "git clone" with such a sequence?

          Mark Waite added a comment -

          jrogers the textual description of git clone is that it is the combination of git fetch and git merge. However, we've also had reports from a few Jenkins users that they also see faster performance if they use git clone instead of the git fetch which is used by the git plugin.

          When I attempted to duplicate the performance difference in that case, I was unable to do so. At that time, I consistently found comparable performance when comparing git clone and git fetch + git merge.

          That likely means that I don't understand the specific details which cause clone to be faster than fetch. Unfortunately, the connection between fetch and the use cases of the git plugin is strong enough that I don't think there will ever be a way to call git clone directly from the plugin.

          A change from fetch to clone would break many different use cases. As one example, I've seen much better performance if I specify a narrow refspec to the git plugin. The multibranch pipeline code uses that technique to limit a clone to a single branch, rather than cloning all branches in the repository. The narrow refspec limits the information sent from the remote to the local repository to only the specific named branch. Unfortunately, the git clone command doesn't accept a refspec as an argument. There are command line arguments that will allow adding more refspecs, but there is no way to limit the refspec from the clone command line.
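To make the narrow-refspec idea concrete, here is a small Python sketch that builds the `git fetch` argument list for either all branches or one named branch. It is an illustration only and does not reflect how the plugin is implemented: the helper name `fetch_command`, the remote name `origin`, and the branch name `master` are assumptions. The `--no-tags` and `--depth` options and the wide refspec form are the ones visible in the console log in the issue description.

```python
from typing import Optional


def fetch_command(
    url: str,
    remote: str = "origin",           # assumed remote name, redacted in the log
    branch: Optional[str] = None,     # None means "all branches"
    depth: Optional[int] = None,      # None means "full history"
) -> list[str]:
    """Build the argv for a `git fetch` restricted by refspec."""
    if branch is None:
        # Wide refspec: every branch head on the remote.
        refspec = f"+refs/heads/*:refs/remotes/{remote}/*"
    else:
        # Narrow refspec: only the named branch is requested.
        refspec = f"+refs/heads/{branch}:refs/remotes/{remote}/{branch}"

    argv = ["git", "fetch", "--no-tags"]
    if depth is not None:
        argv.append(f"--depth={depth}")
    argv += [url, refspec]
    return argv


if __name__ == "__main__":
    url = "git://git.example.com/thing.git"
    print(" ".join(fetch_command(url, depth=1)))
    print(" ".join(fetch_command(url, branch="master", depth=1)))
```

The resulting list can be passed to `subprocess.run(argv, cwd=workspace, check=True)` inside a repository created with `git init`. Timing both variants against the same remote is one way to check whether the narrow refspec helps for a particular repository.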


          Jonathan Rogers added a comment -

          Does the Jenkins git plugin call "git merge"? I've only seen "git ls-remote", "git rev-parse", "git config", "git fetch" and "git checkout".

          Mark Waite added a comment -

          jrogers it only uses git merge when requested to perform a pre-build merge. It uses a detached HEAD checkout unless specifically configured to check out with a branch name.


          Jonathan Rogers added a comment -

          I have done some more experimenting with git commands. For my use case, the difference between using the git plugin to do a checkout vs. using "git clone" is primarily about the reference repository. AFAICT, the commands issued by the git plugin are not able to take advantage of a reference repository.

          Unfortunately, how reference repositories are configured isn't very well documented. The git-clone man page mentions ".git/objects/info/alternates" and after issuing a "git clone" command with the "--reference" option, that file contains the path of the reference repository's "objects" directory. After issuing such a command, I get a new repository in which the ".git" directory only occupies a couple of megabytes.

          Also AFAICT, the git plugin directly writes the correct path to ".git/objects/info/alternates" before issuing the "git fetch" command. However, "git fetch" does not seem to take advantage of the reference repository, since the ".git" directory in the new repository is several hundred megabytes. The git plugin's reference repository option simply has no effect. Since the "git fetch" man page says nothing about reference repositories or alternates, I can't tell if that command is behaving correctly. BTW, I'm using git 1.7.1.
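The check described in the comment above (look at `.git/objects/info/alternates`, then look at the size of `.git`) can be scripted. The following Python sketch assumes a non-bare workspace with a `.git` directory and an alternates file containing one path per line. The function names `read_alternates`, `dir_size`, and `inspect_reference` are hypothetical. It only reports what is on disk; it cannot show whether `git fetch` actually consulted the alternates.

```python
from pathlib import Path


def read_alternates(repo: Path) -> list[str]:
    """Return the object-store paths listed in .git/objects/info/alternates."""
    alternates = repo / ".git" / "objects" / "info" / "alternates"
    if not alternates.is_file():
        return []
    return [line.strip() for line in alternates.read_text().splitlines() if line.strip()]


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files below path."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def inspect_reference(repo: Path) -> dict:
    """Summarize the reference-repository setup of a workspace."""
    alternates = read_alternates(repo)
    return {
        "alternates": alternates,
        # Entries whose directory does not exist cannot supply any objects.
        "missing": [a for a in alternates if not Path(a).is_dir()],
        "git_dir_bytes": dir_size(repo / ".git"),
    }


if __name__ == "__main__":
    import sys

    report = inspect_reference(Path(sys.argv[1]))
    print("alternates:", report["alternates"] or "none")
    print("missing:   ", report["missing"] or "none")
    print(f".git size:  {report['git_dir_bytes'] / 2**20:.1f} MiB")
```

Running it against a workspace created by `git clone --reference` and one created by the plugin gives a side-by-side view of the two observations reported here: the alternates entry is present in both, while the `.git` size differs by a large factor.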

          Fabian Holler added a comment -

          We also ran into this issue with pipeline scripts + git repositories + a local reference repository.

          During the initial checkout of the pipeline script, the "lightweight checkout" option increases the checkout time a lot.
          With lightweight checkout disabled + reference repo it takes ~5sec.
          With lightweight checkout enabled + reference repo it takes ~15min.

          The checkout done via checkout(scm) often takes ~15min despite the reference repository.

          If I do a "git clone --reference" on the same host and the same filesystem, it takes ~5sec.

          Freestyle Git checkouts with a reference repository are as fast as expected.


          Mark Waite added a comment -

          jrogers there are many issues in command line git 1.7.1 that are resolved in later versions of git. Shallow clone (for example) is not fully supported in git 1.7.1. It may be that reference repositories are also not supported in git 1.7.1 using git fetch. You might consider enabling the JGit implementation for that repository and use JGit instead of command line git.


            Assignee: Unassigned
            Reporter: Oliver Pereira (oliverp)
            Votes: 7
            Watchers: 18