-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Development
-
The git checkout is much slower even after using reference repositories, shallow clones etc.
Running the same commands via the command line is much faster. I am using the latest version of all the plugins.
If you look at the below output, it runs the fetch more than once.
09:53:13 Cloning the remote Git repository
09:53:13 Using shallow clone
09:53:13 Avoid fetching tags
09:53:13 Cloning repository git@xxxxxxx:xxxx/xxxxxxx.git
09:53:13 > git init /srv/jenkins/workspace/shared-buck-2-master # timeout=10
09:53:14 Using reference repository: /var/lib/jenkins/reference-repositories/xxxxxx.git
09:53:14 Fetching upstream changes from git@xxxxx:xxxx/xxxxxxx.git
09:53:14 > git --version # timeout=10
09:53:14 > git fetch --no-tags --progress git@xxxxxx:xxxx/xxxxxxx.git +refs/heads/*:refs/remotes/xxxxxxx/* --depth=1
09:54:15 > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxxx.git # timeout=10
09:54:15 > git config --add remote.xxxxxxxx.fetch +refs/heads/*:refs/remotes/xxxxxxx/* # timeout=10
09:54:15 > git config remote.xxxxxxx.url git@xxxxxxxx:xxxxx/xxxxxxx.git # timeout=10
09:54:15 Fetching upstream changes from git@xxxxxxxx:xxxx/xxxxxxx.git
09:54:15 > git fetch --no-tags --progress git@xxxxxxxx:xxxxx/xxxxxxxx.git +refs/heads/*:refs/remotes/xxxxxx/* --depth=1
09:54:18 > git rev-parse 87dc72cf506dcf684775c7e3be56184e09c44701^{commit} # timeout=10
09:54:18 Checking out Revision 87dc72cf506dcf684775c7e3be56184e09c44701 (detached)
09:54:18 > git config core.sparsecheckout # timeout=10
09:54:18 > git checkout -f 87dc72cf506dcf684775c7e3be56184e09c44701
09:54:46 Commit message: "@MS-123 - Increase the jvm memory size for the bat tests"
[JENKINS-51542] Git checkout is slower than the command line execution
I am also able to get much better performance from a custom "git clone" command than what the Jenkins git plugin does. Specifically, I use "git clone --depth 1 --reference /reference/path git://git.example.com/thing.git", which takes 20 to 30s for a repository of about 900MiB.
The Jenkins configuration which attempts to duplicate this behavior usually takes over 2.5 minutes to check out the code. The strange thing is that sometimes it only takes 30s. I haven't been able to determine why the time varies so much, but it seems like it may be quicker immediately after the reference repository was updated.
It seems the Jenkins git plugin never emits a "git clone" command, but rather "git fetch" followed by "git checkout". Perhaps there's no way to exactly duplicate "git clone" with such a sequence?
jrogers the textual description of git clone is that it is the combination of git fetch and git merge. However, we've also had reports from a few Jenkins users that they also see faster performance if they use git clone instead of the git fetch which is used by the git plugin.
When I attempted to duplicate the performance difference in that case, I was unable to do so. At that time, I consistently found comparable performance when comparing git clone and git fetch + git merge.
That likely means that I don't understand the specific details which cause clone to be faster than fetch. Unfortunately, the connection between fetch and the use cases of the git plugin is strong enough that I don't think there will ever be a way to call git clone directly from the plugin.
A change from fetch to clone would break many different use cases. As one example, I've seen much better performance if I specify a narrow refspec to the git plugin. The multibranch pipeline code uses that technique to limit a clone to a single branch, rather than cloning all branches in the repository. The narrow refspec limits the information sent from the remote to the local repository to only the specific named branch. Unfortunately, the git clone command doesn't accept a refspec as an argument. There are command line arguments that will allow adding more refspecs, but there is no way to limit the refspec from the clone command line.
Does the Jenkins git plugin call "git merge"? I've only seen "git ls-remote", "git rev-parse", "git config", "git fetch" and "git checkout".
jrogers it only uses git merge when requested to perform a pre-build merge. It uses a detached HEAD checkout unless specifically configured to checkout with a branch name.
I have done some more experimenting with git commands. For my use case, the difference between using the git plugin to do a checkout vs. using "git clone" is primarily about the reference repository. AFAICT, the commands issued by the git plugin are not able to take advantage of a reference repository.
Unfortunately, how reference repositories are configured isn't very well documented. The git-clone man page mentions ".git/objects/info/alternates" and after issuing a "git clone" command with the "--reference" option, that file contains the path of the reference repository's "objects" directory. After issuing such a command, I get a new repository in which the ".git" directory only occupies a couple of megabytes.
Also AFAICT, the git plugin directly writes the correct path to ".git/objects/info/alternates" before issuing the "git fetch" command. However, "git fetch" does not seem to take advantage of the reference repository, since the ".git" directory in the new repository is several hundred megabytes. The git plugin's reference repository option simply has no effect. Since the "git fetch" man page says nothing about reference repositories or alternates, I can't tell if that command is behaving correctly. BTW, I'm using git 1.7.1.
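One way to check whether a workspace is actually wired to a reference repository is to inspect the alternates file described above. The following is a minimal Python sketch, not code from the git plugin; the function names are hypothetical, and it assumes a non-bare workspace whose alternates file lists absolute paths.

```python
from pathlib import Path


def read_alternates(repo_dir):
    """Return the object directories listed in .git/objects/info/alternates.

    An empty list means the repository does not borrow objects from any
    reference repository.
    """
    alternates = Path(repo_dir) / ".git" / "objects" / "info" / "alternates"
    if not alternates.is_file():
        return []
    # One path per line; blank lines are skipped.
    return [line.strip() for line in alternates.read_text().splitlines() if line.strip()]


def missing_alternates(repo_dir):
    """Return the listed alternate paths that do not exist on this machine."""
    return [path for path in read_alternates(repo_dir) if not Path(path).is_dir()]
```

If read_alternates returns the reference repository's "objects" directory and missing_alternates returns nothing, the alternates file itself is in order, and a large ".git" directory would have to be explained by something else, such as the git version in use.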
We also run into the issue with pipeline scripts + git repositories + a local reference repository.
During the initial checkout of the pipeline script, the "lightweight checkout" option increases the checkout time a lot.
With lightweight checkout disabled + reference repo it takes ~5sec.
With lightweight checkout enabled + reference repo it takes ~15min.
The checkout done via checkout(scm) often takes ~15min despite the reference repository.
If I do a "git clone --reference" on the same host and the same filesystem, it takes ~5sec.
Freestyle Git checkouts with a reference repository are as fast as expected.
jrogers there are many issues in command line git 1.7.1 that are resolved in later versions of git. Shallow clone (for example) is not fully supported in git 1.7.1. It may be that reference repositories are also not supported in git 1.7.1 using git fetch. You might consider enabling the JGit implementation for that repository and use JGit instead of command line git.
Mark Waite: You are certainly correct that "git fetch" has changed behavior since 1.7.1. After I last commented, I noticed that that version is eight years old so I decided to try a newer one. For me, the difference in checkout time between "git clone" and the sequence of commands issued by the git plugin mostly or completely disappeared with git 2.5. Thanks.
Still really slow using git 2.21.0 on Windows Server 2012 R2 slave.
I am using git version 2.20.0 and still, it's slow while checking out.
bhargavkeshav and antgel while I appreciate "still slow" as an answer, that isn't enough to persuade me to investigate further. The last time that users expressed their strongly held opinion that git clone is significantly faster than git fetch with the same repository and the same environment, I spent several hours running a series of benchmark comparisons. The benchmark comparisons showed no significant difference between git clone and git fetch. The git client plugin uses git fetch instead of git clone because there are optimizations which can be performed with git fetch which are much more difficult to perform with git clone, especially across the wide range of command line git versions supported by the plugin.
I'd be much more persuaded that there is a significant difference between git fetch and git clone if comparative data were presented which showed the difference. That comparative data needs to be documented well enough that others can duplicate the configuration in order to see the problem.
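A minimal sketch of how such comparative data could be collected, assuming Python is available on the agent. The helper names are hypothetical and nothing here comes from the git plugin; it only measures wall-clock time for whatever command line it is given.

```python
import statistics
import subprocess
import time


def time_command(argv, repeats=3, cwd=None):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(label, durations):
    """Format min / median / max so two configurations can be compared."""
    return "{}: min={:.2f}s median={:.2f}s max={:.2f}s".format(
        label, min(durations), statistics.median(durations), max(durations))
```

For a fair comparison, the commands copied from the console log and the hand-written git clone should run as the same user, on the same agent, and in the same file system as the Jenkins workspace. The target directory has to be reset between runs, since a second clone into a populated directory fails.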
Most often, reports of "slow clone" are best addressed by techniques described in the "Git in the Large" presentation from Jenkins World 2017. The techniques presented there include:
- Use reference repositories to reduce clone time and clone disc space use
- Use narrow refspecs to reduce clone time and clone disc space
- Use shallow clone to reduce clone time and clone disc space
- Use sparse checkout to reduce working directory disc space
markewaite Thanks for the response. I should have been more detailed in my comment. I don't know about the difference between fetch and clone, but I do know that on a fresh workspace, it can take 10-15 minutes when the equivalent shell command takes under half a minute (I didn't bother measuring, the order of magnitude is so great). On subsequent builds, it's acceptable aka normal.
Unfortunately, our repo and configuration are (mostly) private, but I'm happy to do a screenshare with you to demonstrate, perhaps we can do some debugging online. This is the second project I've worked on that has experienced this issue, so I don't see it as being hard to reproduce.
For me the issue is with amount of tags in our repository (7k)
I've created repository with 10k tags to reproduce the issue - https://github.com/vitgorbunov/lots-of-tags
git version 2.14.2
git client plugin version 3.0.0-rc
git plugin version 4.0.1-rc-rc3051.45f40fc87c7e
I'm considering just using shell commands, but then I would miss the git commit summary on the build page.
#clone through git plugin
00:00:00.044 Cloning repository https://github.com/vitgorbunov/lots-of-tags.git
00:00:00.044 > git init /tmp/jenkins-b280d87a/workspace/vgr_slow_git_test # timeout=10
00:00:00.049 Fetching upstream changes from https://github.com/vitgorbunov/lots-of-tags.git
00:00:00.050 > git --version # timeout=10
00:00:00.053 > git fetch --tags --progress https://github.com/vitgorbunov/lots-of-tags.git +refs/heads/*:refs/remotes/origin/* # timeout=10
00:06:50.116 > git config remote.origin.url https://github.com/vitgorbunov/lots-of-tags.git # timeout=10
00:06:50.116 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
00:06:50.119 > git config remote.origin.url https://github.com/vitgorbunov/lots-of-tags.git # timeout=10
00:06:50.127 Fetching upstream changes from https://github.com/vitgorbunov/lots-of-tags.git
00:06:50.127 > git fetch --tags --progress https://github.com/vitgorbunov/lots-of-tags.git +refs/heads/*:refs/remotes/origin/* # timeout=10
00:06:52.919 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
00:06:52.924 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
00:06:52.931 Checking out Revision afaccdc4e7350a09bca383c2b9ac458578e1a34b (refs/remotes/origin/master)
00:06:52.933 > git config core.sparsecheckout # timeout=10
00:06:52.936 > git checkout -f afaccdc4e7350a09bca383c2b9ac458578e1a34b # timeout=10
00:06:52.950 Commit message: "10000"
00:06:52.953 > git rev-list --no-walk afaccdc4e7350a09bca383c2b9ac458578e1a34b # timeout=10
#Using shell command
00:06:53.008 [vgr_slow_git_test] $ /bin/sh -xe /tmp/jenkins3386321295493424405.sh
00:06:53.011 + git clone https://github.com/vitgorbunov/lots-of-tags.git lots-of-tags-cloned
00:06:53.013 Cloning into 'lots-of-tags-cloned'...
00:06:54.464
00:06:54.464 real 0m1.453s
00:06:54.464 user 0m0.398s
00:06:54.464 sys 0m0.281s
Extract from thread dump
"git fetch --tags --progress https://github.com/vitgorbunov/lots-of-tags.git +refs/heads/*:refs/remotes/origin/*: stderr copier" #27955 prio=5 os_prio=0 tid=0x00007f57b800b800 nid=0x12fa runnable [0x00007f57f2c18000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x000000076fb22798> (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
"git fetch --tags --progress https://github.com/vitgorbunov/lots-of-tags.git +refs/heads/*:refs/remotes/origin/*: stdout copier" #27953 prio=5 os_prio=0 tid=0x00007f57b8001800 nid=0x12f8 runnable [0x00007f57f8e9d000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x000000076fb226e8> (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
"pool-1-thread-26485 for channel id=22410560" #27942 prio=5 os_prio=0 tid=0x00007f5774015800 nid=0x12cf in Object.wait() [0x00007f57736f4000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
    - locked <0x000000076fb20480> (a java.lang.UNIXProcess)
    at hudson.Proc$LocalProc.join(Proc.java:324)
    at hudson.Proc.joinWithTimeout(Proc.java:170)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2311)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1905)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$400(CliGitAPIImpl.java:81)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:488)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:712)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:161)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:154)
    at hudson.remoting.UserRequest.perform(UserRequest.java:212)
    at hudson.remoting.UserRequest.perform(UserRequest.java:54)
    at hudson.remoting.Request$2.run(Request.java:369)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Actually, I just realized that on another instance with git version 1.8.3.1 it works fast.
The difference is that it prints tags to stdout much faster
You can disregard my comments - when I upgraded to git 2.16.5 it became fast again.
FYI, I was also having this problem and it boiled down to shallow depth = 1 clone, not sure if it's the same but it may help others.
~3m24s with depth =1
00:00:02.788 > /usr/bin/git fetch --tags --progress --depth=1 -- git@git.xyz.com:repo.git +refs/heads/*:refs/remotes/origin/* # timeout=150
00:03:26.147 > /usr/bin/git config remote.origin.url git@git.xyz.com:repo.git # timeout=60
~40s normal fetch
00:00:02.472 > /usr/bin/git fetch --tags --progress -- git@git.xyz.com:apps/repo.git +refs/heads/*:refs/remotes/origin/* # timeout=150
00:00:42.424 > /usr/bin/git config remote.origin.url git@git.xyz.com:apps/repo.git # timeout=60
Node:
git version 2.23.0
git-lfs/2.10.0 (GitHub; darwin amd64; go 1.13.6)
Jenkins:
git-client 3.1.1
git-plugin: 4.1.1.
Git server:
GitLab Enterprise Edition 12.7.2-ee
Maybe the server spends more time figuring out the shallow copy than it would spend just sending the pack files?
As far as I can tell, most users gain significantly more from using a reference repository and an intentionally narrow refspec than from an isolated shallow clone. Those are just my observations, not anything that I've rigorously compared in a controlled environment.
At the risk of a "me too" comment - I have also been suffering from this problem for a while on the Eclipse CDT project. The normal git fetch/checkout time is 3-4 minutes, but it often gets a timeout (we increased default to 20 minutes).
However, doing the git fetch manually, even issuing the same commands takes a fraction of the amount of time. By running the same commands, but without --progress it is consistently less than 1 minute to fetch/checkout.
As far as I can tell, the git operation almost finishes every time before failing. It always seems to fail after the "Resolving deltas: 100% (401124/401124), done." message but before the "From https://git.eclipse.org/r/cdt/org.eclipse.cdt
13:40:09 + 4f7f16b36b...da0d1d7df6 master -> origin/master (forced update)" message.
We're using Jenkins ver. 2.222.1
jonahgraham if you have long-lived workspaces in your Jenkins jobs, you may want to consider wiping the workspace and allowing the job to recreate it. Git intentionally allows fragmentation so that it does not waste time performing garbage collection until the user requests garbage collection. Alternately, at the end of a build, you could perform git gc to garbage collect the workspace.
If your workspaces are ephemeral, then garbage collection should not be an issue.
The Eclipse CDT repository appears to be a long-lived repository with multiple branches. If you're building a single branch, you might consider enabling the "Advanced clone behaviours" extension with the "Honor refspec on initial clone", then set the refspec in the "Advanced" section to be +refs/heads/master:refs/remotes/origin/master.
That will limit the clone to a single branch with all the history for that branch. I'm able to clone from the Eclipse CDT repository into a new workspace on an AWS machine in 21 seconds with that setting.
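To make the shape of that refspec concrete, here is a small Python sketch that builds the equivalent git fetch argument list for a single branch. The function names are hypothetical; the refspec format is the one quoted above, and the --no-tags, --progress, and -- arguments are the ones visible in the console logs in this thread.

```python
def narrow_refspec(branch, remote="origin"):
    """Refspec that fetches one branch into its remote-tracking ref."""
    return "+refs/heads/{0}:refs/remotes/{1}/{0}".format(branch, remote)


def fetch_command(url, branch, remote="origin", tags=False):
    """Build a `git fetch` argv limited to a single branch."""
    tag_flag = "--tags" if tags else "--no-tags"
    return ["git", "fetch", tag_flag, "--progress", "--", url,
            narrow_refspec(branch, remote)]
```

Compared with the default +refs/heads/*:refs/remotes/origin/* seen in the logs, the narrow form asks the server for one branch only, which is where the savings on a repository with many branches come from.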
Thanks markewaite for the quick response. The jobs are ephemeral as we are running them on Openshift cluster @ Eclipse Foundation. You can see the "git clone" stage times on https://ci.eclipse.org/cdt/view/Gerrit/job/cdt-verify-code-cleanliness-pipeline/ have reduced from 3-20 minutes down to about 1 minute starting on #1336 which is when I stopped using git plug-in for Jenkins.
I would love to provide more insight to the problem and any diagnostics I can to help identify the cause, but I am not sure where to start.
Since you're using declarative pipeline, you already have an initial implicit checkout in the job, without performing a checkout scm. You should not need to perform a checkout scm from a declarative pipeline unless you've explicitly declared skipDefaultCheckout in the definition of the declarative pipeline.
I don't see a Jenkinsfile in the CDT repository. Is it stored somewhere public that I could see it?
Thanks markewaite for the extra info. I believe I did have the job set up as suggested - https://github.com/eclipse-cdt/cdt-infra/blob/c26772168c47526c8276a62d80459106535066fd/jenkins/pipelines/cdt/verify/cdt-verify-code-cleanliness-pipeline.Jenkinsfile#L15 that results in the often timing out git fetch command line of:
git fetch --no-tags --force --progress -- git://git.eclipse.org/gitroot/cdt/org.eclipse.cdt.git refs/changes/39/162839/10
This was my checkout line:
checkout([$class: 'GitSCM', branches: [[name: '**']], doGenerateSubmoduleConfigurations: false, extensions: [[$class: 'CheckoutOption', timeout: 20], [$class: 'BuildChooserSetting', buildChooser: [$class: 'GerritTriggerBuildChooser']], [$class: 'CloneOption', honorRefspec: true, noTags: true, reference: '', timeout: 20]], submoduleCfg: [], userRemoteConfigs: [[refspec: '$GERRIT_REFSPEC', url: 'git://git.eclipse.org/gitroot/cdt/org.eclipse.cdt.git']]])
https://github.com/eclipse-cdt/cdt-infra/tree/master/jenkins/pipelines - the implicit one gets the infra repo, the explicit one gets the CDT main code base.
(PS I am on https://gitter.im/jenkinsci/jenkins if you want to talk outside filling up this bug).
Thank you markewaite for the help and cross checking on this. I believe I have found the cause of my particular problem that had been affecting us for a while now. Read all about it in https://bugs.eclipse.org/bugs/show_bug.cgi?id=560283#c16 - but short summary is that because checkout's git is being run in the JNLP container, not the declared container, the amount of CPU and memory dedicated to the container was insufficient, causing the slow down (and probably the failures too).
By the way markewaite in reference to https://bugs.eclipse.org/bugs/show_bug.cgi?id=560283#c18 I have long been dissatisfied with how Jenkins does SCM operations such as for Git, and not just because of JENKINS-30600. The SCM plugins are just too bloated with functionality that would be more transparently handled in “user land” as part of a CI script. To generalize JENKINS-28335, it would be nice if Jenkins just set up some environment variables for authentication (in the case of nonpublic servers) and (where applicable) defined a commit hash suitable for replacing checkout scm, then stepped out of the way and let you sh 'git …' and somehow pick up changelog information and other metadata, perhaps with the aid of some Pipeline step that ran quickly in the workspace and just inspected .git/ (no network, no CLI command). Gets trickier when you deal with multibranch, lightweight checkouts, etc.; https://github.com/jglick/jk--/tree/master/userspace-scm-plugin holds a simple prototype.
I like the idea of giving the user more explicit control of the git operations they are performing. A new Pipeline step sounds very attractive to me. It might be called withGitCredentials and provide the necessary files and environment variables pointing to the files. The user would call git commands as they wish and have full control over those commands.
We would probably need to guide users to implement their sh commands to adapt to the possible states of a Jenkins workspace, like:
- Workspace is empty - sh("git clone git@github.com:jenkinsci/git-plugin.git . && git checkout ${SOME_SHA1}")
- Workspace is populated - sh("git pull && git checkout ${SOME_SHA1}")
- Workspace is contaminated (whatever 'contaminated' means to them) - sh(script: 'rm -rf .', returnStatus: true) sh('git clone git@github.com:jenkinsci/git-plugin.git .')
I think those are feasible but will require some effort to implement and then some additional effort to document and describe in the many different scenarios where the git plugin is used for SCM. Know anyone interested in implementing them?
markewaite, jglick, any updates on this? This seems to be affecting us as well. I've run the exact same commands that Jenkins runs, directly on the instance that runs them (as the jenkins user, in /tmp), and the checkout runs at what we consider normal speed: copy-pasting the commands from the log output, it took less than ~30 seconds. I don't have the exact timings from Jenkins (not sure how to produce them), but looking through the BlueOcean UI, it's showing the initial `checkout from version control` step to be over 7 minutes. This isn't a particularly large repository either. Based on GitHub's API it's only about 27 MB.
EDIT: oddly, the blueocean ui changes the time taken for the step to be 4 seconds after a refresh.
thehosh I've started a draft of a Google Summer of Code project idea that proposes to add a new pipeline task that will allow sh, bat, and powershell steps to perform authenticated git commands. However, Google Summer of Code implementations won't start until May 2021.
You may want to investigate other possible causes of the performance difference you are seeing. For example, some of the questions I asked earlier included:
- Are you using a narrow refspec to reduce the amount of data the git client requests from the git server?
- Are you using a reference repository to reduce the amount of data the git client copies from the git server?
- Are you mistakenly using shallow clone with a default refspec and hoping that it will improve performance? Most cases where I've used shallow clone it provided less performance improvement than I hoped. A GitHub performance blog post noted that shallow clone can be especially demanding on the git server
- Is the agent workspace empty before the first fetch or does it have existing content? If it has existing content, is that content needing a "git gc" to restore it to good performance?
markewaite Is there a reason to believe that the proposal would fix the issue? I thought under the hood Jenkins simply ran git commands?
- Are you using a narrow refspec to reduce the amount of data the git client requests from the git server? - Not that I'm aware of.
- Are you using a reference repository to reduce the amount of data the git client copies from the git server? - I'm not sure what you mean with this.
- Are you mistakenly using shallow clone with a default refspec and hoping that it will improve performance? - Again, not that I'm aware of. But I might be misunderstanding.
- Is the agent workspace empty before the first fetch or does it have existing content? - As you can see below, we have "wipe out repository & force clone" enabled.
This is a screenshot of our pipeline setup. Note that wipe out work also seems to take extremely long. However, once the initial clone is done, I can add additional clones, and they seem to run fine.
thehosh I do not know if the proposed withGitCredentials pipeline step would help in your case. I've not yet found a case where the original claims in this bug report could be verified. As far as I can tell, the git plugin uses command line git to fetch and checkout content from the remote repository. As far as I can tell, it does that with speed that is comparable to git clone.
Since you're using Pipeline, you probably don't want "Wipe out repository and force clone". That same operation can be done from the pipeline itself with a pipeline task. Moving that into the pipeline definition places one more part of the job definition inside source control.
You probably also do not want branch specifier as "$BRANCH" because that means the change history in the job will not be usable. The change history shows the changes from one build to the next, but if you build a different branch on job n-1 than is built on job n, then change log is not very useful. In most cases that I've seen, it is better to use a multibranch pipeline and allow Jenkins to create and destroy jobs as branches are created and destroyed in the repository.
If your git provider is GitHub, Bitbucket, GitLab, or Gitea, then you should probably use the plugin that is specific to those implementations, rather than the general purpose "Git" provider that you've selected as your SCM provider. The Git SCM provider does not know that GitHub, Bitbucket, GitLab, and Gitea all provide REST APIs that can make some git operations (like polling and reading the Jenkinsfile) much faster.
Thanks markewaite.
I suppose withGitCredentials will be useful either way. I've had a need for that previously, and instead have had to workaround the issue.
We're using job-dsl, so everything is already in git. I'd love to move the wipe out repository option to the pipeline, but I have not found the option in the directive generator. Maybe I'm missing something?
I'm aware of the change history limitation; it's not a huge issue for us. This specific job runs a test suite against our live environment, so multibranch isn't usable for us in this scenario. We're planning to use multibranch pipelines for other things, though we are limited by not being able to specify a subdirectory to watch in monorepo setups (like github-branch-pr-change-filter, but for branches). There's a Jira issue raised for that, but it doesn't seem to have been acted on.
Re github provider, non-multibranch pipeline doesn't seem to support anything but git and mercurial. So we're unable to use that.
That's all useful feedback, and appreciate it. Though the slow checkout is still an issue. It's extremely painful. Is there anything I can do to help have this debugged? Running almost the exact same setup locally (though, through docker), the checkout is as fast as I expect it to be.
thehosh one of the earlier comments mentions that memory pressure on the container process can significantly slow the git process. You might check the memory available to the agent process that performs the git operations. If it is a Kubernetes agent, then you'll need to ensure that the JNLP agent has enough memory allocated.
I can't duplicate the problem and haven't seen new information that indicates I should again attempt to duplicate the problem. If you find a way that allows others to duplicate the problem, I would be willing to try to duplicate it.
Thank you markewaite, that was helpful.
After reading your comment, and moments earlier noticing our Jenkins backup job taking >10 hours, something clicked in my brain and made the connection. I realised this might not be a Jenkins issue at all, and after some more investigating, it looked like a disk IO issue. We're using AWS EFS (which is NFS under the hood) to store Jenkins home, so when I said I tested it earlier, sadly, I was testing in /tmp which is not on the NFS storage, so this explains why when I ran it, it ran fine. And these manual tests I ran were the main reason I thought it might be Jenkins.
It turned out that our backup job (which we had misconfigured in the first place) was eating up all our IOPS. We're going to change our EFS storage so that we get a little bit more oomph from it, and of course we'll fix our backup job too.
For anyone else running into this issue, I suggest you look at resources available to Jenkins.
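A quick way to avoid the /tmp pitfall described above is to check which file system actually backs the directory being tested. The following Python sketch parses mount-table text in the /proc/mounts format; the function names are hypothetical and the list of network file system prefixes is an assumption, not an exhaustive set.

```python
import posixpath


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point that contains `path`."""
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        mount_point = posixpath.normpath(mount_point)
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        # On ties the later entry wins, since later mounts shadow earlier ones.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_network_filesystem(path, mounts_text):
    """True if `path` sits on NFS/CIFS/SMB (assumed prefixes, not exhaustive)."""
    fs_type = filesystem_type(path, mounts_text)
    return fs_type is not None and fs_type.startswith(("nfs", "cifs", "smb"))
```

On a Linux agent the mount table can be read from /proc/mounts and passed in along with both the Jenkins workspace path and the directory used for the manual test. If the two report different file system types, a manual timing comparison between them says little about the plugin.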
Good to hear that thehosh. Thanks for sharing. Experiences with git on network file systems are often complicated by the different locking semantics and performance characteristics of network file systems. You've provided excellent advice to network file system users. Thanks again.
The withCredentials step that has been implemented in the git plugin allows users to perform their own authenticated operations with command line git in sh, bat, and powershell steps. In those cases where a user finds that the git plugin is much slower than command line git (memory pressure in the JNLP container, a need for specific settings on the git command line, etc.), the user can replace the checkout scm call and use a withCredentials block with the git commands inside an sh, bat, or powershell step.
And yet the second call to `git fetch` completes in 3 seconds (09:54:15 -> 09:54:18) while the first call to `git fetch` requires 60 seconds. The second call to `git fetch` does not seem to be a dramatic contributor to slower performance. Can you provide more details that compare the same commands from a command line?
Are the same commands also using the same file system?
Are there other differences which affect performance?
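The per-command timings discussed above can be read off a console log mechanically. This is a minimal Python sketch with hypothetical names; it assumes one log entry per line with an HH:MM:SS prefix (no milliseconds), and it approximates a command's duration as the gap to the next log line.

```python
import re
from datetime import datetime

LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.*)$")


def command_durations(log_text):
    """Return (command, seconds) for each '> git ...' line in a console log."""
    entries = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            entries.append((stamp, match.group(2)))
    results = []
    for (start, message), (end, _) in zip(entries, entries[1:]):
        if not message.startswith("> "):
            continue  # informational line, not a git invocation
        seconds = (end - start).total_seconds()
        if seconds < 0:
            seconds += 24 * 60 * 60  # log crossed midnight
        command = message[2:].split(" # timeout=")[0]
        results.append((command, seconds))
    return results
```

Sorting the result by seconds shows at a glance which git invocation dominates the checkout, which is the same comparison made by hand above for the two git fetch calls.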