[JENKINS-23345] Git Plugin should have an option to use clone instead of init/fetch

Mark Waite added a comment - 2014-06-06 12:10

This is the first I've heard that clone is faster than fetch at retrieving the remote repository. I don't see any mention in my google searches that indicate "git fetch" is slower than "git clone". Can you give some pointers that provide support for the statement that clone is more efficient than fetch?

Even if clone is faster, I'm hesitant to support such an addition to the plugin. Git fetch was chosen because there are fewer ways it can hang (prompting for authentication information) in the various authentication scenarios supported by the plugin. It is even more challenging because there are currently no unit tests for the authentication scenarios, so they must be tested interactively (or not tested) at each plugin release. Adding the option to use "clone" instead of "fetch" would effectively double the cases we need to test in an already very complicated portion of the code.

Jason Salaz added a comment - 2014-06-06 20:14

Hey Mark,

I've been working with Dave on this issue so I can expand on some of the details. I'll note that my information is still secondhand, as I've received it from someone else, but that someone works on git, so I'm satisfied with the answer. Here's hoping I'm repeating the information correctly.

When you use git clone, you receive cryptographic/verifiable assurance that the server has provided you all the objects. When you run 'git clone', the directory is set up, the server collects all the objects, the transfer happens. After it's complete, the receiving end runs a quick verification on the packfile(s), and then puts all the objects in place on the local filesystem, and the process is done.

This is as opposed to git fetch, which is meant for incremental updates. After a fetch completes, git walks the object graph; in most cases this is merely from your previous head forward through all of the new objects, verifying that everything is intact. In Dave's case, walking the entire history of the repository accounts for the vast majority of the execution time: longer than the default fetch timeout of 10 minutes, for the record.

Receiving data itself isn't any faster, but the verification that takes place after receiving objects via fetch is dramatically different.

Daniel Serodio added a comment - 2014-07-08 14:07

I came across this ticket while searching for the reason why Jenkins is doing init+fetch instead of clone. I understand the reasoning, but it feels like a hack. For instance, it seems I need to set Additional behaviours > Check out to specific local branch to master so the Jenkins workspace is not left in a "detached head" situation. If the default value for Branches to build is */master, why do I have to specify this twice?

Case in point: I'm trying to troubleshoot a release script (editing a version file, tagging, pushing the tag, etc.) that works on my machine™ but not in Jenkins, because the repository is set up differently in Jenkins. I don't even know how to simulate the Jenkins behaviour on my computer.

So, while I understand that having both options (init+fetch or clone) would double the possible bugs and necessary tests, I believe having only the clone option would be the best choice, and consistent with the principle of least surprise.
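
For anyone wanting to reproduce the workspace state locally, here is a minimal sketch. It assumes the plugin's checkout is roughly a `git init`, a `git fetch` into `refs/remotes/origin/*`, and a forced checkout of the resolved commit by SHA-1. The throwaway repository in a temp directory and all variable names are illustrative stand-ins, not taken from the plugin itself.

```shell
#!/bin/sh
# Sketch: build an init + fetch workspace locally and observe the detached HEAD.
set -eu

work=$(mktemp -d)

# Stand-in for the real remote: a throwaway repository with a single commit.
git init -q "$work/origin"
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
# Use whatever default branch name this git version created (master or main).
branch=$(git -C "$work/origin" symbolic-ref --short HEAD)

# init + fetch, then check out the fetched commit by its SHA-1.
git init -q "$work/workspace"
cd "$work/workspace"
git fetch -q "$work/origin" "+refs/heads/*:refs/remotes/origin/*"
sha=$(git rev-parse "refs/remotes/origin/$branch")
git checkout -q -f "$sha"

# A detached HEAD is not a symbolic ref, so symbolic-ref fails here.
if git symbolic-ref -q HEAD >/dev/null; then
    head_state=attached
else
    head_state=detached
fi
echo "HEAD is $head_state at $sha"

# Rough equivalent of "Check out to specific local branch".
git checkout -q -B "$branch" "$sha"
local_branch=$(git symbolic-ref --short HEAD)
echo "now on local branch $local_branch"
```

Running a release script inside `$work/workspace` before the final `git checkout -B` should show the same "detached head" behaviour described above; running it after that step corresponds to setting the local branch option.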

Mark Waite added a comment - 2014-07-08 16:34

dserodio I think "least surprise" includes "don't break workflows for the 30000+ installations of the git plugin and git client plugin" and also includes "don't hang the git command line by prompting for authentication".

In one sense, it is a hack, since git does not have a "--no-interactive" option to prevent the git command from prompting for authentication. The subversion command line has that option, but git hasn't yet reached the point of adding that option.

It would really be great to be able to use "git clone" instead of "git init + git fetch", but I think it is more important to not break existing users than it is to switch from "git init + git fetch" to "git clone".

Mark Waite added a comment - 2015-02-01 23:31

In order to satisfy my curiosity about the performance difference between "git clone" and "git fetch", I ran a pair of tests to compare them. I used git 2.2.1 on an Ubuntu 14.04 64-bit machine with a solid state disc as the file system hosting both the source repository and the destination repository. I used a local copy of the linux kernel repository as it exists at commit 69e273c0b0a3c337a521d083374c918dc52c666f. That repository on my disc is about 1.3 GB and contains many, many objects. For a report on the relative activity of the linux kernel repository, refer to

What I found:

$ time git clone ssh://mark-pc1/var/lib/git/mwaite/linux.git - 2m41s
$ time (mkdir fetch;cd fetch;git init; git fetch ssh://mark-pc1/var/lib/git/mwaite/linux.git) - 2m52s

The "git fetch" time was consistently about 7% slower than the "git clone" time.

Your experience is significantly different, since you note in the original report that "git fetch takes 14 mins, git clone takes about 4 minutes." I don't plan to make any change in the git plugin based on that, but wanted to record my observed results in case others are concerned about the apparent difference between "git clone" and "git fetch".
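
As a quick check of the "about 7%" figure, the two timings above can be converted to seconds and compared. This is only an arithmetic sketch; the helper name `to_seconds` is made up for illustration and it assumes durations in the `XmYs` form with no leading zeros.

```shell
#!/bin/sh
# Convert a duration like "2m41s" to whole seconds.
to_seconds() {
    minutes=${1%%m*}     # text before the first "m"
    rest=${1#*m}         # text after the first "m", e.g. "41s"
    seconds=${rest%s}    # drop the trailing "s"
    echo $(( minutes * 60 + seconds ))
}

clone_s=$(to_seconds 2m41s)
fetch_s=$(to_seconds 2m52s)

# Percent slowdown of fetch relative to clone, rounded to the nearest integer.
slowdown=$(( ( (fetch_s - clone_s) * 100 + clone_s / 2 ) / clone_s ))
echo "clone=${clone_s}s fetch=${fetch_s}s slowdown=${slowdown}%"
```

By the same formula, the originally reported 14 minutes versus 4 minutes is a 250% slowdown, which is why the two sets of results look so different.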

michele hallak-stamler added a comment - 2015-03-11 08:18

Another problem with init+fetch instead of clone:
We are using header expansion with git filter + git attribute. The clean and smudge filters are perl scripts.
Unfortunately it doesn't work with the git plugin.
After having read this issue, I reproduced it manually using git init + git pull and indeed, the filters don't work.
It is very annoying, since it means that I won't be able to use the git plugin and will have to do the clone with a script.
Is there a way to use git just to check for changes without pulling the code into the workspace?
The problem is that when configured with filter the git init+plugin takes several hours instead of a few minutes.
The scripts are: https://github.com/turon/git-rcs-keywords
I would be grateful for any ideas.

Mark Waite added a comment - 2015-03-12 03:40

mhallak As far as I can tell, yours is the first case of someone using git in Jenkins for header expansion. Even if I were willing to accept the increased complexity and reliability risk of having a git clone based implementation, I would still be unlikely to add support for header expansion.

michele hallak-stamler added a comment - 2015-03-12 06:24

Of course, there is no need to support header expansion. I just wanted to explain why we need the clone functionality and not init+pull. The support for filters in .gitattributes is an official feature of git and we cannot use it with Jenkins. I'll have to write the cloning script and we'll not be able to rely on the built-in git implementation....

Eugene Gunov added a comment - 2015-05-22 16:55

The clone functionality is also required to use git extensions for large files such as git-fat.

Scott Richmond added a comment - 2015-09-05 16:06

Without git clone, Git LFS does not work at all, as LFS is not initialized properly. We are going to need a solution to this soon, as Git LFS is becoming quite popular among Game Dev software projects. Is there anything in progress?

Mark Waite added a comment - 2015-09-05 17:41

Nothing is in progress to switch from init + fetch to clone. There are no pending pull requests for an implementation that will allow switching from one to the other.

Scott Richmond added a comment - 2015-09-05 17:50

I had a quick scan over the plugin code and I understand that it would be a complex change, for sure. I've raised a ticket over here to discuss solutions that might have less of an impact: https://issues.jenkins-ci.org/browse/JENKINS-30318

Mark Waite added a comment - 2015-09-21 02:40

Jacob Keller has proposed git plugin PR342 and git client plugin PR180 to implement submodule authentication. One of the side effects of his changes for submodule authentication may be that it will be easier / feasible to allow the option to switch from init/fetch to clone.

Because the git client plugin change alters the authentication technique, we'll need a solid beta test phase before delivering that change to the larger community of Jenkins users.

Kevin Chen added a comment - 2015-12-05 00:04

Is there an ETA on when this will be resolved? We're also invested in the use of "git clone" being implemented.

Mark Waite added a comment - 2015-12-06 01:59

There is no ETA for any implementation that will replace the current init/fetch with clone.

Is your use case one of the use cases already described, or something different?

Kevin Chen added a comment - 2015-12-07 16:33

Our situation is almost identical to the others reported.

Kevin Chen added a comment - 2015-12-07 16:33 Our situation is almost identical to the others reported.

Mark Waite added a comment - 2020-12-10 04:34

Won't be implemented. We hope in the future to provide a credential binding wrapper so that users who want precise control of the git command can use the wrapper and then call command line git from their pipeline script.
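
A shell step of that kind might look roughly like the sketch below. It does not show the credential binding wrapper, whose interface is not described in this thread. It only shows a plain `git clone` that fails instead of hanging on a prompt, assuming git 2.3 or newer (where `GIT_TERMINAL_PROMPT` and `GIT_SSH_COMMAND` are available). The function name `clone_noninteractive` and the throwaway local repository are illustrative.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: clone URL $1 into directory $2 without ever prompting.
# GIT_TERMINAL_PROMPT=0 makes git fail instead of asking for a username or
# password on the terminal; BatchMode=yes makes ssh fail instead of asking
# for a passphrase or password.
clone_noninteractive() {
    GIT_TERMINAL_PROMPT=0 \
    GIT_SSH_COMMAND="ssh -o BatchMode=yes" \
    git clone -q "$1" "$2"
}

# Demonstration against a throwaway local repository, standing in for the
# real remote (which would normally be an ssh:// or https:// URL).
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

clone_noninteractive "$work/origin" "$work/clone"

origin_head=$(git -C "$work/origin" rev-parse HEAD)
clone_head=$(git -C "$work/clone" rev-parse HEAD)
echo "origin=$origin_head clone=$clone_head"
```

Because this is a real clone rather than init + fetch, the workspace ends up on a local branch, which is the behaviour several commenters above were asking for.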
