
Git plug-in fetches all tags even when refspec is provided

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: git-plugin
    • Labels: None
    • Environment: Windows Server 2008 Master, Jenkins ver. 1.447.2; Linux Jenkins 1.597, Git plugin 2.2.12.

      When a repository is configured to fetch a specific refspec from the upstream repository, all tags are fetched.

      The documentation specifically states that when a refspec is specified, only that refspec will be fetched.

      NOTE: This is not a duplicate of JENKINS-6124. That issue is requesting this to be optional for all cases. This issue is requesting only that the behavior in this specific scenario behave as documented. (i.e. fix the bug)

          [JENKINS-14572] Git plug-in fetches all tags even when refspec is provided

          Douglas Beatty created issue -

          Douglas Beatty added a comment -

          Copied the following comment from JENKINS-6124, as it is relevant here as well:

          For a standard source code repository, this is not significant performance-wise, and we have not noticed the issue on our source code repositories.

          In general, however, it is not good that the plug-in fetches references, and therefore objects, that the build does not require. It is simply inefficient, even if the cost on source code repositories is too small for us to have noticed.

          I have created a new issue, JENKINS-14572, that targets our specific scenario.

          Where we have noticed it is in the Git repositories we use to deploy our web applications, which are large repositories of binary files.

          We create each deployment image as an orphan commit, tag it, and push the tag, not the branch, back to our Git server. This way we can delete the tags after a holding period, leaving those commits unreferenced and eligible for garbage collection.

          We have the Jenkins plug-in configured to pull down an 'empty' branch by specifying a refspec. When all of the tags are fetched, a large number of unneeded objects are pulled down from the repository, greatly increasing the size of the repositories on the Jenkins slaves.

          Having said all of that, if you want to build a specific tag, why not specify it in the refspec? The refspec can use build parameters as well. Why fetch all tags when you only need one?
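The behavior requested in this comment, fetching only the configured refspec and no tags, can be reproduced with plain Git. The sketch below is only an illustration: `fetch_single_ref` is a hypothetical helper name, not part of the Git plug-in, and the URL and branch name in the usage line are placeholders. It relies on the standard `--no-tags` option of `git fetch`.

```shell
# Sketch: fetch exactly one refspec into the current repository, without tags.
# fetch_single_ref is a hypothetical helper, not something the plug-in provides.
fetch_single_ref() {
    url=$1      # remote URL or local path
    refspec=$2  # e.g. +refs/heads/empty:refs/remotes/origin/empty
    # --no-tags turns off tag fetching, so tagged commits that are not
    # reachable from the requested ref are never downloaded.
    git fetch --no-tags --progress "$url" "$refspec"
}
```

Possible usage, run inside an existing workspace: `fetch_single_ref ssh://user@example.com/Project.git +refs/heads/empty:refs/remotes/origin/empty`.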

          Nicolas De Loof added a comment -

          A Git repository isn't designed to hold large binary files; you're abusing Git in this scenario.
          Nicolas De Loof made changes -
          Resolution New: Won't Fix [ 2 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          Douglas Beatty added a comment -

          Git is handling our use case better than any other tool currently available to us, and it is doing so flawlessly (wish I could say the same for this plug-in).

          FWIW, I didn't say 'large binary files'. I said 'large repositories of binary files'. Individual file size was not specified. Your assumption is incorrect.

          Regardless, we have a workaround for this. We are using Git directly via scripting rather than through this plug-in, which in many cases is easier and more efficient anyway.

          Martin d'Anjou added a comment -

          I would like to reopen the issue.

          The documentation says:

          When do you want to modify this value? A good example is when you want to just retrieve one branch. For example, +refs/heads/master:refs/remotes/origin/master would only retrieve the master branch and nothing else.

          But when I set the refspec to +refs/heads/master:refs/remotes/origin/master, the repo in the workspace contains other references (branches) as well.

          As I try to learn how to use this plugin, I find this confusing. Please change either the documentation or the behavior.

          This is on Linux, Jenkins 1.597, Git plugin 2.2.12.
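One way to check this observation in a workspace is to list every ref other than the one the configured refspec should have created. The following is a minimal sketch, assuming it runs inside the workspace repository; `unexpected_refs` is a hypothetical helper name.

```shell
# Sketch: print every ref in the current repository except the expected one.
# unexpected_refs is a hypothetical helper name.
unexpected_refs() {
    expected=$1  # e.g. refs/remotes/origin/master
    # for-each-ref lists local branches, remote-tracking refs, and tags;
    # grep -v -x -F drops the line that exactly equals the expected ref.
    git for-each-ref --format='%(refname)' | grep -v -x -F -e "$expected"
}
```

If the refspec had been honored as documented, `unexpected_refs refs/remotes/origin/master` would print nothing. Note that `grep` then exits with a non-zero status, so callers running under `set -e` need to allow for that.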
          Martin d'Anjou made changes -
          Resolution Original: Won't Fix [ 2 ]
          Status Original: Resolved [ 5 ] New: Reopened [ 4 ]
          Martin d'Anjou made changes -
          Environment Original: Windows Server 2008 Master, Jenkins ver. 1.447.2 New: Windows Server 2008 Master, Jenkins ver. 1.447.2; Linux Jenkins 1.597, Git plugin 2.2.12.

          Robert Moore added a comment -

          I'm seeing this behavior on Jenkins 1.598, Git-Plugin 2.3.5. It appears that git-plugin first fetches all branches with depth=1 (~500MB in our case), and then it fetches the specific change it needs (~20MB). I've tested locally and the second fetch is the only one needed for a build.

          > git init /path/to/.jenkins/workspace/Example-Project-Job
          Fetching upstream changes from ssh://user@example.com/Project.git
          > git --version
          using GIT_SSH to set credentials The Credentials for the user system user
          > git fetch --tags --progress ssh://user@example.com/Project.git +refs/heads/*:refs/remotes/origin/* --depth=1
          > git config remote.origin.url ssh://user@example.com/Project.git
          > git config remote.origin.fetch +refs/heads/*:refs/remotes/origin/*
          > git config remote.origin.url ssh://user@example.com/Project.git
          Fetching upstream changes from ssh://user@example.com/Project.git
          using GIT_SSH to set credentials The Credentials for the user system user
          > git fetch --tags --progress ssh://user@example.com/Project.git refs/changes/93/3393/2
          > git rev-parse 12345...123^{commit}
          Checking out Revision 12345...123 (mainline)
          > git config core.sparsecheckout
          > git checkout -f 12345...123
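The local test described in this comment, where the second fetch alone is enough for a build, can be approximated as follows. This is a sketch under assumptions rather than what the plug-in does: `fetch_change_only` is a hypothetical helper name, and adding `--no-tags` and `--depth=1` to the single fetch is an assumption about what the build needs.

```shell
# Sketch: populate a fresh workspace from a single ref and nothing else.
# fetch_change_only is a hypothetical helper name.
fetch_change_only() {
    workspace=$1  # directory to create
    url=$2        # remote URL
    ref=$3        # e.g. refs/changes/93/3393/2
    git init -q "$workspace" || return 1
    # One shallow fetch of the requested ref: no branch wildcard, no tags.
    git -C "$workspace" fetch --no-tags --depth=1 "$url" "$ref" || return 1
    # git fetch records the fetched commit as FETCH_HEAD; check it out detached.
    git -C "$workspace" checkout -q -f FETCH_HEAD
}
```

Possible usage with the values from the log above: `fetch_change_only /path/to/.jenkins/workspace/Example-Project-Job ssh://user@example.com/Project.git refs/changes/93/3393/2`.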


          Zeus Minos added a comment -

          Is there any temporary workaround? Fetching all the tags in our case is just a waste of time.
          00:01:28.364 > C:\Program Files\Git\cmd\git.exe -c core.askpass=true fetch --tags --progress git@gitlab.dev.local:superman/b.git +refs/heads/*:refs/remotes/origin/* --depth=1
          00:07:05.694 > C:\Program Files\Git\cmd\git.exe config remote.origin.url git@gitlab.dev.local:superman/b.git # timeout=10

          As you can see, the fetch takes more than five minutes (00:01:28 to 00:07:05) before the next command runs. This is just a waste of time :-/

          Please share any workaround with me.


            Assignee: Unassigned
            Reporter: Douglas Beatty (dt25954)
            Votes: 6
            Watchers: 12