    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Minor
    • Component: git-plugin
    • Jenkins LTS v2.263.3
      Git plugin 4.5.2
      Git 2.26.2

      We have a large Git repository (recently converted from SVN) and many builds, so keeping what is cloned and checked out as small as we can is a priority.

      Although the Git plugin offers a shallow clone ('CloneOption', shallow: true, depth: 1), we get no change log (JENKINS-45586).

      I believe offering partial cloning via Git's --filter=blob:none (and, I guess for completeness, 'blob:limit=<n>' and 'tree:<depth>') would be a better option, as it keeps the .git folder small and fetches only the blobs required for the build.

      To illustrate, testing on my repo went as follows:

      git fetch --no-tags https://server/repo.git +refs/heads/foo:refs/remotes/origin/foo
      git checkout foo
      .git folder = 12.2 GiB
      repo folder = 19.75 GiB
      
      git fetch --no-tags --depth=1 https://server/repo.git +refs/heads/foo:refs/remotes/origin/foo
      git checkout foo
      .git folder = 2.82 GiB
      repo folder = 9.37 GiB
      NO HISTORY
      
      git fetch --no-tags --filter=blob:none https://server/repo.git +refs/heads/foo:refs/remotes/origin/foo
      git checkout foo
      .git folder = 2.9 GiB
      repo folder = 9.45 GiB
      FULL HISTORY for the foo branch
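
The comparison above can be reproduced in miniature against a throwaway local repository. This is only a sketch under assumptions: the server repository, the branch name foo, and the fetch_variant helper are made up for illustration, and a named remote (origin) is used instead of the bare URL. It counts the commits each variant ends up with rather than measuring disk usage.

```shell
#!/bin/sh
# Sketch: full vs. shallow vs. blobless fetch against a local test repository.
set -eu

work=$(mktemp -d)

# Build a tiny "server" repository with two commits on branch foo.
git init -q "$work/server" >/dev/null
git -C "$work/server" symbolic-ref HEAD refs/heads/foo
git -C "$work/server" config user.email "sketch@example.invalid"
git -C "$work/server" config user.name "Sketch"
# The serving side must allow object filters for partial clone to work.
git -C "$work/server" config uploadpack.allowFilter true
echo one > "$work/server/file.txt"
git -C "$work/server" add file.txt
git -C "$work/server" commit -q -m "first"
echo two > "$work/server/file.txt"
git -C "$work/server" commit -q -am "second"

# fetch_variant <dir> [extra git-fetch options...]
# Fetches branch foo into a fresh repository and prints the number of
# commits reachable from the fetched ref.
fetch_variant() {
  dir=$1; shift
  git init -q "$dir" >/dev/null
  git -C "$dir" remote add origin "file://$work/server"
  git -C "$dir" fetch -q --no-tags "$@" origin \
      "+refs/heads/foo:refs/remotes/origin/foo"
  git -C "$dir" rev-list --count refs/remotes/origin/foo
}

full=$(fetch_variant "$work/full")
shallow=$(fetch_variant "$work/shallow" --depth=1)
partial=$(fetch_variant "$work/partial" --filter=blob:none)

echo "commits: full=$full shallow=$shallow partial=$partial"
```

The shallow variant sees a single commit, while the blobless variant keeps the complete commit history of the branch, which is what makes a change log possible.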
      

      I did try to work around the lack of this feature by manually configuring the Git repo for partial clone before using the Jenkins Checkout GitSCM, but immediately ran into the error:

      git config remote.https://server/repo.git.promisor true
      git config remote.https://server/repo.git.partialclonefilter blob:none
      
      ERROR: Error fetching remote repo 'origin'
      hudson.plugins.git.GitException: Failed to fetch from https://server/repo.git
      at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:909)
      at ...
      Caused by: hudson.plugins.git.GitException: Command "C:\git-2.26.2\bin\git.exe fetch --no-tags --force --progress -- https://server/repo.git +refs/heads/foo:refs/remotes/origin/foo" returned status code 1:
      
      error: https://server/repo.git did not send all necessary objects
      

          [JENKINS-64844] Add Partial Clone ability to Git plugin

          Mark Waite added a comment - - edited

          I don't plan to implement a git plugin configuration that would support the --filter option.

          JENKINS-28335 proposes to provide a Pipeline wrapper method that will allow users to use command line git with whatever commands they prefer. A proposal with more details has been submitted as a Google Summer of Code 2021 project idea. Workarounds until that is available include embedding the credentials into the https URL or using a withCredentials step to provide the ssh private key so that command line git can be used with the command line arguments you prefer.
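
As a rough sketch of the command line workaround described above, a shell step run inside a withCredentials block might look like the following. The function name partial_checkout and the SSH_KEY_FILE variable are assumptions for illustration (the variable is assumed to be populated by the credentials binding); GIT_SSH_COMMAND is the standard git environment variable for overriding the ssh command.

```shell
#!/bin/sh
# Sketch: blobless fetch and checkout with command line git.
set -eu

# partial_checkout <url> <branch> <dir>
partial_checkout() {
  url=$1 branch=$2 dir=$3

  # Assumption: SSH_KEY_FILE, if set, points at a private key file
  # provided by the surrounding credentials binding.
  if [ -n "${SSH_KEY_FILE:-}" ]; then
    GIT_SSH_COMMAND="ssh -i $SSH_KEY_FILE -o IdentitiesOnly=yes"
    export GIT_SSH_COMMAND
  fi

  git init -q "$dir" >/dev/null
  git -C "$dir" remote add origin "$url"
  # Fetch commits and trees only; blobs are downloaded on demand.
  git -C "$dir" fetch -q --no-tags --filter=blob:none origin \
      "+refs/heads/$branch:refs/remotes/origin/$branch"
  # Checking out the branch fetches just the blobs it needs.
  git -C "$dir" checkout -q -B "$branch" "refs/remotes/origin/$branch"
}
```

A call such as `partial_checkout "$REPO_URL" foo "$WORKSPACE/src"` would then replace the plugin checkout for that job; REPO_URL here is a placeholder.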

          The --filter argument appears to only be available with git clone, not with git fetch, and seems to only have been added to git 2.27 and later. The git plugin needs to support command line git versions on common Linux operating systems. Command line git 2.27 is newer than the command line git provided with most Linux operating systems.


          Chris Lake added a comment -

          Thanks for the update Mark.

          For others who may stumble upon this ticket with similar repositories looking for similar solutions, I had not realised the power of the "reference repository". The Git-Plugin documentation doesn't say much, but having one clone of the repository somewhere on disk, accessible to the build for reference, keeps the build's own .git folder down to several MiB, which is a far greater saving than even the partial clone could hope to achieve.
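
With command line git, the same idea can be sketched with `git clone --reference`, which records the shared repository as an alternate object store instead of copying its objects. The helper name clone_with_reference and the paths are hypothetical; in the plugin this presumably corresponds to the `reference` field of `CloneOption`.

```shell
#!/bin/sh
# Sketch: keep one shared mirror on disk and borrow its objects in each build clone.
set -eu

# clone_with_reference <url> <reference-dir> <dest>
clone_with_reference() {
  url=$1 ref=$2 dest=$3

  if [ -d "$ref" ]; then
    # Refresh the existing mirror so most objects are already local.
    git -C "$ref" fetch -q --prune origin
  else
    # First use: create the shared bare mirror.
    git clone -q --mirror "$url" "$ref"
  fi

  # Objects found in the mirror are referenced through
  # .git/objects/info/alternates rather than stored again.
  git clone -q --reference "$ref" "$url" "$dest"
}
```

The build clone depends on the reference repository staying in place; if the mirror is deleted or pruned aggressively, the clones that borrow from it can break.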


          Mark Waite added a comment - - edited

          I gave a talk a few years ago related to large repositories and git.

          https://youtu.be/TsWkZLLU-s4?t=137

          Another talk that I gave a year later:

          https://youtu.be/jBGFjFc6Jf8?t=6434


          maliku added a comment -

          As a reference,

          https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/

           

          Manually I can do the following and end up with a .git folder of 900 KB; without the --filter options my .git is over 1.1 GB.

           

          git init
          git remote add origin <url>

          git config core.sparseCheckout true

          git config credential.useHttpPath true

          git rev-list --objects --filter=tree:0 --filter=blob:none
          git fetch --depth 1 origin <branch>
          git sparse-checkout init --cone
          git sparse-checkout set <path>
          git sparse-checkout reapply
          git reset --hard origin/<branch>
          git checkout --force <branch>
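
A more compact variant of the same idea, combining a blobless clone with a cone-mode sparse checkout, might look like the sketch below. It is not the exact sequence above: the function name sparse_partial_clone is made up for illustration, and it assumes git 2.25 or later (where the sparse-checkout command exists) and a server that permits object filters.

```shell
#!/bin/sh
# Sketch: blobless clone restricted to one directory via cone-mode sparse checkout.
set -eu

# sparse_partial_clone <url> <branch> <path> <dir>
sparse_partial_clone() {
  url=$1 branch=$2 path=$3 dir=$4

  # Fetch commits and trees only, and do not populate the working tree yet.
  git clone -q --filter=blob:none --no-checkout --branch "$branch" "$url" "$dir"

  # Limit the working tree to top-level files plus the requested directory.
  git -C "$dir" sparse-checkout init --cone
  git -C "$dir" sparse-checkout set "$path"

  # Populate the working tree; only blobs inside the cone are downloaded.
  git -C "$dir" checkout -q "$branch"
}
```

For the library use case described next, a call such as `sparse_partial_clone <url> <branch> jenkins lib` would check out only the jenkins directory.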

           

          This is particularly important in my case because I load common libraries from different git repositories:

           

              def String customLibraryPath = 'jenkins'
              def List<Object> extensionsObjects = [[$class: 'CloneOption', noTags: false, reference: '', shallow: true],[$class: 'SparseCheckoutPaths', sparseCheckoutPaths: [[path: '/'+customLibraryPath+'/']]]]
              library changelog: false, identifier: loading+'@'+testbranch, retriever: modernSCM(scm: [$class: 'GitSCMSource', credentialsId: credentials, remote: urlpath, extensions: extensionsObjects], libraryPath: customLibraryPath)

           

          I would like to have the --filter options in the "load library from SCM" use case.

           

           


          Michael Kriese added a comment -

          Would you be open to making the filter parameter configurable via a pipeline option?

          Our pipelines would really benefit from blobless cloning.

            Assignee: Unassigned
            Reporter: Chris Lake (amolago)