Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-59139

don't set GIT_LFS_SKIP_SMUDGE=1 anymore while cloning with git 2.15+

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Minor Minor
    • git-client-plugin
    • None

      Since commit 853603cccd4434b116ef9b8e094c3f5b815aa75a, we set GIT_LFS_SKIP_SMUDGE=1  to improve performance, and require users to add the "Git LFS pull after checkout" additional behaviour to resolve pointers. 

      This was mostly done because git clone performance was pretty bad with the smudge filter enabled, and in some configurations, (ssh clone URLs), git lfs pull didn't work.

      Since then, a lot has improved:

      Clone performance through the smudge filter has improved since https://github.com/git-lfs/git-lfs/pull/2511.

      Backends should support the git-lfs-authenticate dance as described in https://github.com/git-lfs/git-lfs/blob/master/docs/api/authentication.md#ssh, which retrieves a temporary authorization token to do the clone over https.

      I propose to just do a git clone without setting GIT_LFS_SKIP_SMUDGE=1. Most systems should have done a git lfs install before, so they will resolve the lfs pointers during a clone. We can keep the "Git LFS pull after checkout" additional behaviour, for systems that don't have setup global smudge filters via git lfs install.

          [JENKINS-59139] don't set GIT_LFS_SKIP_SMUDGE=1 anymore while cloning with git 2.15+

          Mark Waite added a comment - - edited

          Your description seems reasonable to me. I want to see the results of evaluating a pull request on the Linux (Amazon Linux, CentOS, Debian, Ubuntu), Windows, and FreeBSD environments where I run tests regularly. Will you submit the pull request?

          PR 2511 to git-lfs seems to indicate that specific versions of command line git and of git lfs are needed in order to gain the benefit of the change. Can you confirm that command line git versions 2.15 and later are needed and that git lfs 2.3.0 and later are needed?

          If so, then the pull request will need to check the command line git version and the git lfs version before removing the GIT_LFS_SKIP_SMUDGE environment variable. If that conditional is not included in the change, then many operating systems that deliver older versions of command line git or older versions of git lfs will suffer a performance loss from removing that environment variable.

          Operating System CLI git version delivered
          Amazon Linux 2 2.17
          CentOS 6 1.7.1
          CentOS 7 1.8
          Debian 9 2.11
          Debian 10 2.20
          Ubuntu 16 2.7
          Ubuntu 18 2.17
          FreeBSD 12 2.21

          Mark Waite added a comment - - edited Your description seems reasonable to me. I want to see the results of evaluating a pull request on the Linux (Amazon Linux, CentOS, Debian, Ubuntu), Windows, and FreeBSD environments where I run tests regularly. Will you submit the pull request? PR 2511 to git-lfs seems to indicate that specific versions of command line git and of git lfs are needed in order to gain the benefit of the change. Can you confirm that command line git versions 2.15 and later are needed and that git lfs 2.3.0 and later are needed? If so, then the pull request will need to check the command line git version and the git lfs version before removing the GIT_LFS_SKIP_SMUDGE environment variable. If that conditional is not included in the change, then many operating systems that deliver older versions of command line git or older versions of git lfs will suffer a performance loss from removing that environment variable. Operating System CLI git version delivered Amazon Linux 2 2.17 CentOS 6 1.7.1 CentOS 7 1.8 Debian 9 2.11 Debian 10 2.20 Ubuntu 16 2.7 Ubuntu 18 2.17 FreeBSD 12 2.21

          Florian Klink added a comment -

          I did some quick digging, git >= 2.15 and git-lfs >= 2.3.0 sounds reasonable, by looking at the tags containing commits part of those PRs.

          `git-client-plugin` currently lacks a way to determine the version of git-lfs, but that should not be too hard to add.

           

          However, changing the logic might be a bit more involved if we want to do the version check as described above (and I think it makes sense to do so).

           

          Currently, the `GitLFSPull` `GitSCMExtension` ("Git LFS pull after checkout") simply calls `CliGitAPIImpl.lfsRemote` with `repos.get(0).getName()` - and inside `CliGitAPIImpl`, all LFS-related functionality is conditional on `(lfsRemote != null)`.

           

          If we now want to rely on the smudge filter for recent enough git(-lfs) versions, and only have the `GitLFSPull` `GitSCMExtension` to trigger `git lfs pull` for old git(-lfs) versions,`GitLFSPull` should be changed to simply flip a boolean in `CliGitAPIImpl`, and the logic in there should  be made around that. Does that sound right to you?

          Florian Klink added a comment - I did some quick digging, git >= 2.15 and git-lfs >= 2.3.0 sounds reasonable, by looking at the tags containing commits part of those PRs. `git-client-plugin` currently lacks a way to determine the version of git-lfs, but that should not be too hard to add.   However, changing the logic might be a bit more involved if we want to do the version check as described above (and I think it makes sense to do so).   Currently, the `GitLFSPull` `GitSCMExtension` ("Git LFS pull after checkout") simply calls `CliGitAPIImpl.lfsRemote` with `repos.get(0).getName()` - and inside `CliGitAPIImpl`, all LFS-related functionality is conditional on `(lfsRemote != null)`.   If we now want to rely on the smudge filter for recent enough git(-lfs) versions, and only have the `GitLFSPull` `GitSCMExtension` to trigger `git lfs pull` for old git(-lfs) versions,`GitLFSPull` should be changed to simply flip a boolean in `CliGitAPIImpl`, and the logic in there should  be made around that. Does that sound right to you?

          Mark Waite added a comment -

          That sounds reasonable to me.

          Mark Waite added a comment - That sounds reasonable to me.

          Florian Klink added a comment -

          I fear I won't find the time to actually do this, sorry - hopefully somebody else can pick up from here.

          Florian Klink added a comment - I fear I won't find the time to actually do this, sorry - hopefully somebody else can pick up from here.

          René Scheibe added a comment -

          Please also keep JENKINS-47531 in mind. When a password is provided, it can only be used if git-lfs did not yet run automatically on checkout.

          See https://github.com/jenkinsci/git-client-plugin/pull/280.

          René Scheibe added a comment - Please also keep JENKINS-47531 in mind. When a password is provided, it can only be used if git-lfs did not yet run automatically on checkout. See https://github.com/jenkinsci/git-client-plugin/pull/280 .

          yamhai added a comment -

          I've submitted a PR for this: https://github.com/jenkinsci/git-client-plugin/pull/476. Can you review it?

          Should I modify it as per Florian's last comment, where as far as I understand, all I need to do is change (lfsRemote != null) logic to something like (isLfsPresent) being initialized from `GitLFSPull`. Is that right?

          yamhai added a comment - I've submitted a PR for this:  https://github.com/jenkinsci/git-client-plugin/pull/476 . Can you review it? Should I modify it as per Florian's last comment, where as far as I understand, all I need to do is change  (lfsRemote != null)  logic to something like (isLfsPresent) being initialized from `GitLFSPull`. Is that right?

          Dan Dumont added a comment -

          If this ticket is resolved, I would hope it would be with an option to disable the lfs pull (optionally set the skip smudge env var)

          Dan Dumont added a comment - If this ticket is resolved, I would hope it would be with an option to disable the lfs pull (optionally set the skip smudge env var)

          Mark Waite added a comment -

          ddumont you're encouraged to test the pull request build that yamhai submitted. I'm not working on this area of the plugin right now so that I can focus my efforts to prepare for the plugin release that uses symbols in the pipeline definition instead of using Java class names.

          Mark Waite added a comment - ddumont you're encouraged to test the pull request build that yamhai submitted. I'm not working on this area of the plugin right now so that I can focus my efforts to prepare for the plugin release that uses symbols in the pipeline definition instead of using Java class names.

          Mark Waite added a comment -

          Based on the comment from ttaylorr on GitHub, we'll need to do this before the release of git LFS 3.0.0.

          ttaylorr on GitHub says:

          Hi everybody, I am going to close this issue since 'git-lfs-clone(1)' is marked as deprecated. We have marked the command as such since the process filter makes pure 'git-clone(1)''s just as fast, without the need for this "wrapper" command.

          Since the command is deprecated, we'll be removing it in v3.0.0, and not updating it between now and then (save for things such as security vulnerabilities, etc.)

          Since they will be removing git lfs clone I'm confident they are also likely to remove git lfs fetch. Another good excuse to create an administrative monitor that warns admins when the git plugin detects an older version of git lfs or an older version of git or a surprising combination of git and git lfs.

          Mark Waite added a comment - Based on the comment from ttaylorr on GitHub, we'll need to do this before the release of git LFS 3.0.0. ttaylorr on GitHub says: Hi everybody, I am going to close this issue since 'git-lfs-clone(1)' is marked as deprecated. We have marked the command as such since the process filter makes pure 'git-clone(1)''s just as fast, without the need for this "wrapper" command. Since the command is deprecated, we'll be removing it in v3.0.0, and not updating it between now and then (save for things such as security vulnerabilities, etc.) Since they will be removing git lfs clone I'm confident they are also likely to remove git lfs fetch . Another good excuse to create an administrative monitor that warns admins when the git plugin detects an older version of git lfs or an older version of git or a surprising combination of git and git lfs .

          Patrick Lang added a comment - - edited

          A few points:

          1. re: "Backends should support the git-lfs-authenticate dance as described in https://github.com/git-lfs/git-lfs/blob/master/docs/api/authentication.md#ssh, which retrieves a temporary authorization token to do the clone over https."

          I don't think this statement is true in non-SSH cases and not all backends. I'm using GitHub token auth (HTTPS) and Artifactory tokens (also HTTPS). I don't think we should require people to switch to SSH to use LFS. Additionally, the credentials can differ between GitHub & Artifactory as mentioned in JENKINS-47531 .

          2. I don't think git lfs fetch can be deprecated. checkout was briefly deprecated, then undeprecated as they decided to add new functionality to it (https://github.com/git-lfs/git-lfs/pull/3303). The checkout man page states "Try to ensure that the working copy contains file content for Git LFS objects for the current ref, if the object data is available. Does not download any content, see git-lfs-fetch(1) for that."

           

          So in conclusion - even if we change the default behavior to allow the smudge filter, there will be cases where additional LFS configuration may be possible (deferring pull or changing checkout settings) or even required (auth). If this is merged, I think we need to be able to continue to opt out of git LFS pull and allow a customized pull or fetch+checkout.

          Patrick Lang added a comment - - edited A few points: 1. re: "Backends should support the  git-lfs-authenticate  dance as described in  https://github.com/git-lfs/git-lfs/blob/master/docs/api/authentication.md#ssh , which retrieves a temporary authorization token to do the clone over https." I don't think this statement is true in non-SSH cases and not all backends. I'm using GitHub token auth (HTTPS) and Artifactory tokens (also HTTPS). I don't think we should require people to switch to SSH to use LFS. Additionally, the credentials can differ between GitHub & Artifactory as mentioned in JENKINS-47531 . 2. I don't think git lfs fetch can be deprecated. checkout was briefly deprecated, then undeprecated as they decided to add new functionality to it ( https://github.com/git-lfs/git-lfs/pull/3303 ). The checkout man page states "Try to ensure that the working copy contains file content for Git LFS objects for the current ref, if the object data is available. Does not download any content, see git-lfs-fetch(1) for that."   So in conclusion - even if we change the default behavior to allow the smudge filter, there will be cases where additional LFS configuration may be possible (deferring pull or changing checkout settings) or even required (auth). If this is merged, I think we need to be able to continue to opt out of git LFS pull and allow a customized pull or fetch+checkout.

            Unassigned Unassigned
            flokli Florian Klink
            Votes:
            1 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated: