
Non-workspace polling using git ls-remote doesn't work for several edge cases

      Right now, 'git ls-remote' is used by default to check for changes, and in several cases this results in incorrect behavior.

      1. Currently, '-h' is passed to ls-remote, which restricts the output to refs under refs/heads. If the refspec is set to fetch refs/pull-requests/*/from, for example, passing -h will cause those refs to be ignored (this example comes from Atlassian Stash).

      2. The plugin only uses the first result of ls-remote. In the case of branch strings with wildcards, there could be multiple results.

      3. If the refspec remaps between the remote and the local, ls-remote isn't suitable to check for changes because the branch might have a different name. For example, if the refspec '+refs/pull-requests/*/from:refs/heads/origin/pr/*' is used with a branch string of 'origin/pr/**', checking the remote for changes to origin/pr/** won't work.
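
To make case 3 concrete, here is a minimal Python sketch of how a single-wildcard fetch refspec renames a remote ref. The helper `map_remote_to_local` is hypothetical (it is not part of the git plugin) and only handles the simple one-`*` form used in this report.

```python
def map_remote_to_local(refspec, remote_ref):
    """Return the local ref name that a fetch refspec assigns to remote_ref,
    or None if the refspec's source side does not match it.

    Hypothetical helper for illustration; handles at most one '*' per side.
    """
    spec = refspec.lstrip("+")  # a leading '+' only permits non-fast-forward updates
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The text matched by '*' on the source side replaces '*' on the destination side.
    wildcard = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", wildcard, 1)


refspec = "+refs/pull-requests/*/from:refs/heads/origin/pr/*"
print(map_remote_to_local(refspec, "refs/pull-requests/379/from"))
print(map_remote_to_local(refspec, "refs/heads/master"))
```

The first call yields a name under refs/heads/origin/pr/, which exists only in the local repository after a fetch. The remote still knows the ref under refs/pull-requests/, so asking the remote about 'origin/pr/**' cannot find it.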

      It looks like the plugin is already smart enough to switch over to workspace-based polling in some of these cases if the branch string is specified directly, but we've run into cases where it reverts to using 'git ls-remote' when the branch comes from a job parameter.

      All of our projects have a BRANCH parameter with a default value that's used to fill in the git plugin's branch field.

      In addition, we have two refspecs for normal jobs - the regular all branches one and one to match tags. Pull request jobs point at Atlassian Stash pull requests (+refs/pull-requests/*/from:refs/heads/origin/pr/*).

      When Jenkins gets notified to poll for changes, it appears to outright ignore anything that wouldn't be matched by a default refspec - in this case, it won't match tags or pull requests since they aren't present in a standard fetch, despite explicitly setting the refspec. Worse, if both the default refspec and the tags refspec are listed, it won't trigger on any changes, reporting that nothing matches the configured refspec, not even master.

      This worked just fine in the prior version of the git plugin (2.2.7). I've reproduced the problem on both Jenkins 1.584 and 1.609.1. I've tried forcing re-clone and forcing polling with a workspace to no effect.

      It seems to have something to do with the parameterized branch, as using a concrete branch value instead of a parameter seemed to make the issue go away.

          [JENKINS-29631] Non-workspace polling using git ls-remote doesn't work for several edge cases

          Mark Waite added a comment -

          Thanks for the report and my apologies that 2.4.0 introduced a regression. I had high hopes to introduce no regressions with that release.

          The parameterized branch polling in 2.3.5 (and possibly before) would incorrectly poll the most recently used branch, instead of polling the branch named by the default value of the parameter. In your use case, were you relying on it to poll the most recently used branch?

          JENKINS-27352 was fixed in 2.4.0 to poll with the default value of the "branch to build" parameter. That fix may also have caused it to ignore the refspecs in the polling.

          JENKINS-17348, JENKINS-27327, and JENKINS-27351 were also fixed in 2.4.0 and might have had an impact.

          Any chance you'd be willing to construct a JUnit based test case in the code which would show the problem?


          Jason Miller added a comment - - edited

          I'm no longer certain this was working properly in 2.3.5 either, as I'm now unable to make it work again even by reverting back to that version. I'm going to continue investigating and see if I can pin down when this stopped working.

          I suspect the problem has something to do with the fact that the pull request refs aren't present unless an explicit fetch is done using the correct refspec, because it works just fine if the branch parameter resolves to a normal branch under refs/heads (e.g. "master"). Pull request refs aren't under refs/heads on the remote - they're under refs/pull-requests.

          I don't know much about Jenkins plugin development or JUnit, but I'll take a look at it if I can pin down a working case again.

          Edit: I re-checked our update logs, and it turns out the version we had when this worked was 2.2.7, not 2.3.5. I reverted my 1.609.1 instance to use 2.2.7, and it worked! Will test between 2.2.8 and 2.3.5 to try and pin it down.


          Jason Miller added a comment -

          Short version:

          This never worked; 2.3.5 / 2.4.0 just made it more obvious that it didn't work. It only works with workspace polling, which is used automatically when the branch spec is concrete.

          -------

          It turns out this is actually a problem between workspace polling and polling using the master. I suspect this has never worked correctly, and we didn't realize because the time between switching to parameterized branch for pull request builds and updating the plugins was fairly short.

          For non-workspace polling, it looks like the master tries to run 'git ls-remote -h ..... BRANCH'.
          In pre-2.3.5, it would try to pass 'origin/pr/**' as just '**' and afterwards it passed nothing (hence why the trigger appeared to break). But neither way works - just by passing the -h flag, it ensures that anything outside of refs/heads won't be visible. Moreover, the refspec remaps the remote refs, so trying to use the branch string won't work, period.

          The reason it works with a concrete value for branch is that it apparently switches the project into polling using the workspace - which works just fine.
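
The effect of the -h flag can be approximated in a few lines of Python. This is only a sketch: `visible_refs` is a hypothetical helper and the advertised ref names are made up for illustration. It follows the documented behavior of ls-remote, where -h limits the listing to refs/heads, -t limits it to refs/tags, and giving both shows both.

```python
def visible_refs(advertised, heads=False, tags=False):
    """Approximate which refs `git ls-remote` lists for the given flags.

    With neither flag set, every advertised ref is listed. With -h and/or -t,
    only refs under refs/heads/ and/or refs/tags/ are listed.
    """
    if not heads and not tags:
        return list(advertised)
    prefixes = []
    if heads:
        prefixes.append("refs/heads/")
    if tags:
        prefixes.append("refs/tags/")
    return [ref for ref in advertised if ref.startswith(tuple(prefixes))]


# Made-up ref names, standing in for what a Stash remote might advertise.
advertised = [
    "refs/heads/master",
    "refs/tags/v1.0",
    "refs/pull-requests/379/from",
    "refs/pull-requests/379/merge",
]

print(visible_refs(advertised, heads=True))
print(visible_refs(advertised))
```

With `heads=True` the pull request refs are dropped before any refspec or branch string is ever compared against them, which is why the trigger can never fire for them under remote polling.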


          Mark Waite added a comment -

          Thanks for the detailed investigation. Much appreciated!

          It looks like there are several interesting cases the plugin does not handle with non-default refspecs (like your refs/pull-requests refspec). Thanks for pointing to those.

          You mentioned that "workspace polling, which is used automatically when the branch spec is concrete". That's not quite the case. Remote polling using git ls-remote is enabled by default on all jobs, but a job can be switched to poll using the workspace by adding the "Force polling using workspace" Additional Behaviour. As far as I know, force polling using workspace is not automatically enabled unless there are conditions in the job (like user exclusions or commit exclusions) which don't get enough information from remote polling.


          Jason Miller added a comment - - edited

          At least on 2.4.0 (git client 1.18.0, Jenkins 1.609.1), when I switched to a concrete branch with the only other additional behaviors being a Stash browser link and automatic git clean, it switched to using workspace polling automatically. If I switched it back to a parameterized branch, it would go back to using `git ls-remote`.

          Forcing workspace polling also fixes the secondary problem I saw with having multiple refspecs and a parameterized branch as well (e.g. '+refs/heads/*:refs/heads/origin/* +refs/heads/tags/*:refs/heads/origin/tags/*')

          Given that 2.4.0 can auto-provision a workspace for polling even if there isn't one yet (or it's been wiped), I've gone ahead and forced workspace polling for all of our jobs to play it safe.

          Do you want me to update this ticket to focus on the ls-remote limitations?


          Mark Waite added a comment -

          Yes, please, update it to focus on the ls-remote limitations so that later searches can more easily identify the specific details.


          Brian Moyles added a comment -

          We're also bitten by this. We use Stash and watch origin/pull-requests/*/from to build pull requests, but some teams have that parameterized so * is the default with the option of specifying a PR number to re-build a given PR. Forcing workspace polling works, but ideally, ls-remote wouldn't filter unless explicitly configured to do so.

          Even better, ls-remote can take ref patterns as arguments and filter the results itself.

          Here: https://github.com/jenkinsci/git-plugin/blob/master/src/main/java/hudson/plugins/git/GitSCM.java#L595

          After the list of heads is returned, each item in the list is compared against each refspec. If the refspec keys were instead passed to ls-remote as arguments, you would immediately have a filtered list of candidates, if I'm not crazy.

          An example: one of our PR build jobs has this refspec:
          +refs/pull-requests/*:refs/remotes/origin/pull-requests/*

          which should ideally result in a ls-remote call like this:

          git ls-remote ssh://git@stash.xxx.yyy.com:7999/someproject/somerepo.git "refs/pull-requests/*"
          63b659d67aa23b7a27c46bec0a9b44a1d2db4bbd        refs/pull-requests/379/from
          adec80914109d97c88d3d1bb37b1e77c3ea7416b        refs/pull-requests/379/merge
          bd7df014503882f9g6763c29ead0c76a93f6f880        refs/pull-requests/387/from
          7db4486b6bb703f6ae0b352fa6e980df94b2458c        refs/pull-requests/395/from
          98c971201b24fbf95fa0b9d69288c9651995ac46        refs/pull-requests/395/merge
          63696e5b897d874fdf8e8ba573f1877436e3b23d        refs/pull-requests/402/from
          d09bcf6ade59e8bfa47c49fd34b3034e2168cd44        refs/pull-requests/402/merge
          e1d6d6c99e2936f87784a3cbb6c693c0e8aaffa7        refs/pull-requests/403/from
          9ca722399ef458f3dae6ed5ef0182581d58eee01        refs/pull-requests/403/merge
          

          which seems like it would limit the amount of work Jenkins is doing up-front...
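
A rough Python sketch of that idea, under stated assumptions: the helper names `ls_remote_command` and `parse_ls_remote` are hypothetical, the refspec string is the pull request refspec from the issue description plus a regular branch refspec, and the sample output reuses two lines from the listing above. Nothing here contacts a remote; it only builds the argument list and parses output text.

```python
def ls_remote_command(url, refspecs):
    """Build a `git ls-remote` argument list whose patterns are the source
    (left-hand) sides of the given whitespace-separated refspecs."""
    patterns = []
    for spec in refspecs.split():
        src = spec.lstrip("+").split(":", 1)[0]
        if src and src not in patterns:
            patterns.append(src)
    return ["git", "ls-remote", url] + patterns


def parse_ls_remote(output):
    """Parse ls-remote output lines of the form '<sha><whitespace><ref>' into {ref: sha}."""
    refs = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            sha, ref = parts
            refs[ref.strip()] = sha
    return refs


url = "ssh://git@stash.xxx.yyy.com:7999/someproject/somerepo.git"
refspecs = ("+refs/heads/*:refs/remotes/origin/* "
            "+refs/pull-requests/*/from:refs/heads/origin/pr/*")
print(ls_remote_command(url, refspecs))

sample_output = (
    "63b659d67aa23b7a27c46bec0a9b44a1d2db4bbd        refs/pull-requests/379/from\n"
    "adec80914109d97c88d3d1bb37b1e77c3ea7416b        refs/pull-requests/379/merge\n"
)
print(parse_ls_remote(sample_output))
```

Passing the source sides rather than the branch string also sidesteps the remapping problem, since the patterns are expressed in the remote's own ref names.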


          Brian Moyles added a comment -

          To add onto that, it looks like (at a glance, again) the git client has a GitClient#getRemoteReferences call that does just that (albeit for a single pattern). The CLI implementation passes the pattern as an argument, whereas the JGit implementation appears to convert it to a regex. Perhaps an additional convenience getRemoteReferences method that takes a list of patterns would simplify things?
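
As an illustration of the list-of-patterns idea (in Python rather than the git client's Java, and not the actual GitClient API), the sketch below converts each glob pattern to a regex and keeps any ref that matches at least one of them. `filter_refs_by_patterns` is a hypothetical name, and Python's fnmatch rules are only an approximation of how git itself matches ls-remote patterns. The hashes are taken from the listing earlier in this thread.

```python
import fnmatch
import re


def filter_refs_by_patterns(refs, patterns):
    """Return the subset of {ref_name: sha} whose names match any glob pattern.

    Each glob is translated to a regex once, then every ref name is tested
    against the compiled regexes.
    """
    compiled = [re.compile(fnmatch.translate(p)) for p in patterns]
    return {
        name: sha
        for name, sha in refs.items()
        if any(rx.match(name) for rx in compiled)
    }


refs = {
    "refs/pull-requests/379/from": "63b659d67aa23b7a27c46bec0a9b44a1d2db4bbd",
    "refs/pull-requests/379/merge": "adec80914109d97c88d3d1bb37b1e77c3ea7416b",
    "refs/pull-requests/395/from": "7db4486b6bb703f6ae0b352fa6e980df94b2458c",
}

matched = filter_refs_by_patterns(refs, ["refs/heads/*", "refs/pull-requests/*/from"])
print(sorted(matched))
```

Returning every match, rather than just the first, would also address the wildcard case from the issue description where several refs can match one branch string.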


            Unassigned Unassigned
            stormtau Jason Miller
            Votes: 9
            Watchers: 14
