
when workflow uses multiple git repos the "git build data" and "tags" become next to useless.

    • Pipeline - October, Pipeline - April 2018

      If you have a workflow that uses multiple git repositories, the actions and data contributed to the build by the git plugin produce almost unusable visual spam.

      The tag action does not let you know which repository you are tagging, nor does the build data tell you which repository it is that has the specified hash.

      Coupled with this you end up with 2 * the number of repos used (plus another 2 if you use workflow from SCM) actions - which simply does not scale. The actions should be refactored so there is one action that can display the data from multiple repositories / invocations and it should be clear which revision comes from which repo.

      node {
        git url: 'git@github.com:jenkinsci/git-client-plugin.git' 
        git url: 'git@github.com:jenkinsci/git-plugin.git' 
        git url: 'git@github.com:jenkinsci/github-plugin.git' 
      // just add more random repos to get the picture...
      }
      

          [JENKINS-29840] when workflow uses multiple git repos the "git build data" and "tags" become next to useless.

          Andrew Bayer added a comment -

          So this isn't actually a dupe of JENKINS-29326 - it's a separate symptom masked by that JIRA.

          Andrew Bayer added a comment -

          From JENKINS-29326:

          So there is another aspect to this - with my fix, you don't get duplicate BuildData added to the build, but if you have multiple repos checked out during the build, the "Git Build Data" links are all actually the same link, which makes sense. The build page shows the multiple BuildData actions, but .../(job)/(build number)/git is just the first one. I'm honestly not sure the right way to make those distinct, though - something involving getUrlName for sure.

          Andrew Bayer added a comment -

          jglick, oleg_nenashev - do either of you, by any chance, know of a case where an Action can have multiples on a single build and the sidebar links end up different? Trying to figure out an example to derive from for reworking that here.

          Andrew Bayer added a comment -

          Ok, figured it out - getUrlName needs to be unique for each Action. So I have to figure out a reliable and useful way to get a unique string for each BuildData and each GitTagAction, and I need to determine a way to distinguish the display names as well...if scmName is specified, that gets done already for BuildData, but if it isn't, there's no distinction, and GitTagAction switches its display name based on whether tagging has been done already.
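
A minimal sketch of the uniqueness requirement described above, under stated assumptions. The `Action` interface here is a local stand-in that only mirrors the three method names of Jenkins' `hudson.model.Action` so the example compiles on its own. `RepoBuildData`, `UrlNameDemo`, the icon file name, and the index-based `git-N` scheme are hypothetical illustrations, not the git plugin's code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in for hudson.model.Action (assumption: simplified for a self-contained demo).
interface Action {
    String getIconFileName();
    String getDisplayName();
    String getUrlName();
}

// Hypothetical per-repository action; NOT the plugin's BuildData class.
class RepoBuildData implements Action {
    private final String remoteUrl;
    private final int index; // 1-based position among the build's git actions

    RepoBuildData(String remoteUrl, int index) {
        this.remoteUrl = remoteUrl;
        this.index = index;
    }

    @Override
    public String getIconFileName() {
        return "git.png"; // placeholder icon name, assumed for illustration
    }

    @Override
    public String getDisplayName() {
        return "Git Build Data (" + remoteUrl + ")";
    }

    // The first action keeps the plain "git" segment; later ones get a numeric suffix
    // so that each sidebar link resolves to a different URL.
    @Override
    public String getUrlName() {
        return index <= 1 ? "git" : "git-" + index;
    }
}

class UrlNameDemo {
    // Builds one action per checked-out repository, numbering them from 1.
    static List<Action> actionsFor(List<String> remoteUrls) {
        List<Action> actions = new ArrayList<>();
        for (int i = 0; i < remoteUrls.size(); i++) {
            actions.add(new RepoBuildData(remoteUrls.get(i), i + 1));
        }
        return actions;
    }

    // Returns true when no two actions share a URL name.
    static boolean urlNamesAreUnique(List<? extends Action> actions) {
        Set<String> seen = new HashSet<>();
        for (Action a : actions) {
            if (!seen.add(a.getUrlName())) {
                return false;
            }
        }
        return true;
    }
}
```

An index is only one possible source of a unique string; any value that differs per action on the same build would serve the same purpose.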

          Jesse Glick added a comment -

          scmName is a crappy workaround. Use GitSCM.getKey() where necessary to determine whether two configurations are the same. Or if you have access to BuildData then you have a simpler and more reliable piece of information: the actual SHA-1 of the commit.

          Andrew Bayer added a comment -

          Annoyingly, BuildData can have multiple sha1s, thanks to the dementedness of the git plugin. =) And yeah, I didn't like the idea of scmName, just was observing that it was already being used. I can easily distinguish between the BuildData, but I don't know what to use to distinguish them as URLs (my first tests were using hashCode(), which works fine but is not exactly aesthetically pleasing in the URL) and I really have no idea how to distinguish the display names without it getting ridiculously long. I could use toString stuff from BuildData and Build, but...well, repo URL(s), branch(es) and hash(es) gets pretty long pretty quick, probably too long for the sidebar link display name.

          Andrew Bayer added a comment -

          So - I think hashCode() for the URL actually does make sense. It's more or less guaranteed to be unique here - I'd like to use the sha1, but then I have to put in some weird logic to deal with cases where there are multiple remote repos pulling into one local repo and merging and things just get odd.

          For the display name (both on the BuildData and GitTagAction), I'm thinking something like this logic:

          • If scmName is defined, use that. Nice and simple.
          • Otherwise, if there's only one remote repository/branch spec combo, use the repo name (i.e., just the whatever.git, not the full URL) and branch spec (everything after the first slash, that is, the branch name or pattern excluding the remote).
          • Lastly, if there are multiple remote repositories, forget it, we just punt and put a "#1", "#2", whatever there. In cases where there could be multiple BuildData added, it'll be really, really, really rare that there are multiple remote repositories in a single BuildData, I think, so I'm willing to compromise on the aesthetics in those edge cases in order to get a win on the other far more common cases.

          Actually, maybe don't bother for GitTagAction and instead change the text to say "Tag the above" or something along those lines, since it's always going to be the next Action after its associated BuildData, which will present more information. And I'm leaning towards rewording "Git Build Data" to something shorter if possible, so that the additional text we'll be throwing on the display text has some room before the word wrapping/line count goes bonkers.

          Thoughts?
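
The three-step display-name fallback proposed in this comment can be sketched as follows. This is an illustration only: `RemoteSpec`, `GitDisplayNames`, the `label` method, and the `repo (branch)` output format are all hypothetical names and choices, not code from the git plugin.

```java
import java.util.List;

// Hypothetical pairing of one remote URL with its branch spec, e.g. "origin/master".
class RemoteSpec {
    final String url;
    final String branchSpec;

    RemoteSpec(String url, String branchSpec) {
        this.url = url;
        this.branchSpec = branchSpec;
    }
}

class GitDisplayNames {
    // "git@github.com:jenkinsci/git-plugin.git" -> "git-plugin.git"
    static String repoName(String url) {
        int cut = Math.max(url.lastIndexOf('/'), url.lastIndexOf(':'));
        return url.substring(cut + 1);
    }

    // Everything after the first slash: "origin/master" -> "master".
    // A spec without a slash is returned unchanged.
    static String branchName(String branchSpec) {
        int slash = branchSpec.indexOf('/');
        return slash < 0 ? branchSpec : branchSpec.substring(slash + 1);
    }

    // index is the 1-based position of this action on the build page.
    static String label(String scmName, List<RemoteSpec> remotes, int index) {
        // Rule 1: a user-supplied SCM name wins.
        if (scmName != null && !scmName.trim().isEmpty()) {
            return scmName;
        }
        // Rule 2: a single remote/branch combo yields "repo (branch)".
        if (remotes.size() == 1) {
            RemoteSpec r = remotes.get(0);
            return repoName(r.url) + " (" + branchName(r.branchSpec) + ")";
        }
        // Rule 3: several remotes in one action, so fall back to a number.
        return "#" + index;
    }
}
```

For a single remote `git@github.com:jenkinsci/git-plugin.git` with branch spec `origin/master` and no SCM name, this sketch yields `git-plugin.git (master)`, which stays short enough for a sidebar link.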

          Martin d'Anjou added a comment -

          I do not know what scmName comes out as; for example, does it say just "git" or "svn"? In that case I agree that it is not sufficient, especially when you see two SHA1s on the build page (one from the "Workflow from SCM", and the other from the actual project being built).

          Showing the repo name works for me; at least this way my users will be able to tell which is which between the "Workflow from SCM" repo and their project repo. Hopefully this case does not fall into the third scenario (numbering them without showing a name). Also, I would never want my users to be able to tag the "Workflow from SCM" repository; they should only be allowed to tag their project.

          Andrew Bayer added a comment -

          In the Additional Behaviors section of the git plugin config, you can specify a "Custom SCM Name" - this is particularly useful historically when using the Multiple SCMs plugin, e.g., in part because it makes the UI a bit nicer as things are now. So while most cases aren't going to have this set, I'd still like to use that as the first possibility when it is set, since it gives more user control, etc.

          The third scenario is for Weird Git Usage Cases - i.e., when in the normal Git SCM UI, you're specifying multiple remote repositories but not multiple clones...it's a weird feature, IMO, but it is there, so I need to factor it in. Pretty much every Workflow scenario wouldn't fit that - each "git" Workflow DSL step call would result in a separate BuildData with a single remote repo/repo name, so falling into the second scenario.

          I'm...not sure how to deal with the conditional tagging rights thing. That's an interesting question, but not an easy answer. I don't see a smooth way to do that.

          Martin d'Anjou added a comment -

          Thanks for the clarifications regarding scmName; it makes sense to me to use it when it is set. Since the user sets it, the user can control what shows up on the build page.

          Regarding the fact that the "Workflow from SCM" also produces links on the build page, could those specific links be made visible only to administrators? At a minimum, a link like "Tag my repo", which can change the "Workflow from SCM" repo, should not be visible to an ordinary user. In any case, I think this is a different problem from the one this Jira intended to solve.

          Martin d'Anjou added a comment -

          Since Git Plugin 2.4.1 the scmName is shown when set, but Workflow does not expose the Additional Behaviors section in the code snippet, so I do not know how to configure the Custom SCM name with the git SCM step.

          For example:

          // Workflow code
          node ('remote') {
              git url: 'https://github.com/allegro/axion-release-plugin.git',
                  <how do I set the custom scm name?>: 'axion-release-plugin'
          }

          There are multiple little aspects to this problem.

          Jesse Glick added a comment -

          Additional Behaviors and other features can be selected when using the checkout step, not the deliberately simplistic git step.

          James Dumay added a comment -

          Did this fix get anywhere, abayer?

          Andrew Bayer added a comment -

          Not really. I think we probably need an epic for reworking the backend for a bunch of APIs around reporting checkouts - there's this, there's dupes, there's shared libraries always showing up, etc...

          James Nord added a comment -

          I would consider it a bug if I did not see the version/changes of the shared library. After all, a shared library can change how your build happens, and I would not like to see a build failure caused by a new library while the build shows no changes. At the same time, as a lib owner, I would want to be able to check that all projects had built correctly using the latest lib.

          Jesse Glick added a comment -

          teilo I think you meant to comment on JENKINS-41497. Last I looked at it, the proposal was to make this configurable, whether that be on the library or in the job.

          James Nord added a comment -

          jglick no, I was responding to abayer's comment.  JENKINS-41497 is about triggering a build when a shared library is updated as I understand the original problem report.

          Andrew Bayer added a comment -

          teilo - So JENKINS-41497 covers both triggering and listing the shared library's changes. It's the same mechanisms behind the scenes, as it turns out. So the shared libraries aspect of this is dealt with.

          jamesdumay - I believe this needs some thought and triage to see what other tickets are related to this - JENKINS-29326 may have popped up again, e.g.

            Assignee: Unassigned
            Reporter: James Nord (teilo)
            Votes: 26
            Watchers: 32