
"Git Build Data" should not appear more than once in the side menu for pipeline builds

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor
    • Component: git-plugin
    • Labels: None

      Currently, for pipeline builds, every time the pipeline runs on a new node and merges the branch you're building with master, you get an extra "Git Build Data" entry in the sidebar, even though you're building exactly the same thing.

      The individual screens show different revisions being built:


      But actually, these are the same PR branch, merged with the same master branch. In this situation, I would expect a single entry, and I would expect the linked page to list the branches which were merged, not to show the commit hash of the merge result. The merge result is literally impossible to access after the build has completed anyway, so there is no point having this information at all.

       

          [JENKINS-52926] "Git Build Data" should not appear more than once in the side menu for pipeline builds

          trejkaz added a comment -

          Ah, there are even more problems with stash, so it's definitely unusable in its current state. For instance, if you stash from an Ubuntu slave and then unstash from a macOS slave, you wind up with the build in a weird state where some of the up-to-date checks don't work, because Gradle is storing absolute paths.

          It sounds like this has been separately reported to Gradle as well, but they didn't seem to understand the problem.

          Kevin Bruer added a comment -

          Also seeing this issue; fewer duplicates in our setup, but just as confusing to users.

          Mark Waite added a comment -

          kbruer are you also performing the merge yourself inside the pipeline step as trejkaz is, or are you using the multibranch pipeline facilities that perform the merge step for you?

          Multiple Git Build Data entries also appear when using a pipeline shared library. Each pipeline shared library will cause another entry to be added.

          user77 added a comment - edited

          We are facing a similar issue, where we use a pipeline job with a Jenkins shared library.
          For each and every build we get Git Build Data related to the Jenkins shared library. Is there any solution for this?

          Mark Waite added a comment -

          If you're seeing duplicate git build data in each build, you may be able to resolve it with the script from https://plugins.jenkins.io/git/#remove-git-plugin-buildsbybranch-builddata-script .

          If all that you're seeing is one entry for the Pipeline shared library build data and one entry for the primary repository build data, then there is no solution for that. The Pipeline job is the combination of the pipeline shared library and the primary repository. I think it would be a mistake to hide the Pipeline shared library information, since it can have a significant impact on the build.

          user77 added a comment -

          Thanks markewaite for your answer.

          As you said:

          "If you're seeing duplicate git build data in each build, you may be able to resolve it with the script from https://plugins.jenkins.io/git/#remove-git-plugin-buildsbybranch-builddata-script ."

          Where is this script supposed to go, and how do I use it? Can you please guide me further on this?

          Mark Waite added a comment - edited

          Thanks for asking the clarifying question. Your question highlights the weakness in that section of the git plugin documentation. Here is the text that should precede that block of code in the documentation. Can you let me know if that makes it clearer?

          The git plugin has an issue (JENKINS-19022) that sometimes causes excessive memory and disk use in the build history of a job. The problem occurs because in some cases the git plugin copies the git build data from previous builds to the most recent build, even though the git build data from the previous build is not used in the most recent build. The issue can be especially challenging when a job retains a very large number of historical builds or when a job builds a wide range of commits during its history.

          Multiple attempts to resolve the core issue without breaking compatibility have been unsuccessful. A workaround is provided below that will remove the git build data from the build records. The workaround is a system Groovy script that needs to be run from the Jenkins Administrator's Script Console (as in https://jenkins.example.com/script ). Administrator permission is required to run system Groovy scripts.

          I've submitted that text to the git plugin documentation as https://github.com/jenkinsci/git-plugin/pull/1103
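
Before running any cleanup from the Script Console, it may help to see which builds actually carry more than one Git Build Data entry. The following is a minimal read-only sketch, not the script from the git plugin documentation. It assumes a Jenkins version recent enough to provide `Jenkins.get()`, and the "more than one entry" threshold and the output format are illustrative choices.

```groovy
import hudson.model.Job
import hudson.plugins.git.util.BuildData
import jenkins.model.Jenkins

// Read-only report: nothing is modified or saved.
// Lists every build that has more than one Git Build Data action attached.
Jenkins.get().getAllItems(Job).each { job ->
    job.builds.each { build ->
        def actions = build.getActions(BuildData)
        if (actions.size() > 1) {
            println "${job.fullName} #${build.number}: ${actions.size()} Git Build Data entries"
            actions.each { data ->
                // remoteUrls and lastBuiltRevision are properties of BuildData
                println "  remotes=${data.remoteUrls} lastBuilt=${data.lastBuiltRevision?.sha1String}"
            }
        }
    }
}
return null
```

Like any system Groovy script, this needs Administrator permission, and on a controller with a long build history it may take a while because every build record is loaded.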

          Jan Gałda added a comment -

          Hi,

          I'm also interested in adding this feature.

          My pipeline lives in a test repository and runs the build on N nodes in parallel. On each node I check out the product repository (not the one containing the Jenkinsfile) using the checkout step with the PreBuildMerge extension, and then run the tests. As a result I get N Build Data entries, each with a different SHA.

          This is problematic because:

          1. There are N Build Data entries.
          2. I cannot easily check which revision has been tested, because the SHAs do not exist in the product repository.

          I know that I could check out the repo once, stash it, and unstash it on the nodes, but that is not a user-friendly solution.
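
For readers unfamiliar with the stash workaround mentioned here, a rough Scripted Pipeline sketch follows. The repository URL, branch names, stash name, and agent labels are all placeholders rather than values from this issue, and whether a single checkout yields a single Build Data entry in a given setup is an assumption to verify.

```groovy
// Scripted Pipeline sketch; all names and URLs below are placeholders.
node {
    // Check out and merge once, instead of once per test node.
    checkout([
        $class: 'GitSCM',
        branches: [[name: 'origin/my-feature']],
        userRemoteConfigs: [[url: 'https://git.example.com/product.git']],
        extensions: [[$class: 'PreBuildMerge',
                      options: [mergeRemote: 'origin', mergeTarget: 'main']]]
    ])
    // Save the merged workspace so other nodes can reuse it.
    stash name: 'merged-source'
}

def branches = [:]
['linux', 'macos'].each { label ->
    branches[label] = {
        node(label) {
            // Restore the merged workspace; no git checkout happens here.
            unstash 'merged-source'
            // run the tests for this platform here
        }
    }
}
parallel branches
```

Stashing copies the workspace through the controller, which can be slow for large trees, and the cross-platform path problems described earlier in this thread apply to this approach.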

           

          I can imagine that there are projects which check out their repository and then push the merged commits. In that case having the SHA of the merge commit in Build Data makes sense, but in my case this information is useless.

           

          So, my proposal is to add a new option to the PreBuildMerge extension, something like buildData:

           

          [$class: 'PreBuildMerge', options: [buildData: 'source', mergeRemote: 'origin', mergeTarget: "main"]]

           

          If buildData is set to:

          • 'merged' (default) - Build Data contains sha of merged commit - current implementation, backward compatible
          • 'source' - Build Data contains sha of PR branch

          If I understand correctly how the git plugin works, in the second case there should be only one entry in my setup, because the plugin detects that the entries are duplicates.
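
To show where the proposed option would sit, here is a sketch of a full checkout step using the PreBuildMerge extension. The repository URL and branch names are placeholders. The `buildData` key is only the proposal from this comment and is not an existing git plugin option, so it is left commented out.

```groovy
// Scripted Pipeline sketch; URL and branch names are placeholders.
node {
    checkout([
        $class: 'GitSCM',
        branches: [[name: 'origin/my-feature']],
        userRemoteConfigs: [[url: 'https://git.example.com/product.git']],
        extensions: [[
            $class: 'PreBuildMerge',
            options: [
                mergeRemote: 'origin',
                mergeTarget: 'main'
                // , buildData: 'source'   // PROPOSED in this comment; not implemented in the git plugin
            ]
        ]]
    ])
}
```

With today's options this step records the SHA of the locally created merge commit, which is why each parallel node ends up with its own Build Data entry.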

           

          What do you think about such a solution?

          Mark Waite added a comment -

          jgalda I think your proposal sounds very interesting. It would preserve compatibility and still allow the reduction of BuildData for users that do not want the duplication.

          Carl Verbiest added a comment -

          In my use case, I actually extract multiple repos, but I'd like the git build data to be under 1 git build data item instead of a number of entries.

          The multiple entries prevent other sidebar items, such as pipeline console, from fitting on the screen.

           

          items that require to scroll down in the sidebar

            Assignee: Unassigned
            Reporter: trejkaz
            Votes: 6
            Watchers: 9