GIT Plugin (any version) heavily bloats memory use and size of build.xml with "BuildData" fields

      Hello everyone.

      Months ago, we noticed an issue with the GIT plug-in. It used to be only a minor nuisance, but now each build that we start uses up ~3 MB of main memory and ~5 MB of disk space in its build.xml.

      The issue is due to the following behaviour of the GIT plug-in:
      For every build that has the GIT SCM defined, it retrieves the list of branches in the remote repository. For each branch, it retrieves the last build in Jenkins that was run against this branch.

      This information is then stored in the Build object in the form of the "BuildData" field. This means that the full list of all branches, plus their last builds, is stored in each and every build, using up both main memory and disk space in the "build.xml" file allocated for the build.

      It uses this information to populate a page for the build with the association of branches to builds:
      http://<SERVER>/job/<JOBNAME>/<BUILD-ID>/git/?

      For normal repositories, this data is relatively small, as only a limited number of unmerged branches exist. Unfortunately, we use GIT in an automated manner, where thousands of tags and branches are spawned without merging back into the mainline.

      This means that each build saves several hundred to several thousand key-value pairs mapping GIT branches to Jenkins builds, none of which serve any purpose for us.
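
The storage pattern described above can be illustrated with a small model. This is a simplified sketch, not the plugin's code: `BuildDataModel`, `snapshotForBuild`, `entriesStored`, and the use of plain integers for builds are hypothetical stand-ins for the `BuildData` field and its branch-to-build map. It assumes every build stores a full copy of the map, as the report describes.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (hypothetical names): each build stores its own full copy
// of the "branch -> last build number" map.
class BuildDataModel {

    // Returns the map stored with a new build: a copy of the previous build's
    // map, updated for the branch that was just built. `previous` is not modified.
    static Map<String, Integer> snapshotForBuild(Map<String, Integer> previous,
                                                 String builtBranch,
                                                 int buildNumber) {
        Map<String, Integer> snapshot = new HashMap<String, Integer>(previous);
        snapshot.put(builtBranch, buildNumber);
        return snapshot;
    }

    // Total map entries stored across `builds` builds of a repository that
    // already has `knownBranches` branches recorded.
    static long entriesStored(int knownBranches, int builds) {
        if (knownBranches <= 0) {
            throw new IllegalArgumentException("knownBranches must be positive");
        }
        Map<String, Integer> previous = new HashMap<String, Integer>();
        for (int i = 0; i < knownBranches; i++) {
            previous.put("branch-" + i, 0); // 0 is a placeholder build number
        }
        long total = 0;
        for (int n = 1; n <= builds; n++) {
            previous = snapshotForBuild(previous, "branch-" + (n % knownBranches), n);
            total += previous.size(); // every build carries all known branches
        }
        return total;
    }

    public static void main(String[] args) {
        // 3000 known branches, 10 builds: each build stores all 3000 entries.
        System.out.println(entriesStored(3000, 10)); // prints 30000
    }
}
```

In this model the cost of one build depends on the number of branches ever recorded, not on what the build itself did.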

      In our case, this means – as outlined above – we waste 3MB of RAM per build and 5 MB of disk space. With 10k builds per day, you can imagine that this is quite a predicament.

      As a workaround, we've written a Jenkins job that removes the tags contained in "<hudson.plugins.git.util.BuildData>" in the "build.xml". This cuts its size from 5 MB down to 16 kB (~0.016 MB). This, of course, also greatly speeds up deserializing the builds from disk.
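
A minimal sketch of that kind of clean-up, for illustration only: it is not the reporter's job. It empties every `<hudson.plugins.git.util.BuildData>` element in an XML string using the JDK's DOM parser. The class name `BuildXmlStripper`, the method `stripBuildData`, and the sample document in `main` are assumptions; a real build.xml has far more content.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class BuildXmlStripper {
    static final String BUILD_DATA_TAG = "hudson.plugins.git.util.BuildData";

    // Removes all child nodes of every BuildData element and returns the
    // resulting XML. The BuildData element itself is kept, but empty.
    static String stripBuildData(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(BUILD_DATA_TAG);
            for (int i = 0; i < nodes.getLength(); i++) {
                Node buildData = nodes.item(i);
                while (buildData.hasChildNodes()) {
                    buildData.removeChild(buildData.getFirstChild());
                }
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not rewrite build.xml content", e);
        }
    }

    public static void main(String[] args) {
        // Made-up, heavily shortened sample; not a real build.xml.
        String sample = "<build><actions><" + BUILD_DATA_TAG + ">"
                + "<buildsByBranchName><entry><string>origin/feature-1</string></entry>"
                + "</buildsByBranchName></" + BUILD_DATA_TAG + "></actions>"
                + "<number>42</number></build>";
        System.out.println(stripBuildData(sample));
    }
}
```

As a later comment in this thread reports, removing this data can cause branches to be rebuilt in some setups, so a clean-up like this should be tried on a throwaway job first.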

      Our request would be: either remove the collection and (de)serialization of this (from our POV) pointless data, or make its generation optional via a configuration option.

      Best regards,
      Martin Schröder
      Intel Mobile Communications GmbH

          [JENKINS-19022] GIT Plugin (any version) heavily bloats memory use and size of build.xml with "BuildData" fields

          Jan Prach added a comment -

          This has quadratic memory complexity for modern workflows!

          ndeloof any progress fixing this upstream? Can I help?

          Thank you for the workaround scoheb!

          This is very bad. It renders Jenkins unusable. In our case, jenkins lazy.BuildReference took 95% of the heap, and 91% of the heap is actually git.util.BuildData. Those builds are held as soft references by default, which sounds good. The problem is that once the deserialized build.xml files take up more than the heap, Jenkins will deserialize gigabytes of data on every job view. It will not run out of memory, because it frees those soft references within a single request, but it can be extremely slow: it can read gigabytes of data on every request. That is not OK. It causes timeouts and gets progressively worse with every branch and/or every record in the job history.

          I would be surprised if some (many) people did not leave Jenkins because of this issue. For most people, it is not easy to tell why a page load takes 20 seconds or more.
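
The thrashing described in this comment can be illustrated with a toy cache. This is only an analogy: the sketch assumes a bounded least-recently-used cache as a stand-in for the soft-referenced builds, which is not how the JVM actually decides when to clear soft references. `BuildCacheModel`, `missesForViews`, and the capacities used are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BuildCacheModel {

    // Counts how many builds have to be re-read ("misses") when a job view
    // walks builds 1..buildCount once per view, and the cache can hold at
    // most `capacity` builds at a time.
    static int missesForViews(int buildCount, final int capacity, int views) {
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        Map<Integer, Boolean> cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity;
            }
        };
        int misses = 0;
        for (int view = 0; view < views; view++) {
            for (int build = 1; build <= buildCount; build++) {
                if (cache.get(build) == null) {
                    misses++;                 // would mean re-reading build.xml
                    cache.put(build, Boolean.TRUE);
                }
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        // Everything fits: only the first view reads from disk.
        System.out.println(missesForViews(100, 100, 10)); // prints 100
        // One build too many: every access in every view is a miss.
        System.out.println(missesForViews(100, 99, 10));  // prints 1000
    }
}
```

In this model, the step from "fits" to "does not quite fit" turns a one-time load into a full reload on every view, which matches the sudden slowdown the comment describes.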

          Jan Prach added a comment -

          We have run the script. I was suspicious about whether keeping only getLastBuiltRevision() is the right thing to do. It is not, at least not in combination with the MultiJob plugin. It triggered a rebuild of all the branches: thousands of builds. Luckily, I ran it on a 30-second sanity job. If I had run it on all our jobs, it would have triggered about 20k builds of 30 minutes each.

          The only safe option is to remove the git branches and keep the build history short. The script is not good enough.

          Kevin Phillips added a comment - - edited

          Also, as a corollary to the previous comment, when using the MultiJob plugin the problem being discussed here is further exacerbated because a copy of the Git BuildData node is being stored for each child job (recursively) of the parent multi-job. This results in very large log files being generated (i.e.: 5-10MB or more) in cases when there are many Git branches being reported and many sub-jobs managed by a single multijob. In our test cases approximately 90% of that data is consumed by the Git BuildData sections.

          Also, in cases when a multijob has any more than 20 or 30 builds in the history the Jenkins master can take minutes to load the parent job on the dashboard, and in some extreme cases (i.e.: with hundreds or thousands of builds in the history) it can actually cause the entire dashboard to become unresponsive and time out.

          This defect has severe consequences and, imo, must be fixed sooner rather than later ... and with an update to the Git plugin, not with a hack or workaround like a script that gets run out-of-process.

          Laine Walker-Avina added a comment - - edited

          +1 to what Kevin said. We have a MultiJob that is three levels deep with other MultiJobs and MatrixJobs, and new runs typically have 4-5 MB build.xml files, most of that being Git BuildData. We're using the StashBuildTrigger plugin, so wiping the build data is OK for us, and we saw the disk usage go from 3 GB to 187 MB after purging the Git BuildData with a modified version of the script above.
          As a workaround, we have to run the Jenkins master on a server with 32 GB of RAM and hope that everything fits into the page cache or heap.

          import hudson.matrix.*
          import hudson.model.*
          import com.tikal.jenkins.plugins.multijob.*
          
          hudsonInstance = hudson.model.Hudson.instance
          allItems = hudsonInstance.getAllItems(AbstractProject.class);
          
          // Iterate over all jobs and find the ones that have a hudson.plugins.git.util.BuildData
          // as an action.
          //
          // We then clean it by removing the useless array action.buildsByBranchName
          //
          def numJobs = 0;
          def runcounter = 0;
          def cleanGit;
          cleanGit = { build ->
            gitActions = build.getActions(hudson.plugins.git.util.BuildData.class);
            if (gitActions != null) {
              for (action in gitActions) {
                action.buildsByBranchName = new HashMap<String, Build>();
                hudson.plugins.git.Revision r = action.getLastBuiltRevision();
                if (r != null) {
                  for (branch in r.getBranches()) {
                   action.buildsByBranchName.put(branch.getName(), action.lastBuild)
                 }
                }
                build.actions.remove(action)
                build.actions.add(action)
                build.save();
                runcounter++;
              }
            }
          };
          
          for (job in allItems) {
            numJobs++;
            def counter = 0;
            for (locbuild in job.getBuilds()) {
              // It is possible for a build to have multiple BuildData actions
              // since we can use the Multiple SCMs plugin.
              def gitActions = locbuild.getActions(hudson.plugins.git.util.BuildData.class)
              if (gitActions != null) {
                for (action in gitActions) {
                  counter++;
                }
              }
              if (job instanceof MatrixProject) {
                runcounter = 0;
                for (run in locbuild.getRuns()) {
                  cleanGit(run);
                }
                if (runcounter > 0) {
                  println(" -->> cleaned: " + runcounter + " runs");
                }
              }
          
              if (job instanceof MultiJobProject) {
                runcounter = 0;
                cleanGit(locbuild);
                def recurseSubBuild;
                recurseSubBuild = { sb ->
                    for(bld in sb)
                    {
                      if(bld.build != null)
                      {
                        cleanGit(bld.build);
                        if(bld.build instanceof MultiJobBuild)
                        {
                          if(bld.build.getSubBuilds().size() != 0) {
                            recurseSubBuild(bld.build.getSubBuilds());
                          }
                        }
                      }
                    }
                };
                recurseSubBuild(locbuild.getSubBuilds());
                println("***************");
                if (runcounter > 0) {
                  println(" -->> cleaned: " + runcounter + " runs");
                }
              }
              if (counter > 0) {
                println("-- cleaned: " + counter + " builds");
              }
            }
          }
          

          Kevin Phillips added a comment -

          I have successfully tested a variation of the groovy scripts mentioned previously to remove the 90% bloat in some of our build logs. One modification I did make was to only process builds other than the last build. This appears to prevent the premature build triggers mentioned earlier since the plugin appears to cache all of the relevant build history for every branch built previously in the last build. So by excluding that one build from the script we prevent jobs from rebuilding every branch immediately afterwards. I believe this solution is preferable to preserving the last built revision being done above.
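
The selection rule in this variation (clean every build except the last one) can be sketched on its own. This is an illustration, not the script that was tested: `CleanupSelection` and `buildsToClean` are hypothetical names, and "last build" is assumed here to mean the highest build number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CleanupSelection {

    // Returns the build numbers that may be cleaned: all of them except the
    // highest one, which keeps its branch history intact.
    static List<Integer> buildsToClean(List<Integer> buildNumbers) {
        List<Integer> result = new ArrayList<Integer>();
        if (buildNumbers.isEmpty()) {
            return result;
        }
        int last = Collections.max(buildNumbers);
        for (int number : buildNumbers) {
            if (number != last) {
                result.add(number);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(buildsToClean(Arrays.asList(101, 102, 103, 104)));
        // prints [101, 102, 103]
    }
}
```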

          Kevin Phillips added a comment -

          Question: does anyone know what other plugins / behaviours might be affected by the purging of this Git build data from the build logs? I am trying to figure out what other subsystems might be affected so I can better test the impact of running this script before rolling it out into a production environment.

          Arnaud Héritier added a comment -

          Was fixed by ndeloof in Git plugin >= 2.4.0
          https://github.com/jenkinsci/git-plugin/pull/312

          Jesse Glick added a comment -

          No, the fix was reverted.

          Arnaud Héritier added a comment -

          jglick ok. Thx

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path: src/main/java/hudson/plugins/git/GitSCM.java
          http://jenkins-ci.org/commit/git-plugin/5bb7eed6a9231af15e7c0f5a964f3044381a979d
          Log:
          JENKINS-19022 Print a warning to the build log when the job seems to be in trouble due to buildsByBranchName bloat.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Mark Waite
          Path: src/main/java/hudson/plugins/git/GitSCM.java
          http://jenkins-ci.org/commit/git-plugin/cae9fb61ffcd2144278aec2255cef897cca569d0
          Log:
          Merge pull request #472 from jglick/buildsByBranchName-JENKINS-19022

          JENKINS-19022 Print a warning to the build log when the job seems to be in trouble due to buildsByBranchName bloat

          Compare: https://github.com/jenkinsci/git-plugin/compare/16e366e146c6...cae9fb61ffcd

          Mark Waite added a comment -

          A warning is reported to the log by git plugin 3.1.0 released 4 Mar 2017.

          Lucie Votypkova added a comment - - edited

          If I understand this issue correctly, wouldn't it bring some relief if the BuildData history were kept only per job (not per build, since 99 percent of the content is the same), with items only for existing builds? Is there any reason to keep information about a branch that was built in a build which no longer exists?

          Mark Waite added a comment -

          lvotypkova as far as I know, BuildData history is kept per build. Attempts to switch it to something else have failed in the past. No recent attempt has been made to resolve this. You're welcome (and encouraged) to make that switch, support it with tests, and submit it as a pull request for review.

          Lucie Votypkova added a comment -

          Thank you for the response, I will look at it.

          Jesse Glick added a comment -

          For sanely arranged jobs, which just build a branch and do not do anything weird, BuildData is indeed useless. But it supports the weird 5% and previous attempts have indeed bombed badly, not to mention incompatibilities in the plugin API.

          It is kept per build even though it is mostly redundant. Hence the issue here.

          Tristan Zhang added a comment -

          Hello,

          The log rotation strategy "Discard old builds" seems to remove only the directories of the discarded build numbers, while entries for those discarded build numbers still remain in each newly created build.xml.

          If newly created build.xml files dropped the entries for discarded build numbers, it might relieve and limit the growth in main memory and disk space.
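
The pruning idea suggested here can be sketched as a simple filter. This is a hypothetical illustration, not plugin behaviour: `DiscardedBuildPruner` and `pruneDiscarded` are made-up names, and builds are modelled as plain integers. Whether the plugin tolerates missing entries is not established in this thread; an earlier comment reports that branches were rebuilt after entries had been removed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class DiscardedBuildPruner {

    // Returns a new map containing only the branches whose recorded build
    // still exists; entries that point at discarded builds are dropped.
    static Map<String, Integer> pruneDiscarded(Map<String, Integer> lastBuildByBranch,
                                               Set<Integer> existingBuilds) {
        Map<String, Integer> kept = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> entry : lastBuildByBranch.entrySet()) {
            if (existingBuilds.contains(entry.getValue())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> lastBuildByBranch = new TreeMap<String, Integer>();
        lastBuildByBranch.put("a", 1); // build 1 was discarded by log rotation
        lastBuildByBranch.put("b", 5);
        lastBuildByBranch.put("c", 9);
        Set<Integer> existingBuilds = new HashSet<Integer>(Arrays.asList(5, 9));
        System.out.println(pruneDiscarded(lastBuildByBranch, existingBuilds));
        // prints {b=5, c=9}
    }
}
```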

          Jacob Keller added a comment - - edited

          Mark Waite wrote: "lvotypkova as far as I know, BuildData history is kept per build. Attempts to switch it to something else have failed in the past. No recent attempt has been made to resolve this. You're welcome to (and encouraged) to make that switch, support it with tests, and submit it as a pull request for review"

          I assume this isn't a GitPlugin specific issue and is related to a general SCM issue?

          Mark Waite added a comment -

          jekeller as far as I know, it is a git plugin specific issue. There may be external dependencies which make it harder to extricate from the git plugin, but I've only seen the issue mentioned in context of the git plugin, not any other SCM plugin.

          Jacob Keller added a comment -

          Mark Waite wrote: "jekeller as far as I know, it is a git plugin specific issue. There may be external dependencies which make it harder to extricate from the git plugin, but I've only seen the issue mentioned in context of the git plugin, not any other SCM plugin."

          Yea, I did some more digging, and it's definitely git specific. I'm not 100% sure how to fix it, but I have a possible mitigation, of adding a new extension to disable tracking beyond the "current build" for those that do not wish to use a groovy script based mitigation technique described elsewhere. I'll have a pull request Soon(TM)

          Actually extricating this out separately is difficult, and I'm not sure it gains us that much since it would still have to write the contents to disk each time we add a new branch, which is every build when using gerrit as the build source.

          Jakub Bochenski added a comment -

          I'm getting cryptic serialization errors, apparently related to Blueocean plugin when trying to run the workaround script. It would be nice if someone knowing about serialization and/or blueocean could check https://issues.jenkins-ci.org/browse/JENKINS-48941

          Jesse Glick added a comment -

          This is definitely a specific problem of the Git plugin, which in the interest of trying to be all things to all people has historically supported rather dubious use cases involving complicated refspecs. For projects which simply build a single Git branch with no tricks—including branch project beneath a multibranch folder—BuildData is completely unnecessary. But some downstream plugins rely on it to support more exotic stuff, including the pre-multibranch system of processing pull requests as builds of a single job.

          Jacob Keller added a comment -

          Yea, it's sort of an unfortunate side effect. I wonder what effort it would take to deprecate, and then remove, the support over some duration?

          Even with solutions which migrate build data out of the per-job storage, you still have problems because you still store a huge hash for each completed branch, at least not storing it once per build, but that still is somewhat expensive.

          I do have a solution based on using an extension to simply tell the git plugin this job does not need proper build data (and it will thus only store the current build's branches, rather than all the history). That should help enterprising administrators who could enable such an extension. Unfortunately that still leaves such configuration up to "did this administrator understand and become aware of the problem". This is not really the ideal solution, since it basically requires every job creator to decide up front whether they need this build data or not. I would much rather it be opt-in, but that has its problems of backwards compatibility.
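
The effect of such a switch can be sketched with a small model. This is not the code from the pull request: `BranchHistorySwitch`, `mapForBuild`, and the `keepHistory` flag are hypothetical, builds are modelled as integers, and every build is assumed to build a branch that has not been seen before (as with the automated workflows described in this issue).

```java
import java.util.HashMap;
import java.util.Map;

class BranchHistorySwitch {

    // Builds the map stored with the current build.
    // keepHistory = true:  carry over every branch recorded so far.
    // keepHistory = false: store only the branch built by this build.
    static Map<String, Integer> mapForBuild(Map<String, Integer> previous,
                                            String branch,
                                            int buildNumber,
                                            boolean keepHistory) {
        Map<String, Integer> current = new HashMap<String, Integer>();
        if (keepHistory) {
            current.putAll(previous);
        }
        current.put(branch, buildNumber);
        return current;
    }

    // Total entries stored across `builds` builds, each on a new branch.
    static long totalEntries(int builds, boolean keepHistory) {
        Map<String, Integer> previous = new HashMap<String, Integer>();
        long total = 0;
        for (int n = 1; n <= builds; n++) {
            previous = mapForBuild(previous, "branch-" + n, n, keepHistory);
            total += previous.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalEntries(1000, true));  // prints 500500
        System.out.println(totalEntries(1000, false)); // prints 1000
    }
}
```

With history kept, the total grows as n(n+1)/2 for n builds, which is the quadratic growth mentioned earlier in the thread; with only the current build's branch stored, it grows linearly.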

          Jacob Keller added a comment -

          It's not a perfect solution, but I added an extension which prevents the builddata from maintaining more than that specific build's branchname mapping. See https://github.com/jenkinsci/git-plugin/pull/568

          It's implemented as an extension, so users would have to enable the extension, but once done for a given SCM instance, they never have to clean the job data using a groovy scriptlet, so it's a better solution.

          I do not have a real solution for what to do or whether we should work towards deprecating and removing the branch history.

          Benoit Sigoure added a comment -

          There is a bit of movement on the pull request on GitHub, which gives me hope, but for those stuck with the groovy script to do periodic cleanups, I've measured that the cost is about 65ms per build on a typical dedicated server with 2 spinning drives in a RAID1. This means that the periodic cleanups quickly get increasingly expensive once thousands of builds start to accumulate.

          Mark Waite added a comment -

          tsuna have you tried the build from the pull request in your environment? It has been running well for jekeller and would be good to have additional users report their results.

          Jacob Keller added a comment -

          Yes, please. Additional testing would be a huge benefit.

          You can even use the one compiled by the pull request build tester found at https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fgit-plugin/detail/PR-578/11/artifacts

          Specifically https://ci.jenkins.io/job/Plugins/job/git-plugin/job/PR-578/11/artifact/target/git.hpi

          Benoit Sigoure added a comment -

          Thanks for the link. I've installed the plugin [version: 3.9.0-SNAPSHOT (private-69892a21-jenkins)] and disabled the daily job that runs the groovy script. Anything special I should look out for? The bloat is going to take time to manifest itself again, as it usually requires thousands of build.xml files to accumulate.

          Mark Waite added a comment -

          The biggest concern (for me) is that you enable the extension and watch for regressions in other areas of the git plugin. Does change history still display as expected? Are builds triggered as expected? Are there places where you detect the absence of the build data?

          Jacob Keller added a comment -

          Make sure that commits aren't rebuilt when they've already been covered by a build. Make sure that job start time isn't significantly worse.

          Benoit Sigoure added a comment -

          Things have been working fine for me, although we recently inadvertently upgraded to 3.9.1 and lost the fix in the process. Is there a newer version of this fix, based on 3.9.1, that we could try? Any idea on the approximate timeline to merge this fix?

          Mark Waite added a comment -

          tsuna I've made my attempt to resolve a merge conflict. The resulting artifacts should include a git.hpi that you can use.

          Benoit Sigoure added a comment -

          Looks like a mistake was made in the initial rebase and the latest attempt didn't pass the build?

          Mark Waite added a comment -

          tsuna unlikely that there was any mistake in the rebase. The build definitely ran but there appears to have been an infrastructure error on ci.jenkins.io. I've made a trivial change to one of the files in the pull request. The build has started again.

          The "resulting artifacts" link referenced in my previous comment still includes a git.hpi which you could use for testing.

          Benoit Sigoure added a comment -

          Sorry, I was going by the comment on GitHub that the wrong side of the merge had been initially picked to resolve a conflict. I'll try the new artifact produced by that latest build, thanks!

          Benoit Sigoure added a comment - - edited

          Hmm, the new plugin won't load because:

          SEVERE: Failed Loading plugin Jenkins Git plugin v4.0.0-rc1685.659d6dcce0e8 (git)
          java.io.IOException: Jenkins Git plugin v4.0.0-rc1685.659d6dcce0e8 failed to load.
           - Jenkins Git client plugin v2.7.2 is older than required. To fix, install v3.0.0-beta3 or later.
                  at hudson.PluginWrapper.resolvePluginDependencies(PluginWrapper.java:655)
                  at hudson.PluginManager$2$1$1.run(PluginManager.java:515)
                  at org.jvnet.hudson.reactor.TaskGraphBuilder$TaskImpl.run(TaskGraphBuilder.java:169)
                  at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
                  at jenkins.model.Jenkins$5.runTask(Jenkins.java:1068)
                  at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
                  at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                  at java.lang.Thread.run(Thread.java:748) 

          I downloaded the latest .hpi from here.  I hope that's all that was needed

          Mark Waite added a comment -

          It would be enough to install git client plugin 3.0.0-beta3, the latest git client plugin from the experimental update center, but you're also welcome to use one from the CI build. That is slightly newer than 3.0.0-beta3 and should work just as well as the CI build you used.

          Mark Waite added a comment -

          I have completed my testing of the pull request from jekeller. I need to write some documentation of the change for the release notes, then it will be ready to merge.

          A week or less after that merge is completed, I intend to deliver the git client plugin 3.0.0-rc and the git plugin 4.0.0-rc builds.

          Stefan Hengelein added a comment -

          >> leedega added a comment - 2016-08-03 17:07
          > Question: does anyone know what other plugins / behaviours might be affected by the purging of this Git build data from the build logs? I am trying to figure out what other subsystems might be affected so I can better test the impact of running this script before rolling it out into a production environment.

          This "fix" broke SCM polling for our use case:

          • We use a Git SCM build trigger with a pattern, so all branches that match "FOOBAR-*" are built. With this change, our Jenkins now tries to rebuild ALL OF OUR BRANCHES from past years. As you can imagine, that is quite the joke and would take weeks until all of our builds had run through. Do you have a solution for this? I'm not so keen on having to stay on the old plugin versions forever.
          • Probably not related, but another issue: "ERROR: [GitHub Commit Status Setter] - Cannot retrieve Git metadata for the build, setting build result to FAILURE"

          Jesse Glick added a comment -

          > does anyone know what other plugins / behaviours might be affected

          I would start with this, though I would not expect that to be an exhaustive list of affected functionality.

          Stefan Hengelein added a comment -

          jglick What? I've just cited what leedega wrote and later expanded on the issues we're seeing.

          I'm looking for a strategy to migrate to the new version of the plugin that doesn't involve our CI running for weeks to catch up on the mess that was created here.

          Connor Tumbleson added a comment -

          Tagging onto this ticket, because something that ilendir posted here caught my eye:

          > With this change now our jenkins tries to rebuild ALL OF OUR BRANCHES from the last years.

          With the upgrade to (git-plugin) 4.0.0-rc, all our branches (including years-old ones) are being rebuilt after a commit to the project. This is blowing up reports, status reports, Slack messages and more. As you can guess, years of builds rerunning is creating quite the panic. I'm not even sure the issue will be resolved once this backlog of builds is done, because it might take days to get there.

          What can I do to help? My first instinct is to downgrade back to 3.9.2 (git-plugin), but I'm not sure if that will be worth it.

          Mark Waite added a comment -

          You should downgrade to git plugin 3.9.2 then install git plugin 3.9.3 which includes a fix for an agent / tool interaction issue that was first introduced in git plugin 3.9.2.

          The git plugin 4.0.0-rc is only a release candidate, not a production ready release. I made a mistake when I chose the version number. I assumed that the "-rc" suffix would deliver the plugin only from the experimental update center and not from the general update center. I was wrong. I apologize sincerely for my mistake.

          The issues detected with git plugin 4.0.0-rc indicate that it was not as close to release as my testing indicated. More testing and more fixes will be needed before git plugin 4.0.0 will be released for general availability.

          You can help assure that the problems you've found are fixed in git plugin 4.0.0-rc by providing a bug report which contains numbered steps that describe how someone else can see the same bug that you are seeing. The descriptions in the comments of this specific bug report seem to indicate at least two different bugs which are outside of this bug report and should be reported separately so that they can be tracked separately to resolution.

          Connor Tumbleson added a comment -

          Thanks markewaite - we reverted to 3.9.2, then re-upgraded to 3.9.3. Unfortunately our builds were in quite a messed up shape. New commits to a branch were triggering all branches that still lived on the remote. We tried deleting the repos from the filesystem of Jenkins, hoping a new clone would resolve that - it did not. For other googlers who stumble upon this page, we had to restore a server backup and lose about 2 days of history on builds. We lost minor configurations, but it was way better than having our system bloated with thousands of builds and building nearly everything on each commit.

          Mistakes happen mate, no worries, and I'll try to file a bug to keep these bugs organized.

          tzafrir added a comment -

          The buildData is also required for gitlab plugin.
          gitlab plugin pipeline step "gitlabCommitStatus" isn't working anymore with 4.0.0-rc

           It does not find any BuildData object.  Had to downgrade to 3.9.3 where all was working fine.

          Stefan Hengelein added a comment -

          Thanks for your quick responses!

          ibotpeaches The downgrade to the previous version (3.9.1) fixed the issue for now. It just took some time to identify the culprit.

          markewaite No problem mate, mistakes happen. I'll have a closer look on Monday so hopefully I can help to provide steps to reproduce the issue!

          Jacob Keller added a comment -

          tzafrir11, the gitlab plugin probably just needs to be updated to look for BuildDetails instead of BuildData.

          As for the rebuilding issue, that is likely due to the fact that the new plugin no longer maintains history of "I built this already" if the build which did it was deleted.

          If the build hasn't been deleted it's plausible there is a bug in how we look up the old data again....

          It may be worth more seriously considering the XmlFile approach.
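
          The rebuild behaviour described above can be illustrated with a small model. This is only a sketch under assumptions: the class and method names are hypothetical and are not git plugin types, and the real lookup is more involved. It shows why deriving "already built" from the builds that still exist forgets a commit as soon as its build is deleted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: "already built" is derived only from builds that still exist. */
final class BuiltCommitLookup {

    /** Minimal stand-in for a build record; not a Jenkins or git plugin class. */
    static final class Build {
        final int number;
        final String commit;

        Build(int number, String commit) {
            this.number = number;
            this.commit = commit;
        }
    }

    /** Returns true if any surviving build was run against the given commit. */
    static boolean alreadyBuilt(List<Build> survivingBuilds, String commit) {
        for (Build b : survivingBuilds) {
            if (b.commit.equals(commit)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Build> builds = new ArrayList<>();
        builds.add(new Build(1, "aaa111"));
        builds.add(new Build(2, "bbb222"));

        // While build 1 exists, its commit counts as built.
        System.out.println(alreadyBuilt(builds, "aaa111"));

        // Simulate build 1 being deleted (for example by a discard policy).
        builds.removeIf(b -> b.number == 1);

        // The same commit now looks unbuilt, so polling would schedule it again.
        System.out.println(alreadyBuilt(builds, "aaa111"));
    }
}
```

          In this model the history lives only in the builds themselves, so any retention policy that deletes builds also deletes the record that their commits were covered.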

          Jim D added a comment -

          markewaite, I'm glad I found the comments in this issue, thank you!  When the final 4.0.0 is released with the changes for this issue, will it make sure to handle both BuildDetails and BuildData for previous build runs?  We updated to 4.0.0-rc some time back, and I just came across this thread recently trying to resolve a BuildData issue, and rolled back to 3.9.3.  At this point, Git plugin is working and creating BuildData for each build run, but all those old builds done with 4.0.0-rc don't have any kind of "Git Build Data" in the Jenkins UI anymore, and no BuildData or BuildDetails when retrieved programmatically from WorkflowRun.getAllActions.  I think it must be that 3.9.3 isn't able to read the new BuildDetails object info.  Will 4.0.0 be able to read both, from both 3.9.X builds and 4.0.0 builds?  Thanks again.

           

          Mark Waite added a comment -

          jkd this issue won't be fixed in 4.0.0. The incompatibilities from the BuildDetails change were too great for the community. The accidental release of git plugin 4.0.0-rc to the production update centers showed incompatibilities that I had missed in my testing and that others had missed in their testing.

          BuildData will be the same bloated memory user in 4.0.0 that it is in 3.x.

          Jim D added a comment -

          Thanks for the update!

          zy zhang added a comment -

          Hi, you can use the below groovy script to delete git revisions for the current build.

          import jenkins.model.Jenkins
          import hudson.plugins.git.util.BuildData

          // JOB_NAME and BUILD_NUMBER must be defined by the caller (e.g. as script parameters)
          def job = Jenkins.get().getItemByFullName(JOB_NAME)
          def build = job.getBuild(BUILD_NUMBER)

          // Remove every BuildData action attached to this build
          def gitActions = build.getActions(BuildData.class)
          for (action in gitActions) {
              build.actions.remove(action)
          }

          // Persist the change to build.xml once, after the removals
          if (!gitActions.isEmpty()) {
              build.save()
          }

          Baptiste Mathus added a comment -

          markewaite I think the "BuildData" structure has been heavily refactored, hasn't it? Should this be closed maybe? Thanks

          Mark Waite added a comment - - edited

          Unfortunately batmat, the three attempts (two by ndeloof and one by jekeller ) were unable to significantly refactor BuildData in a compatible fashion. The most recent attempt by jekeller passed multiple months of my testing but showed compatibility issues in the accidental release of git plugin 4.0.0-rc.

          The changes were reverted before the release of git plugin 4.0.0.

          The git plugin documentation now includes instructions as a system groovy script that removes BuildData. See https://plugins.jenkins.io/git/#remove-git-plugin-buildsbybranch-builddata-script

          Jacob Keller added a comment -

          batmat the refactor was reverted because it had unexpected side effects.

          My solution involved doing a search/lookup mechanism against all old builds and "rebuilding" the build data every job. This works but slows down significantly once you have a lot of jobs.

          I believe a better solution exists using a plugin-specific XML file, so we basically just stop storing the build data per-build and start storing it per-job as a separate file. I've thought about it on-and-off for a while but never got around to trying to implement it.

          Jason Jardina added a comment - - edited

          markewaite I ran that script you listed and it kicked several, meaning over 50, old builds that had been built previously.  I use regex to scan my repositories by naming convention using git polling.  A build is kicked when commit hash has changed on a regex named branch.  I am glad I ran that on my older code server and not my currently shipping code.  That script is dangerous.  It may solve your problems, but it definitely does not solve mine.  I have to have the build history in order for Jenkins to know what it has built previously, so it doesn't get stuck in a build loop.  That script is like sticking a loaded gun to Jenkins head and pulling the trigger.  Before you tell everyone to run that script and delete their build data, you should warn them they may see unexpected results, exactly like I saw when we updated to git plugin 4.0.0-rc that was accidentally released in the wild last year.

          The best solution I found is to keep only the last 10-20 builds on Jenkins by using the Discard Old Builds (log rotation) settings. That lets me keep my current git history without the history file size getting out of hand and slowing builds/reboots.

          Jacob Keller added a comment -

          jjardina, yes that's part of the problem. The current build data is stored as a map, once per build. The script there will delete all build data to conserve memory and reduce the bloat. The ultimate issue is not that a single map takes that much space but that every build keeps a map of the history up to that point, so the total scales as N^2. If we have 10k builds, we have roughly N^2 things being stored across those maps (I know it's slightly less, since it's more like n*(n-1)/2).
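
          The quadratic growth can be made concrete with a short worked sum. As a simplifying assumption (not stated in the thread), suppose every build adds one new entry, so build k carries entries for the k-1 builds before it:

```latex
\[
\sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2},
\qquad
n = 10\,000 \;\Rightarrow\; \frac{10\,000 \times 9\,999}{2} = 49\,995\,000 \text{ stored entries.}
\]
```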

          I firmly believe that the git plugin should be modified to store this data per job in an XmlFile in the job root. This way, we can maintain this history (as you and many others obviously require), while avoiding both the cost of storing the build data repeatedly and the cost of trying to rebuild the data from previous jobs.

          This task shouldn't be too difficult, but it does require someone investing time, and unfortunately I don't have time to work on this at $DAYJOB right now, so it's not something I can commit to doing in a timely manner.

          Now, one could argue that the git plugin shouldn't be saving data about builds which have been deleted, but that's neither here nor there, as clearly people desire this behavior and it's how the plugin has behaved for many years now.
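
          The difference between the two storage layouts discussed in this comment can be sketched with a toy model. Everything here is an assumption for illustration: the class and method names are hypothetical, the maps are in-memory stand-ins rather than the plugin's BuildData or an XmlFile, and each build is assumed to run on a new branch (the worst case).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy comparison of two storage layouts; not the git plugin's actual classes. */
final class BranchHistoryStorage {

    /** Per-build layout: every build saves its own copy of the branch -> last-build map. */
    static long entriesWithPerBuildCopies(int builds) {
        Map<String, Integer> lastBuildByBranch = new HashMap<>();
        long stored = 0;
        for (int n = 1; n <= builds; n++) {
            lastBuildByBranch.put("branch-" + n, n); // assumption: each build uses a new branch
            stored += lastBuildByBranch.size();      // size of the copy written into this build
        }
        return stored;
    }

    /** Per-job layout: one shared map is updated in place and kept once per job. */
    static long entriesWithPerJobFile(int builds) {
        Map<String, Integer> lastBuildByBranch = new HashMap<>();
        for (int n = 1; n <= builds; n++) {
            lastBuildByBranch.put("branch-" + n, n);
        }
        return lastBuildByBranch.size();
    }

    public static void main(String[] args) {
        int builds = 1000;
        System.out.println("per-build copies: " + entriesWithPerBuildCopies(builds));
        System.out.println("per-job file:     " + entriesWithPerJobFile(builds));
    }
}
```

          For 1000 builds this model stores 500500 entries with per-build copies (the copy includes the current build, hence n*(n+1)/2) against 1000 entries in a single per-job map, which is the same quadratic-versus-linear gap described above.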

          Brittany added a comment - - edited

          Hi! I ran into this issue in my work and recently fixed it for our job runs in a way I didn't find noted anywhere. I'm not sure this will solve it for anyone else, but just in case, here's what I found.

          In our Jenkins configuration we were using:

          `url: 'https://github.com/my-awesome-org/my-even-better-repo'` (here's hoping this isn't an actual repo)

          But when I changed it to:

          `url: 'https://github.com/my-awesome-org/my-even-better-repo.git'`,

          (note the `.git` extension), the warning was gone (and it also changed the Jenkins console output from "The recommended git tool is: NONE" to "The recommended git tool is: git"). It also drastically reduced the output on the Build Data page.

          Hope this helps someone else!
           

            Assignee: Unassigned
            Reporter: Martin Schröder (mhschroe)
            Votes: 40
            Watchers: 90