
Logging all UpstreamCause's floods Jenkins in large setups

      Version 1.482 added the feature "Report root causes of UpstreamCause in log and status pages". In certain scenarios (described below) this is not feasible, because the amount of data logged per build can reach dozens of megabytes. As a result, across several thousand builds the jobs folder grows by tens of gigabytes within a couple of hours, and Jenkins hits its memory limits and becomes unusable.

      Some more detail on the scenario that exposes the problem. We have a Jenkins instance with 30 executors and about 5000 jobs. The distinguishing factor, I think, is that these jobs are not independent (or only loosely connected) but have many upstream/downstream relationships. When Jenkins reaches one of the leaf jobs, the hierarchical list of causes that triggered that job is tens of megabytes long (I am not attaching a full log; I guess the content is pretty obvious). This is partly because the nesting level is very high and partly because there are several paths through the dependency graph.

      So there is an urgent need for an option to disable that feature. As it stands, it makes Jenkins unusable in such scenarios.

          [JENKINS-15747] Logging all UpstreamCause's floods Jenkins in large setups

          Dirk Thomas created issue -
          Jesse Glick made changes -
          Labels New: performance

          Jesse Glick added a comment -

          Similar to JENKINS-14814 except now relating to the log file rather than build.xml.

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-14814 [ JENKINS-14814 ]

          Jesse Glick added a comment -

          Originating commit: https://github.com/jenkinsci/jenkins/commit/0626f28965a773ffa73921561df1a9b5279f33bc

          Jesse Glick added a comment -

          Finally managed to reproduce. Create three freestyle jobs, each of which triggers the other two in a post-build step, and start one of them on a Jenkins instance with two executors. After you get to around build #10 of each, the JENKINS-14814 fix kicks in and prunes the very old causes, but the breadth of the cause tree still makes it unmanageably large at that depth.

          0626f28 did not really introduce the problem; it just made it more visible and added new symptoms. Even without that, build.xml runs to over 3 MB per build, which is very expensive to parse during startup and requires a great deal of heap to retain.
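
To get a feel for why breadth matters more than depth here, the following is a rough back-of-the-envelope model, not Jenkins code. It assumes every recorded upstream cause itself carries `fanIn` upstream causes, down to a fixed depth. `CauseTreeGrowth`, `countCauses`, and `fanIn` are names made up for this sketch, and a fan-in of 2 is only a guess at what the three-job reproduction produces when two triggers coalesce in the queue.

```java
final class CauseTreeGrowth {

    /**
     * Counts the entries in a hypothetical cause tree in which every cause
     * has {@code fanIn} upstream causes, nested {@code depth} levels deep.
     * Depth 0 is a single root cause with nothing upstream.
     */
    static long countCauses(int fanIn, int depth) {
        long total = 1; // the innermost cause on its own
        for (int level = 0; level < depth; level++) {
            // one new cause wrapping fanIn copies of the tree built so far
            total = 1 + fanIn * total;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int depth = 0; depth <= 10; depth += 2) {
            System.out.println("depth " + depth + ": "
                    + countCauses(2, depth) + " causes");
        }
    }
}
```

Under these assumptions the count roughly doubles with each level, so pruning at a fixed depth alone still leaves thousands of entries to serialize per build.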

          Jesse Glick made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

          Jesse Glick added a comment -

          Fixing by limiting how many transitive upstream causes are recorded to begin with.
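
As an illustration of the general idea only (this is not the code from the commit), limiting the recorded causes could look like the sketch below: copy the cause tree while enforcing both a depth limit and an overall budget of entries. The class `CappedCause`, its fields, the helper `fullTree`, and the values of `MAX_DEPTH` and `MAX_NODES` are all placeholders chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class CappedCause {
    static final int MAX_DEPTH = 10; // assumed limit, for illustration
    static final int MAX_NODES = 25; // assumed overall budget, for illustration

    final String description;
    final List<CappedCause> upstream = new ArrayList<>();

    CappedCause(String description) {
        this.description = description;
    }

    /** Returns a copy holding at most MAX_NODES causes, nested at most MAX_DEPTH deep. */
    static CappedCause trim(CappedCause original) {
        return trim(original, 0, new int[] { MAX_NODES });
    }

    private static CappedCause trim(CappedCause original, int depth, int[] budget) {
        budget[0]--; // this copy consumes one slot of the shared budget
        CappedCause copy = new CappedCause(original.description);
        if (depth >= MAX_DEPTH) {
            return copy; // too deep: keep the cause, drop everything upstream of it
        }
        for (CappedCause parent : original.upstream) {
            if (budget[0] <= 0) {
                break; // budget exhausted: silently drop the remaining causes
            }
            copy.upstream.add(trim(parent, depth + 1, budget));
        }
        return copy;
    }

    /** Total number of causes in this tree, including this one. */
    int size() {
        int n = 1;
        for (CappedCause c : upstream) {
            n += c.size();
        }
        return n;
    }

    /** Test helper: builds a tree where every cause has fanIn upstream causes. */
    static CappedCause fullTree(int fanIn, int depth) {
        CappedCause c = new CappedCause("depth " + depth);
        for (int i = 0; depth > 0 && i < fanIn; i++) {
            c.upstream.add(fullTree(fanIn, depth - 1));
        }
        return c;
    }
}
```

The shared budget is what bounds the breadth: a depth limit on its own would still allow a tree that is exponentially wide at the cutoff level.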

          Note that this will not help with existing build records, only for new builds. But as in JENKINS-14814 you can run some simple scripts to erase upstream cause information from old build records, if you need to keep those builds for whatever reason.
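
For anyone who wants to try such a cleanup outside Jenkins, here is a minimal sketch that strips nested cause information from the text of one build record. It is not the script referred to in JENKINS-14814. It assumes the nested causes are stored in elements named `upstreamCauses`, which should be checked against an actual build.xml first, and `UpstreamCauseStripper` and `strip` are names invented for the example. Run anything like this only on a copy of the jobs folder.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class UpstreamCauseStripper {
    // Assumed element name; verify against a real build record before use.
    static final String NESTED_CAUSES_ELEMENT = "upstreamCauses";

    /** Returns the given XML with every nested-causes element removed. */
    static String strip(String buildXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(buildXml)));

            // getElementsByTagName returns a live list, so take a snapshot
            // before removing anything.
            NodeList live = doc.getElementsByTagName(NESTED_CAUSES_ELEMENT);
            List<Node> snapshot = new ArrayList<>();
            for (int i = 0; i < live.getLength(); i++) {
                snapshot.add(live.item(i));
            }
            for (Node node : snapshot) {
                Node parent = node.getParentNode();
                if (parent != null) {
                    parent.removeChild(node);
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite build record", e);
        }
    }
}
```

The direct upstream project and build number stay in place; only the transitive history below them is dropped.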

          I made a couple of other commits (6023716 and 9e58d94) with some related cosmetic fixes; the change in 1.482 failed to display transitive causes in a tree so they were nearly impossible to read.


          Code changed in jenkins
          User: Jesse Glick
          Path:
          changelog.html
          core/src/main/java/hudson/model/Cause.java
          test/src/test/java/hudson/model/CauseTest.java
          http://jenkins-ci.org/commit/jenkins/d506b32f1fbaab6fd055cd5c430c764dc887e8f4
          Log:
          [FIXED JENKINS-15747] Avoid recording too many upstream causes at any depth.

          Compare: https://github.com/jenkinsci/jenkins/compare/6cc360d80ff8...d506b32f1fba


          SCM/JIRA link daemon made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
