OutOfMemory due to unbounded storage in OldDataMonitor

      The data map in hudson.diagnosis.OldDataMonitor keeps growing on my machine.

      I have removed an old plugin, which had stored its settings in all project files.

      Now, on each load or access of a project, a new FreeStyleProject object seems to be created. Each time, the old data is found and the project (being a Saveable) is recorded in the data map of the OldDataMonitor instance as containing old data.
      One would think this should not be a problem with only 20 projects, but FreeStyleProject does not implement equals and hashCode, so every single project object is kept around in this map.
      A week after the last restart, the map had accumulated over 200k entries totaling a whopping 1 GB of heap memory.
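
The growth described above can be reproduced in isolation. The following is a minimal sketch, not Jenkins code: `ReloadedProject` and `OldDataLeakDemo` are hypothetical names, and the sketch only assumes what the report states, namely that a fresh object is created on each load and that the key type inherits `Object.equals` and `Object.hashCode`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Saveable such as FreeStyleProject. It inherits
// Object.equals/hashCode, so two instances for the same job are never equal.
final class ReloadedProject {
    final String fullName;

    ReloadedProject(String fullName) {
        this.fullName = fullName;
    }
}

final class OldDataLeakDemo {
    // Simulates `reloads` loads of each of `projects` jobs. Every load creates
    // a fresh object and reports it, so the map gains one entry per load.
    static int entriesKeyedByObject(int projects, int reloads) {
        Map<ReloadedProject, String> data = new HashMap<>();
        for (int r = 0; r < reloads; r++) {
            for (int p = 0; p < projects; p++) {
                data.put(new ReloadedProject("job-" + p), "old plugin data");
            }
        }
        return data.size();
    }

    // Same workload, but keyed by a stable string name: repeated reports of
    // the same job overwrite each other instead of accumulating.
    static int entriesKeyedByName(int projects, int reloads) {
        Map<String, String> data = new HashMap<>();
        for (int r = 0; r < reloads; r++) {
            for (int p = 0; p < projects; p++) {
                data.put("job-" + p, "old plugin data");
            }
        }
        return data.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesKeyedByObject(20, 1000)); // prints 20000
        System.out.println(entriesKeyedByName(20, 1000));   // prints 20
    }
}
```

With 20 projects, the object-keyed map grows linearly with the number of loads, while the name-keyed map stays at 20 entries.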

      I would love to remove this old data but because of bug JENKINS-18809 I can't even do that. I will remove it by hand.

        Attachments:
        1. 1.png (217 kB)
        2. 2.png (78 kB)

          [JENKINS-19544] OutOfMemory due to unbounded storage in OldDataMonitor

          Tim Drury added a comment -

          Here is more information on my particular issue: https://groups.google.com/d/msg/jenkinsci-users/ZK1J6R2ej4I/-vnEFc-yr0gJ

          Michael Niestegge added a comment -

          I am running into the same problem with Jenkins 1.552 on Debian 5.0.

          The error shows up when the FingerprintCleanupThread starts running (usually during the night). The process fills the heap with one very large object: "hudson.diagnosis.OldDataMonitor". The object is never removed from the heap. I increased the memory for Jenkins and triggered the process manually (FingerprintCleanupThread.invoke() in the script console). Each run adds to the memory usage until no more memory is available, causing the UI to crash and jenkins.log to fill up with "Process leaked file descriptor" errors.

          The "Manage Old Data" screen (reachable via Manage Jenkins) shows a lot of "failed to locate class: com.thoughtworks.xstream.mapper.CannotResolveClassException: hudson.plugins.disk_usage.BuildDiskUsageAction" errors. I was able to discard them using the button at the bottom of the screen. After I discarded the errors, I started the FingerprintCleanupThread again and it used far less memory. New errors show up in the "Manage Old Data" screen during the day, so I have to check them every day.

          Is there any way to completely remove the DiskUsagePlugin?

          Michael Niestegge added a comment -

          Attached two screenshots:

          1) shows the increasing memory usage for each FingerprintCleanupThread run.
          2) shows the heap dump after the last run.

          Jesse Glick added a comment -

          OldDataMonitor is supposed to be releasing memory when either you agree to Discard Old Data, or the job or build is deleted. However other bugs like JENKINS-20950 can prevent discarding of unreadable data (the first scenario) from completing and thus releasing objects. (OldDataMonitor fails to remove entries when saving them fails—another buglet.)

          Also it may be possible to avoid holding hard references to begin with in certain cases, namely when the system object is in fact a job or build and can thus be relocated by a string identifier.
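
The idea of relocating an object by a string identifier instead of holding a hard reference could look roughly like the sketch below. This is an illustration under assumptions, not the actual OldDataMonitor change: `OldDataRegistry`, its methods, and the `job-a#12` identifier format are all hypothetical, and the resolver function stands in for whatever lookup Jenkins would use to find a job or build by name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry that remembers only string identifiers. The objects
// themselves stay collectable; they are looked up again on demand.
final class OldDataRegistry<T> {
    private final Map<String, String> notesById = new ConcurrentHashMap<>();
    private final Function<String, T> resolver;

    OldDataRegistry(Function<String, T> resolver) {
        this.resolver = resolver;
    }

    // Reporting the same identifier twice overwrites the earlier entry.
    void report(String id, String note) {
        notesById.put(id, note);
    }

    // Returns the live object for a reported identifier, if it still exists.
    Optional<T> resolve(String id) {
        if (!notesById.containsKey(id)) {
            return Optional.empty();
        }
        return Optional.ofNullable(resolver.apply(id));
    }

    void discard(String id) {
        notesById.remove(id);
    }

    int size() {
        return notesById.size();
    }

    public static void main(String[] args) {
        // A plain map plays the role of the live job/build lookup here.
        Map<String, String> live = new HashMap<>();
        live.put("job-a#12", "build 12 of job-a");

        OldDataRegistry<String> registry = new OldDataRegistry<>(live::get);
        registry.report("job-a#12", "unreadable BuildDiskUsageAction");
        registry.report("job-a#12", "unreadable BuildDiskUsageAction");
        System.out.println(registry.size());
        System.out.println(registry.resolve("job-a#12").isPresent());
    }
}
```

Because the key is a string, repeated reports of the same build collapse into one entry, and a deleted job or build simply fails to resolve rather than being kept alive by the map.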

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          core/src/main/java/hudson/diagnosis/OldDataMonitor.java
          test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
          test/src/test/resources/hudson/diagnosis/OldDataMonitorTest/robustness.zip
          http://jenkins-ci.org/commit/jenkins/8508fc365ac9faa4fa6ccee116e820c0455f0988
          Log:
          JENKINS-19544 Whether or not we manage to save an object with old data, be sure to remove it from the list, and continue with other objects.
          Otherwise a bug like JENKINS-20950 can prevent anything from being saved, and holds references to all the old data.
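
The behavior described in this commit log (remove each entry whether or not saving succeeds, then continue with the rest) can be sketched as follows. This is a simplified illustration, not the code from the commit: `DiscardLoopDemo`, `SaveableItem`, and `discardAll` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DiscardLoopDemo {
    // Hypothetical minimal counterpart of a Saveable.
    interface SaveableItem {
        void save() throws IOException;
    }

    // Tries to save every pending item. Each entry is removed in a finally
    // block, so a failing save neither keeps its reference alive nor stops
    // the loop. Returns the keys whose save failed.
    static List<String> discardAll(Map<String, SaveableItem> pending) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, SaveableItem>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, SaveableItem> entry = it.next();
            try {
                entry.getValue().save();
            } catch (IOException | RuntimeException x) {
                failed.add(entry.getKey());
            } finally {
                it.remove();
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, SaveableItem> pending = new LinkedHashMap<>();
        pending.put("a", () -> { });
        pending.put("b", () -> { throw new IOException("cannot write config"); });
        pending.put("c", () -> { });

        System.out.println(discardAll(pending)); // prints [b]
        System.out.println(pending.size());      // prints 0
    }
}
```

Putting the removal in `finally` is the design point: without it, one entry that cannot be saved would abort the loop and leave every remaining object referenced.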

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          changelog.html
          core/src/main/java/hudson/diagnosis/OldDataMonitor.java
          test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
          http://jenkins-ci.org/commit/jenkins/681a8ff3070736610f338972ba433379723346fb
          Log:
          [FIXED JENKINS-19544] When OldDataMonitor is reported a Run, just remember the ID rather than holding a strong reference.

          Compare: https://github.com/jenkinsci/jenkins/compare/10eca374e5eb...681a8ff30707

          dogfood added a comment -

          Integrated in jenkins_main_trunk #3248
          JENKINS-19544 Whether or not we manage to save an object with old data, be sure to remove it from the list, and continue with other objects. (Revision 8508fc365ac9faa4fa6ccee116e820c0455f0988)
          [FIXED JENKINS-19544] When OldDataMonitor is reported a Run, just remember the ID rather than holding a strong reference. (Revision 681a8ff3070736610f338972ba433379723346fb)

          Result = SUCCESS
          Jesse Glick : 8508fc365ac9faa4fa6ccee116e820c0455f0988
          Files :

          • test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
          • core/src/main/java/hudson/diagnosis/OldDataMonitor.java
          • test/src/test/resources/hudson/diagnosis/OldDataMonitorTest/robustness.zip

          Jesse Glick : 681a8ff3070736610f338972ba433379723346fb
          Files :

          • test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
          • changelog.html
          • core/src/main/java/hudson/diagnosis/OldDataMonitor.java

          Sagi Sinai-Glazer added a comment -

          Will this fix be back-ported to LTS versions?

          Jesse Glick added a comment -

          @esinsag I can mark it as a candidate for consideration.

          rsandell added a comment -

          Could this issue be the cause of PermGen growing indefinitely as well?

            jglick Jesse Glick
            rbaradari Ramin Baradari