JENKINS-19544

OutOfMemory due to unbounded storage in OldDataMonitor

Details

    Description

      The data map in hudson.diagnosis.OldDataMonitor keeps growing on my machine.

      I removed an old plugin that had stored its settings in every project file.

      Now, on each load or access of a project, a new FreeStyleProject object seems to be created. Each time, the old data is found and the project (being a Saveable) is recorded as containing old data in the data map of the OldDataMonitor instance.
      One would think this should not be a problem with only 20 projects, but FreeStyleProject does not implement equals and hashCode, so every single project object is kept in the map as a separate key.
      A week after the last restart, the map had accumulated over 200k entries totaling about 1 GB of heap memory.
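The growth described above can be reproduced in isolation. The following is a minimal sketch, not the Jenkins code: `ProjectStub` and `OldDataLeakDemo` are hypothetical names, and the stub only assumes what the report states, namely that a new object is created on each load and that the class does not override equals and hashCode.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Saveable such as FreeStyleProject.
// It does not override equals/hashCode, so HashMap falls back to object identity.
class ProjectStub {
    final String name;

    ProjectStub(String name) {
        this.name = name;
    }
}

class OldDataLeakDemo {
    // Records old data once per load, using the freshly created object as the key.
    static int entriesKeyedByIdentity(int projects, int reloadsPerProject) {
        Map<ProjectStub, String> data = new HashMap<>();
        for (int r = 0; r < reloadsPerProject; r++) {
            for (int p = 0; p < projects; p++) {
                // A new instance per load is never equal to the earlier one,
                // so every put() adds an entry instead of replacing one.
                data.put(new ProjectStub("job-" + p), "old plugin settings");
            }
        }
        return data.size();
    }

    // Same workload, keyed by a stable string identifier instead.
    static int entriesKeyedByName(int projects, int reloadsPerProject) {
        Map<String, String> data = new HashMap<>();
        for (int r = 0; r < reloadsPerProject; r++) {
            for (int p = 0; p < projects; p++) {
                data.put("job-" + p, "old plugin settings");
            }
        }
        return data.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesKeyedByIdentity(20, 100)); // prints 2000
        System.out.println(entriesKeyedByName(20, 100));     // prints 20
    }
}
```

With identity-based keys the map grows with the number of loads rather than the number of projects, which matches the pattern of 20 projects producing hundreds of thousands of entries.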

      I would love to remove this old data but because of bug JENKINS-18809 I can't even do that. I will remove it by hand.

      Attachments

        1. 1.png (217 kB)
        2. 2.png (78 kB)

          Activity

            rbaradari Ramin Baradari created issue -
            rbaradari Ramin Baradari made changes -
            Field Original Value New Value
            Link This issue is related to JENKINS-18809 [ JENKINS-18809 ]
            rbaradari Ramin Baradari made changes -
            Labels memory-leak
            timdrury Tim Drury added a comment -

            After updating to 1.542 (then downgrading to 1.532.1) on Windows Server 2008 R2, I had to disable the disk-usage plugin because it appeared to hang. Since disabling it, I get daily OOM errors. Analyzing a couple of heap dumps shows the 'data' hashmap in hudson.diagnosis.OldDataMonitor holding 90% of my 3.5 GB heap.
            timdrury Tim Drury added a comment -

            Here is more information on my particular issue: https://groups.google.com/d/msg/jenkinsci-users/ZK1J6R2ej4I/-vnEFc-yr0gJ

            jglick Jesse Glick made changes -
            Labels memory-leak => memory-leak performance

            nieschinhio Michael Niestegge added a comment -

            I am running into the same problem with Jenkins 1.552 on Debian 5.0.

            The error shows up when the FingerprintCleanupThread starts running (usually during the night). The process fills the heap with one very large object: "hudson.diagnosis.OldDataMonitor". The object is never removed from the heap. I increased the memory for Jenkins and triggered the process manually (FingerprintCleanupThread.invoke() in the script console). Each run adds to the memory usage until none is left, which crashes the UI and fills jenkins.log with "Process leaked file descriptor" errors.

            The "Manage Old Data" screen (reachable via Manage Jenkins) shows a lot of "failed to locate class: com.thoughtworks.xstream.mapper.CannotResolveClassException: hudson.plugins.disk_usage.BuildDiskUsageAction" errors. I was able to discard them using the button at the bottom of the screen. After discarding the errors, I started the FingerprintCleanupThread again and it used far less memory. New errors show up in the "Manage Old Data" screen during the day, so I have to check them every day.

            Is there any way to remove the DiskUsagePlugin completely?

            nieschinhio Michael Niestegge added a comment -

            Attached two screenshots:

            1) shows the increasing memory usage for each FingerprintCleanupThread run.
            2) shows the heap dump after the last run.
            nieschinhio Michael Niestegge made changes -
            Attachment 1.png [ 25504 ]
            Attachment 2.png [ 25505 ]
            jglick Jesse Glick made changes -
            Link This issue depends on JENKINS-20950 [ JENKINS-20950 ]
            jglick Jesse Glick added a comment -

            OldDataMonitor is supposed to be releasing memory when either you agree to Discard Old Data, or the job or build is deleted. However other bugs like JENKINS-20950 can prevent discarding of unreadable data (the first scenario) from completing and thus releasing objects. (OldDataMonitor fails to remove entries when saving them fails—another buglet.)

            Also it may be possible to avoid holding hard references to begin with in certain cases, namely when the system object is in fact a job or build and can thus be relocated by a string identifier.

            jglick Jesse Glick made changes -
            Description: wiki markup corrected ({{data} changed to {{data}}); the text is otherwise identical to the description above.
            jglick Jesse Glick made changes -
            Assignee Jesse Glick [ jglick ]
            jglick Jesse Glick made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            core/src/main/java/hudson/diagnosis/OldDataMonitor.java
            test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
            test/src/test/resources/hudson/diagnosis/OldDataMonitorTest/robustness.zip
            http://jenkins-ci.org/commit/jenkins/8508fc365ac9faa4fa6ccee116e820c0455f0988
            Log:
            JENKINS-19544 Whether or not we manage to save an object with old data, be sure to remove it from the list, and continue with other objects.
            Otherwise a bug like JENKINS-20950 can prevent anything from being saved, and holds references to all the old data.
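The behavior described in this commit message (drop each entry whether or not saving it succeeds, and keep going) might look roughly like the sketch below. It is an illustration under assumptions, not the committed code: `SaveableStub` and `DiscardSketch` are hypothetical names, and the real OldDataMonitor tracks its entries differently.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Hypothetical minimal interface standing in for hudson.model.Saveable.
interface SaveableStub {
    void save() throws IOException;
}

class DiscardSketch {
    // Attempts to save every tracked object and returns how many saves failed.
    // Each entry is removed from the list regardless of the outcome, so a
    // failing save can neither abort the loop nor pin the remaining objects.
    static int discardAll(List<SaveableStub> tracked) {
        int failures = 0;
        Iterator<SaveableStub> it = tracked.iterator();
        while (it.hasNext()) {
            SaveableStub s = it.next();
            try {
                s.save();
            } catch (IOException | RuntimeException e) {
                failures++; // a real implementation would log the exception here
            } finally {
                it.remove(); // release the reference even when save() threw
            }
        }
        return failures;
    }
}
```

The `finally` block is the point of the sketch: without it, one object that cannot be saved would leave itself and every later entry in the list.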

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            changelog.html
            core/src/main/java/hudson/diagnosis/OldDataMonitor.java
            test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
            http://jenkins-ci.org/commit/jenkins/681a8ff3070736610f338972ba433379723346fb
            Log:
            [FIXED JENKINS-19544] When OldDataMonitor is reported a Run, just remember the ID rather than holding a strong reference.

            Compare: https://github.com/jenkinsci/jenkins/compare/10eca374e5eb...681a8ff30707
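The idea of remembering an identifier instead of holding a strong reference can be sketched as follows. This is a simplified illustration, not the actual change: `RunStub`, `RunRefSketch`, the `externalId` field and the lookup function are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a build record; only its string id matters here.
class RunStub {
    final String externalId;

    RunStub(String externalId) {
        this.externalId = externalId;
    }
}

class RunRefSketch {
    // Only ids are stored, so reported RunStub objects remain collectable.
    private final Map<String, String> versionById = new HashMap<>();
    // Resolves an id back to a live object on demand; may return null.
    private final Function<String, RunStub> lookup;

    RunRefSketch(Function<String, RunStub> lookup) {
        this.lookup = lookup;
    }

    // Repeated reports for the same build overwrite a single entry.
    void report(RunStub run, String version) {
        versionById.put(run.externalId, version);
    }

    int size() {
        return versionById.size();
    }

    // Returns null when the build no longer exists.
    RunStub resolve(String id) {
        return lookup.apply(id);
    }
}
```

Because the key is a string, reloading the same build any number of times leaves one map entry, and a deleted build simply fails to resolve instead of being kept alive by the monitor.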
            scm_issue_link SCM/JIRA link daemon made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #3248
            JENKINS-19544 Whether or not we manage to save an object with old data, be sure to remove it from the list, and continue with other objects. (Revision 8508fc365ac9faa4fa6ccee116e820c0455f0988)
            [FIXED JENKINS-19544] When OldDataMonitor is reported a Run, just remember the ID rather than holding a strong reference. (Revision 681a8ff3070736610f338972ba433379723346fb)

            Result = SUCCESS
            Jesse Glick : 8508fc365ac9faa4fa6ccee116e820c0455f0988
            Files :

            • test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
            • core/src/main/java/hudson/diagnosis/OldDataMonitor.java
            • test/src/test/resources/hudson/diagnosis/OldDataMonitorTest/robustness.zip

            Jesse Glick : 681a8ff3070736610f338972ba433379723346fb
            Files :

            • test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java
            • changelog.html
            • core/src/main/java/hudson/diagnosis/OldDataMonitor.java

            esinsag Sagi Sinai-Glazer added a comment -

            Will this fix be back-ported to LTS versions?
            jglick Jesse Glick made changes -
            Labels memory-leak performance => lts-candidate memory-leak performance
            jglick Jesse Glick added a comment -

            @esinsag I can mark it as a candidate for consideration.

            olivergondza Oliver Gondža made changes -
            Labels lts-candidate memory-leak performance => 1.554.1-fixed memory-leak performance
            rsandell rsandell added a comment -

            Could this issue also be the cause of PermGen growing indefinitely?
            danielbeck Daniel Beck made changes -
            Link This issue is related to JENKINS-22261 [ JENKINS-22261 ]
            jglick Jesse Glick made changes -
            Link This issue depends on JENKINS-24358 [ JENKINS-24358 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 150998 ] JNJira + In-Review [ 193767 ]

            People

              Assignee: Jesse Glick (jglick)
              Reporter: Ramin Baradari (rbaradari)
              Votes: 7
              Watchers: 14
