Jenkins / JENKINS-3406

PermGen space outofmemory on slaves

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Component/s: remoting
    • Labels:
      None
    • Environment:
      Platform: All, OS: Windows XP

      Description

      For some time now, our Slaves have been dying with PermGen space
      OutOfMemoryError.

      We are currently using Hudson 1.285, but it has been a problem with a number of
      releases. Possibly all releases we have ever used - not entirely sure.

      Our Master is restarted (hard via its Windows Service, without preparing it for
      shutdown) at least once every day.

      After a week or so, the Slaves (4 of them) start dropping like flies with
      exhausted PermGen space.

      All projects run via Ant. The builder setup looks like this:

      <hudson.tasks.Ant>
        <targets>ci.build ci.validate</targets>
        <antOpts>-Xmx512m</antOpts>
        <buildFile>build.ear.xml</buildFile>
        <properties></properties>
      </hudson.tasks.Ant>

      I have changed the slaves to run with -XX:+HeapDumpOnOutOfMemoryError. I have
      the dump available if anyone wants to have a look - it is 5MB zipped.
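
A heap dump only shows the state at the moment of failure. To watch the permanent generation fill up over days, the standard `java.lang.management` API can report pool usage from inside the JVM. The following is a minimal sketch, not part of Hudson: the class name `ClassMetadataPools` is hypothetical, and matching on the pool names "Perm Gen" (older HotSpot JVMs) and "Metaspace" (Java 8 and later) is an assumption about HotSpot naming that other JVMs may not share.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class ClassMetadataPools {
    /** Returns the memory pools whose names suggest they hold class metadata. */
    public static List<MemoryPoolMXBean> find() {
        List<MemoryPoolMXBean> result = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot pool names such as "PS Perm Gen" or "Metaspace".
            if (name.contains("Perm Gen") || name.contains("Metaspace")) {
                result.add(pool);
            }
        }
        return result;
    }

    /** Formats one pool's current usage as a single log line. */
    public static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage();
        if (usage == null) {
            return pool.getName() + ": no longer valid";
        }
        long max = usage.getMax(); // -1 when the JVM reports no limit
        String limit = (max < 0) ? "unbounded" : (max / 1024) + " KB";
        return pool.getName() + ": used " + (usage.getUsed() / 1024) + " KB of " + limit;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : find()) {
            System.out.println(describe(pool));
        }
    }
}
```

Logging this periodically from a slave JVM would show whether usage climbs steadily between restarts, which is the pattern a class loader leak tends to produce.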

      Looking at the dump with Eclipse's Memory Analyzer, the prime suspect (according
      to the tool; I'm a rookie on the subject) is:

      46 instances of "hudson.remoting.RemoteClassLoader", loaded by
      "sun.misc.Launcher$AppClassLoader @ 0x3007630" occupy 3.606.384 (60,70%) bytes.
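
Why retained class loaders matter here: on the HotSpot JVMs of this era, class metadata lives in PermGen and can only be unloaded once the defining class loader itself is unreachable. The sketch below illustrates that retention with a plain `URLClassLoader`. It is not Hudson's remoting code and does not use `RemoteClassLoader`; the class `LoaderRetentionDemo` and its `RETAINED` list are hypothetical stand-ins for any long-lived structure that keeps loaders reachable.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class LoaderRetentionDemo {
    /** Stands in for a long-lived table that keeps class loaders reachable. */
    public static final List<ClassLoader> RETAINED = new ArrayList<>();

    /** Creates a loader, registers it in RETAINED, and returns a weak handle to it. */
    public static WeakReference<ClassLoader> createRetainedLoader() {
        ClassLoader loader = new URLClassLoader(new URL[0], null);
        RETAINED.add(loader);
        return new WeakReference<>(loader);
    }

    public static void main(String[] args) throws InterruptedException {
        WeakReference<ClassLoader> ref = createRetainedLoader();

        // The strong reference in RETAINED keeps the loader alive across any GC.
        System.gc();
        System.out.println("while retained, reachable: " + (ref.get() != null));

        // Once the strong reference is dropped the loader becomes collectable.
        // System.gc() is only a hint, so retry a few times before reporting.
        RETAINED.clear();
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("after release, reachable: " + (ref.get() != null));
    }
}
```

If each reconnect adds an entry that is never removed, the loaders and every class they defined accumulate, which would be consistent with the 46 instances reported by the tool.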

          Activity

          krystian_nowak Krystian Nowak added a comment -

          adding myself as CC

          kohsuke Kohsuke Kawaguchi added a comment -

          Yes, that's a lot of classloaders. Please send the dump to me.

          Also, after you let the slave run for a while, go to
          http://server/hudson/computer/YOURSLAVENAME/dumpExportTable and capture that
          page, too, which should show us which classloaders from the master are exposed to
          this slave.
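
For anyone who wants to capture that page repeatedly rather than by hand, a small program can build the URL and save the response. This is a sketch under assumptions: the class `ExportTableCapture` and the output file name are hypothetical, and the plain unauthenticated request only works when the Hudson instance allows anonymous access to the page.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class ExportTableCapture {
    /** Builds <root>/computer/<slaveName>/dumpExportTable, escaping spaces in the name. */
    public static URI exportTableUri(String hudsonRoot, String slaveName) {
        URI root = URI.create(hudsonRoot);
        String rootPath = root.getPath();
        if (!rootPath.endsWith("/")) {
            rootPath += "/";
        }
        try {
            // The multi-argument URI constructor percent-encodes illegal path characters.
            return new URI(root.getScheme(), null, root.getHost(), root.getPort(),
                    rootPath + "computer/" + slaveName + "/dumpExportTable", null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build URL for " + slaveName, e);
        }
    }

    /** Downloads the page and writes it to the target file, replacing any old copy. */
    public static void save(URI page, Path target) throws IOException {
        try (InputStream in = page.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: Hudson root URL, args[1]: slave name
        URI page = exportTableUri(args[0], args[1]);
        save(page, Paths.get("exportTable.html"));
        System.out.println("saved " + page);
    }
}
```

Saving one copy shortly after the slave connects and another some days later makes it easier to compare which exported entries have accumulated.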

          jskovjyskebankdk jskovjyskebankdk added a comment -

          Created an attachment (id=668)
          Dump from a slave after a few builds

          jskovjyskebankdk jskovjyskebankdk added a comment -

          I had to restart Hudson to get the dump (to disable security), so it may not be
          the rich dump you wished for.
          If not, let me know, and I'll let it run for longer next time.

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/www/changelog.html
          http://fisheye4.cenqua.com/changelog/hudson/?cs=24470
          Log:
          [FIXED JENKINS-3406] I fixed one leak of the class loader caused by JNLP slaves that reconnect to the master. So I tentatively mark this bug as closed. This fix will be in 1.337.

          jskovjyskebankdk jskovjyskebankdk added a comment -

          I have been using the new build for the past two weeks and have disabled nightly restart of the slaves. Still running, so I guess you nailed it.
          Thanks!


            People

            Assignee:
            Unassigned Unassigned
            Reporter:
            jskovjyskebankdk jskovjyskebankdk