Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-4045

hudson.remoting.Channel.unexport attempts to unexport oid instead of object

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Critical Critical
    • core
    • None
    • Platform: All, OS: All

      We came across an issue where we were seeing persistent memory leaks in Hudson.
      It appeared to be similar if not the same as the latter part of this thread:

      http://www.nabble.com/Possible-memory-leak-in-hudson.remoting.ExportTable-td12000299.html

      It was extremely subtle to track this thing down, but I've found the cause of it.

      It looks like in the course of handling Unexport and EOF commands, Hudson calls
      hudson.remoting.Channel.unexport(oid). This in turn calls
      hudson.remoting.ExportTable.unexport(oid), but unfortunately the latter method
      expects to be passed the exported object itself, not the oid. It dutifully looks
      to see if anyone has exported the Integer oid and when it finds nobody has, it
      returns without releasing the reference from the intended Entry.

      The fix is simple; introduce an unexportByOid method onto
      hudson.remoting.ExportTable and have hudson.remoting.Channel call that instead.
      Patch forthcoming.

          [JENKINS-4045] hudson.remoting.Channel.unexport attempts to unexport oid instead of object

          mdillon added a comment -

          Created an attachment (id=783)
          Patch to add unexportByOid to ExportTable and call it from Channel

          mdillon added a comment - Created an attachment (id=783) Patch to add unexportByOid to ExportTable and call it from Channel

          Scott Tatum added a comment -

          Adding myself to the cc list - I plan to confirm the fix on my local installs.

          Scott Tatum added a comment - Adding myself to the cc list - I plan to confirm the fix on my local installs.

          Code changed in hudson
          User: : kohsuke
          Path:
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/ExportTable.java
          http://fisheye4.cenqua.com/changelog/hudson/?cs=19879
          Log:
          JENKINS-4045 The actual change was not committed in the previous one.

          SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/ExportTable.java http://fisheye4.cenqua.com/changelog/hudson/?cs=19879 Log: JENKINS-4045 The actual change was not committed in the previous one.

          Code changed in hudson
          User: : kohsuke
          Path:
          trunk/www/changelog.html
          http://fisheye4.cenqua.com/changelog/hudson/?cs=19878
          Log:
          [FIXED JENKINS-4045] Thanks for a great detective work! If you are interested in other bug fixes and you are not a committer yet, please let me know so that I can add you as a committer.

          SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: trunk/www/changelog.html http://fisheye4.cenqua.com/changelog/hudson/?cs=19878 Log: [FIXED JENKINS-4045] Thanks for a great detective work! If you are interested in other bug fixes and you are not a committer yet, please let me know so that I can add you as a committer.

          Scott Tatum added a comment -

          I built from trunk and deployed a war with this fix. Before that, I had enabled
          my slaves for a couple of days. I was getting OOME after about 12 hours or so
          consistently, and visualvm showed the issue with the ExportTable entries was the
          culprit, taking up ~500 megs.

          After running the patched version for most of a day, I can see that used heap is
          staying low, and a heap dump shows nothing out of the ordinary. So, fix
          confirmed here. I can finally use slaves again. Thank you mdillon!!

          Scott Tatum added a comment - I built from trunk and deployed a war with this fix. Before that, I had enabled my slaves for a couple of days. I was getting OOME after about 12 hours or so consistently, and visualvm showed the issue with the ExportTable entries was the culprit, taking up ~500 megs. After running the patched version for most of a day, I can see that used heap is staying low, and a heap dump shows nothing out of the ordinary. So, fix confirmed here. I can finally use slaves again. Thank you mdillon!!

              • Issue 4008 has been marked as a duplicate of this issue. ***

          Kohsuke Kawaguchi added a comment - Issue 4008 has been marked as a duplicate of this issue. ***

          torbent added a comment -

          Just wanted to note that this fix didn't solve all of the leaks in (marked
          duplicate) issue 4008, but it has certainly helped.

          torbent added a comment - Just wanted to note that this fix didn't solve all of the leaks in (marked duplicate) issue 4008, but it has certainly helped.

            Unassigned Unassigned
            md5 Mike Dillon
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: