JENKINS-70281 (project: Jenkins)

Mercurial plugin leaks memory when using ssh keys for auth

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: mercurial-plugin
    • Environment: Mercurial Plugin - 1251.va_b_121f184902
      Jenkins - 2.346.3

      Discovered this while investigating performance issues on our Jenkins server. The root cause was that the Jenkins JVM was nearly out of heap space, which was triggering GC thrashing. A heap dump revealed that a large portion of the heap was occupied by Strings related to the mercurial plugin. There seem to be 2 distinct leaks here:

      HgExe.Capability.MAP

      Looks like this tries to cache the capabilities of hg executables on the system. The main issue is that the arguments to the hg executable are mixed into the cache key. This is a problem when using ssh-based auth, because the ssh keys come from tmp files with randomly generated names. An example of one of these ArrayList<String> cache keys from my heap dump:

      [
      "hg",
      "--config",
      "ui.ssh=ssh -i /srv/jenkins/jenkins-mercurial9035497877612716076.sshkey -l buildbot"
      ]
      

      That 3rd String is the problem; the path to the temp sshkey file is different for every invocation of hg, so the MAP accumulates entries endlessly and gradually eats more and more heap space.
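      To illustrate the growth pattern, here is a minimal, self-contained sketch. It is not the plugin's actual code: the class name, both maps, and the idea of keying on the executable alone are assumptions made for illustration. It simulates many hg invocations that each get a unique temp key path, then compares a cache keyed on the full argument list with one keyed only on the executable name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CapabilityCacheSketch {
    // Hypothetical stand-in for a capability cache keyed on the full command line.
    static final Map<List<String>, Boolean> byFullCommand = new ConcurrentHashMap<>();

    // Hypothetical alternative: key only on the executable, ignoring per-run arguments.
    static final Map<String, Boolean> byExecutable = new ConcurrentHashMap<>();

    // Builds a command line shaped like the one in the heap dump; keyId stands in
    // for the randomly generated part of the temp file name.
    static List<String> command(long keyId) {
        return Arrays.asList(
                "hg",
                "--config",
                "ui.ssh=ssh -i /srv/jenkins/jenkins-mercurial" + keyId + ".sshkey -l buildbot");
    }

    // Simulates the given number of hg invocations, each with a distinct key file path.
    static void simulate(int invocations) {
        for (int i = 0; i < invocations; i++) {
            List<String> cmd = command(i);
            // Every invocation yields a new key, so this map gains one entry per call.
            byFullCommand.computeIfAbsent(new ArrayList<>(cmd), k -> Boolean.TRUE);
            // The executable name is stable, so this map stays at a single entry.
            byExecutable.computeIfAbsent(cmd.get(0), k -> Boolean.TRUE);
        }
    }

    public static void main(String[] args) {
        simulate(1000);
        System.out.println("keyed on full command: " + byFullCommand.size());
        System.out.println("keyed on executable:   " + byExecutable.size());
    }
}
```

      Whether the executable alone is a sufficient key depends on what the cached capabilities actually vary with, which I have not verified. The point is only that the per-invocation ssh key path needs to stay out of the key.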

      HgExe.DeleteOnExit

      HgExe's constructor always calls File.deleteOnExit on the temporary sshkey file that it creates. This permanently leaks the path to that file into DeleteOnExitHook.files (there is no mechanism that allows paths to be removed from this LinkedHashSet, even after the file itself has been deleted). The result is a gradual, unbounded accumulation of String instances that eventually consumes the JVM's heap.
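      One possible way to avoid registering anything with the JVM's delete-on-exit hook is to delete the temp file explicitly once it is no longer needed. The sketch below is an assumption about how that could look, not the plugin's implementation: the class name, the KeyUser callback, and withTempKey are all hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempKeyFileSketch {
    // Hypothetical callback that uses the key file while it exists.
    interface KeyUser<T> {
        T use(Path keyFile) throws IOException;
    }

    // Creates a temp key file, runs the callback, and always deletes the file
    // afterwards. File.deleteOnExit is never called, so no path is retained
    // for the lifetime of the JVM.
    static <T> T withTempKey(String keyMaterial, KeyUser<T> body) throws IOException {
        Path keyFile = Files.createTempFile("jenkins-mercurial", ".sshkey");
        try {
            Files.write(keyFile, keyMaterial.getBytes(StandardCharsets.UTF_8));
            return body.use(keyFile);
        } finally {
            // Runs on both success and failure of the callback.
            Files.deleteIfExists(keyFile);
        }
    }

    public static void main(String[] args) throws IOException {
        // Return the path from the callback just to check it is gone afterwards.
        Path used = withTempKey("dummy key material", p -> p);
        System.out.println("key file still exists: " + Files.exists(used));
    }
}
```

      The trade-off is that a hard JVM crash between creation and the finally block would leave the file behind, so a real fix might also need some cleanup of stale key files on startup.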

      Other Info

      • My Jenkins instance had been running for about 60 days when I discovered this
      • The Jenkins JVM was configured with a max heap size of 4.2 GB
      • The capabilities leak was responsible for about 2.4 GB of heap usage
      • The delete-on-exit leak was responsible for about 1.5 GB of heap usage

            Assignee: Unassigned
            Reporter: Andrew Lee (bonuslord)