Adding a mixin class to Job leaks memory from every DSL run

      A user reported exhaustion of the permanent generation. jmap -histo:live showed several hundred GroovyClassLoader instances in the heap, and further analysis tracked the problem down to use of the Job DSL plugin with Groovy mixins.

      If you have a DSL

      job {
        name 'generated'
        description("generated on ${new Date()}")
      }
      

      then each time you run it, a new GroovyClassLoader is added to the heap. These do not seem to get collected easily, since they are held via a SoftReference somewhere in Groovy, but running the following in the script console (/script) forces soft references to be cleared:

      // Allocate ever-larger arrays until the JVM nearly runs out of memory,
      // which forces SoftReferences to be cleared before OutOfMemoryError is thrown.
      for (i = 1; ; i *= 2) {
        System.gc();
        new Object[i];
      }
      

      After running that, just one GroovyClassLoader remains, held from ASTTransformationVisitor.compUnit—an apparent memory leak in Groovy, but at least limited to holding the loader from one build at a time.

      If I change the DSL to be

      javaposse.jobdsl.dsl.Job.metaClass.description2 = { String d ->
        description(d);
      }
      job {
        name 'generated'
        description2 "generated on ${new Date()}"
      }
      

      (recommended monkey-patching) then I see the same behavior. But if I change the DSL to be

      class JobMixin {
        public Object description2(String d) {
          return description(d.toUpperCase());
        }
      }
      javaposse.jobdsl.dsl.Job.mixin(JobMixin);
      job {
        name 'generated'
        description2 "generated on ${new Date()}"
      }
      

      then I get one GroovyClassLoader per run of the DSL, and they do not get collected even after forcing full GC. Besides the leak of one from ASTTransformationVisitor.compUnit, the rest are leaked in

      this     - value: groovy.lang.GroovyClassLoader #3
       <- parent     - class: groovy.util.GroovyScriptEngine$ScriptClassLoader, value: groovy.lang.GroovyClassLoader #3
        <- delegate     - class: groovy.lang.GroovyClassLoader$InnerLoader, value: groovy.util.GroovyScriptEngine$ScriptClassLoader #3
         <- <classLoader>     - class: JobMixin, value: groovy.lang.GroovyClassLoader$InnerLoader #3
          <- cachedClass     - class: org.codehaus.groovy.reflection.CachedClass, value: JobMixin class JobMixin
           <- mixinClass     - class: org.codehaus.groovy.reflection.MixinInMetaClass, value: org.codehaus.groovy.reflection.CachedClass #150
            <- key     - class: java.util.LinkedHashMap$Entry, value: org.codehaus.groovy.reflection.MixinInMetaClass #3
             <- [5]     - class: java.util.HashMap$Entry[], value: java.util.LinkedHashMap$Entry #7716
              <- table     - class: java.util.LinkedHashMap, value: java.util.HashMap$Entry[] #7123
               <- map     - class: java.util.LinkedHashSet, value: java.util.LinkedHashMap #1581
                <- mixinClasses     - class: groovy.lang.ExpandoMetaClass, value: java.util.LinkedHashSet #45
                 <- strongMetaClass     - class: org.codehaus.groovy.reflection.ClassInfo, value: groovy.lang.ExpandoMetaClass #1
                  <- $staticClassInfo     - class: javaposse.jobdsl.dsl.Job, value: org.codehaus.groovy.reflection.ClassInfo #320
                   <- [353]     - class: java.lang.Object[], value: javaposse.jobdsl.dsl.Job class Job
                    <- elementData     - class: java.util.Vector, value: java.lang.Object[] #8736
                     <- classes     - class: hudson.PluginFirstClassLoader, value: java.util.Vector #192
      

      (The analysis of root GC references was done using the NetBeans Profiler; VisualVM has the same tool.)

      Suggested resolutions:

      1. Document this bug and close without fixing.
      2. Clear mixinClasses from every built-in DSL class after every DSL run. (Possibly a race condition, if you are running multiple DSLs at once.)
      3. Prevent DSLs from adding mixins, just throwing some informative error. (Again this would prevent a possible race condition in the current system: there seems to be nothing preventing unrelated DSLs from clashing over definitions.)
      4. Load DSL classes like Job in their own class loader for each DSL run, rather than loading them from the plugin class loader. That would ensure isolation between DSL runs as well as prevent memory leaks. This seems like the best approach; a rough sketch follows below.
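
      A minimal sketch of option 4 (all names hypothetical, ignoring the plugin's own dependencies and the real Jenkins wiring): compile each DSL script with a throwaway GroovyClassLoader whose parent loads the DSL classes from the job-dsl jar itself, so that javaposse.jobdsl.dsl.Job and friends are reloaded per run and can be collected together with any mixins or metaClass changes the script applied to them. (Option 2 would amount to something like GroovySystem.metaClassRegistry.removeMetaClass(javaposse.jobdsl.dsl.Job) for each DSL class after each run.)

      // Sketch only: each DSL run gets its own class loaders, which are discarded afterwards.
      def runIsolated(String scriptText, File jobDslJar, ClassLoader jenkinsCoreLoader) {
        // Load javaposse.jobdsl.* from the jar in a per-run loader instead of the plugin class loader.
        def dslClasses = new URLClassLoader([jobDslJar.toURI().toURL()] as URL[], jenkinsCoreLoader)
        def scriptLoader = new GroovyClassLoader(dslClasses)
        try {
          new GroovyShell(scriptLoader).evaluate(scriptText)
        } finally {
          scriptLoader.clearCache()  // drop cached classes so nothing pins the loader
          scriptLoader.close()
          dslClasses.close()
        }
      }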

          [JENKINS-23762] Adding a mixin class to Job leaks memory from every DSL run

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Spilker
          Path:
          docs/Extending-the-DSL-from-your-Job-Scripts.md
          http://jenkins-ci.org/commit/job-dsl-plugin/5c7a57a0538102d389cde810abfdd7e223a08f73
          Log:
          added hint for JENKINS-23762


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Spilker
          Path:
          docs/Extending-the-DSL-from-your-Job-Scripts.md
          http://jenkins-ci.org/commit/job-dsl-plugin/717da240d763c48a74242aa7b733b1a7d79f3006
          Log:
          Merge pull request #252 from CoreMedia/memory-leak

          JENKINS-23762 Added Memory-Leak Warning

          Compare: https://github.com/jenkinsci/job-dsl-plugin/compare/2eddd4b4f4e5...717da240d763


          Nate Kumar added a comment -

          Thanks Daniel - quick question though, shouldn't the comment be that "using groovy mixins causes memory leaks" and "use monkey patching instead as a workaround"?


          Jesse Glick added a comment -

          Right, monkey-patching seems to not introduce a leak (though it could lead to race conditions and inter-script interference); it is mixins which are really problematic.


          Daniel Spilker added a comment -

          Thanks for the clarification, I fixed the text.

          Ben Dean added a comment -

          @daspilker, I am running into an OutOfMemoryError with Jenkins 2.18 and Job DSL plugin 1.50, and I suspect it is related to this issue. I'm not using mixins, but I am doing a lot of monkey-patching. Here's a path to a GC root for a Groovy closure that is almost certainly something from one of my Job DSL scripts:

          this     - value: groovy.lang.GroovyClassLoader$InnerLoader #12
           <- <classLoader>     - class: Script1$_run_closure1, value: groovy.lang.GroovyClassLoader$InnerLoader #12
            <- klazz     - class: org.codehaus.groovy.reflection.ClassInfo, value: Script1$_run_closure1 class Script1$_run_closure1
             <- value     - class: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$EntryWithValue, value: org.codehaus.groovy.reflection.ClassInfo #62
              <- [6656]     - class: java.lang.Object[], value: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$EntryWithValue #62
               <- table     - class: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment, value: java.lang.Object[] #39966
                <- [0]     - class: org.codehaus.groovy.util.AbstractConcurrentMapBase$Segment[], value: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment #16
                 <- segments     - class: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Map, value: org.codehaus.groovy.util.AbstractConcurrentMapBase$Segment[] #1
                  <- map     - class: org.codehaus.groovy.reflection.GroovyClassValuePreJava7, value: org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Map #1
                   <- globalClassValue     - class: org.codehaus.groovy.reflection.ClassInfo, value: org.codehaus.groovy.reflection.GroovyClassValuePreJava7 #1
                    <- <class> (Java frame)     - class: org.codehaus.groovy.reflection.ClassInfo, value: org.codehaus.groovy.reflection.ClassInfo class ClassInfo
          

          Most of the things I'm adding methods to are Context classes, so I could change those to use the newer DslExtensionPoint stuff rather than monkey-patching new methods onto those classes. In fact, you actually deleted the wiki page with the monkey-patching examples. But as far as I can tell, there's no way, other than monkey-patching, to add methods to things like JobParent. I need to do this so I can add new kinds of jobs, folders, and so on as top-level DSL methods.

          Do you have a recommendation for how to do that?

          Our DSL script looks something like this (more or less):

          JobParent.metaClass.dockerJob = { String name, Closure closure ->
              DockerJobContext context = new DockerJobContext()
              ContextHelper.executeInContext(closure, context)
          
              job("docker/$name") {
                  steps {
                      shell("docker pull $context.docker.baseImage")
                  }
                  // lots of other steps, wrappers, publishers, etc
              }
          }
          
          dockerJob('docker/foo') {
              docker {
                  baseImage "ruby:2.1-onbuild"
              }
          }
          

          That's obviously a much shortened version of what we're doing, but does that make sense? We basically have many very similar jobs and want to add methods to JobParent to generate them. I'm not sure how to do that without monkey-patching.

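          (For reference, the extension-point mechanism mentioned above looks roughly like the sketch below; the class, method, and ExampleBuilder names are hypothetical. Note that @DslExtensionMethod only targets Context classes such as StepContext, which is why it does not help for JobParent.)

          import hudson.Extension
          import javaposse.jobdsl.dsl.helpers.step.StepContext
          import javaposse.jobdsl.plugin.ContextExtensionPoint
          import javaposse.jobdsl.plugin.DslExtensionMethod

          @Extension(optional = true)
          class ExampleExtensionPoint extends ContextExtensionPoint {
              // Adds an exampleStep(...) method to the steps { } context of generated jobs.
              // ExampleBuilder is a hypothetical Builder contributed by the same plugin.
              @DslExtensionMethod(context = StepContext)
              Object exampleStep(String text) {
                  new ExampleBuilder(text)
              }
          }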

          Daniel Spilker added a comment - edited

          b_dean: Try to avoid metaprogramming and use other language features. For example, you can use a plain function instead of metaprogramming in your case:

          def dockerJob(String name, Closure closure) {
              DockerJobContext context = new DockerJobContext()
              ContextHelper.executeInContext(closure, context)
          
              job("docker/$name") {
                  steps {
                      shell("docker pull $context.docker.baseImage")
                  }
                  // lots of other steps, wrappers, publishers, etc
              }
          }
          
          dockerJob('docker/foo') {
              docker {
                  baseImage "ruby:2.1-onbuild"
              }
          }
          


          Ben Dean added a comment -

          And if it's a new kind of thing? As in something that cannot currently be created with any of the jobs, views, folders, etc.

          Specifically, we have methods to add the CloudBees Bitbucket Source Folder stuff. It's a new class that is a subclass of Folder, but in order to create it we do something like this:

          class BitbucketFolder extends Folder {
             // ... lots of stuff
          }
          
          JobParent.metaClass.bitbucketFolder = { String name, Closure closure ->
              processItem(name, BitbucketFolder, closure)
          }
          

          I don't think I can call processItem from a Job DSL script since it's protected.


          Daniel Spilker added a comment - edited

          In Groovy you can call anything...

          But processItem is an internal method and may change at any time without warning. Use a configure block to adapt the folder config, e.g.

          def bitbucketFolder(String name, Closure closure) {
            // ...
            folder(name) {
              configure { node ->
                node.name = 'org.acme.WhatEver'
                // ...
              }
            }
            // ...
          }
          
          bitbucketFolder('example') {
            // ...
          }
          


          Ben Dean added a comment -

          So if the CloudBees people wanted to make their cloudbees-folder plugin (or more specifically the branch-api plugin, which is where the jenkins.branch.OrganizationFolder class comes from) support the Job DSL, they wouldn't be able to add a new top-level kind of thing? It's very similar to the MultibranchWorkflowJob but has a few different elements in the XML (and a different root node). If it were going to be in the job-dsl plugin itself, it'd probably be a subclass of MultibranchWorkflowJob, or they'd both subclass some AbstractMultibranchWorkflowJob or whatever, but I don't see any way for plugin developers to add Job DSL methods for things like that.

          If that makes any sense.

          Also, I realize this is getting further away from memory-leaking monkey patches; I'm just trying to figure out the "correct" way to do this stuff.


            Assignee: Justin Ryan (quidryan)
            Reporter: Jesse Glick (jglick)
            Votes: 5
            Watchers: 11