Jenkins / JENKINS-46514

ClassLoader leak in remoting with job-dsl plugin

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Component/s: job-dsl-plugin, remoting
    • Labels:
      None
    • Environment:
      Jenkins 2.67
      job-dsl 1.64

      Description

      We have the same issue as JENKINS-30832 and suspected a classloader leak, so we took some heap dumps. We found that after running the seed job (which processes N Groovy scripts) M times, N*M GroovyClassLoaders were retained. YourKit Java Profiler showed that each of them had loaded the classes (mostly closures) from one of our N Groovy scripts.

      We also traced what is retaining the classloaders and ended up at the unexportLog field of hudson.remoting.ExportTable. We did not fully understand why it was retaining the classloaders; see the attached screenshot (the majority of the LinkedList nodes were cut out). ul_rsp is one of the Groovy scripts processed, and the address 172.17.0.1 corresponds to the node the seed job was running on.

      As a workaround, hudson.remoting.ExportTable.unexportLogSize was set to 0 to prevent entries from being added to unexportLog. Checking the heap dump afterwards revealed that the same number of GroovyClassLoaders are present, but they are held via weak/soft references only, which makes them eligible for garbage collection.

      I should also mention that the classloaders were leaked even after we got rid of the mixins in the seed job.
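To illustrate why a size of 0 helps, here is a minimal sketch of a bounded diagnostic log that keeps strong references to recently released entries. It is not the actual hudson.remoting.ExportTable code: the class UnexportLogSketch, its Entry type, and the property name sketch.unexportLogSize are all hypothetical and exist only for this illustration. The point is that anything recorded in such a log stays strongly reachable (including its class and therefore its classloader) until it is pushed out, whereas a size of 0 records nothing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Illustrative only: a bounded log of released entries.
 * Hypothetical names throughout; not the Remoting implementation.
 */
class UnexportLogSketch {

    /** Hypothetical stand-in for a released export-table entry. */
    static final class Entry {
        final Object exported;      // strongly references the object (and so its classloader)
        final Throwable releasedAt; // stack trace kept for diagnostics

        Entry(Object exported) {
            this.exported = exported;
            this.releasedAt = new Throwable("released here");
        }
    }

    private final int maxSize;
    private final Deque<Entry> log = new ArrayDeque<>();

    UnexportLogSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Records a released object, evicting the oldest entries beyond maxSize. */
    void unexport(Object o) {
        if (maxSize <= 0) {
            return; // size 0: nothing is recorded, so nothing is retained
        }
        log.addLast(new Entry(o));
        while (log.size() > maxSize) {
            log.removeFirst();
        }
    }

    /** Number of entries (and thus objects) currently held strongly by the log. */
    int retained() {
        return log.size();
    }

    public static void main(String[] args) {
        // Hypothetical property name; the default of 16 is arbitrary for this sketch.
        int size = Integer.getInteger("sketch.unexportLogSize", 16);
        UnexportLogSketch sketch = new UnexportLogSketch(size);
        for (int i = 0; i < 100; i++) {
            sketch.unexport(new Object());
        }
        System.out.println("entries still strongly referenced: " + sketch.retained());
    }
}
```

Running the sketch with -Dsketch.unexportLogSize=0 leaves the log empty, which mirrors the effect described above: the objects are no longer pinned by the log and can be collected once nothing else holds them strongly.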

          Activity

          apetres Petres Andras created issue -
          daspilker Daniel Spilker made changes -
          Field Original Value New Value
          Assignee Daniel Spilker [ daspilker ] Oleg Nenashev [ oleg_nenashev ]
          oleg_nenashev Oleg Nenashev added a comment -

          It is not exactly clear to me why the object with the custom classloader is being sent over the channel. Likely you do remote calls within your Groovy script. I do not see the scripts attached, so I cannot say for sure. While Remoting behaves correctly by creating a "createdAt" object with the Groovy classloader in such a case, it is not really desired here.

          I also wonder why it comes from the Channel Pinger thread. So far I have no good explanation for that. I need your Java settings in order to check if there is a custom classloader defined there.

          Unexport Log Size is a valid workaround.

          IMHO you first need to investigate your scripts and check if they invoke remote calls directly. If not, it is something to be investigated in the JobDSL engine or its plugin implementations.

           

          daspilker Daniel Spilker added a comment -

          Oleg Nenashev The Job DSL engine uses remoting to access the DSL script file in the workspace through FilePath. The class in question is ScriptRequestGenerator. Unfortunately the class is overly complex because the Groovy class loader is a URLClassLoader and I needed to create a special URL handler for files in the workspace...
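For readers unfamiliar with the technique mentioned here, the following is a rough sketch of a custom URL handler, assuming nothing about how ScriptRequestGenerator actually does it. The class WorkspaceUrlSketch, the "workspace" protocol name, and the in-memory map standing in for remote workspace files are all hypothetical; a real implementation would presumably read through FilePath instead of a map.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Illustrative only: URLs whose content comes from a custom handler. */
class WorkspaceUrlSketch {

    /** Serves URL content from a map that stands in for workspace files. */
    static final class MapHandler extends URLStreamHandler {
        private final Map<String, String> files;

        MapHandler(Map<String, String> files) {
            this.files = files;
        }

        @Override
        protected URLConnection openConnection(URL url) throws IOException {
            final String content = files.get(url.getPath());
            if (content == null) {
                throw new FileNotFoundException(url.toString());
            }
            return new URLConnection(url) {
                @Override
                public void connect() {
                    // nothing to connect to in this in-memory sketch
                }

                @Override
                public InputStream getInputStream() {
                    return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Builds a URL bound to the handler (hypothetical "workspace" protocol). */
    static URL workspaceUrl(String path, Map<String, String> files) {
        try {
            // This constructor is deprecated in recent JDKs but still available.
            return new URL("workspace", "", -1, path, new MapHandler(files));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole content behind the URL as UTF-8 text. */
    static String read(URL url) {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A URL built this way can be handed to anything that consumes URLs, including a URLClassLoader, and the handler decides where the bytes actually come from.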

          oleg_nenashev Oleg Nenashev added a comment -

          It could be solved via a JOB_NAME@tmp folder on Master, like Jenkins Pipeline does for Shared Libraries and load() operations, but I am not exactly sure it is a good practice.

          daspilker Daniel Spilker added a comment -

          Oleg Nenashev Job DSL does something similar for JARs in ScriptRequestGenerator.

          But it can also access any file from the workspace. That's another use of remoting: JenkinsJobManagement.

          But IMHO that should not lead to the GroovyClassLoader being sent over remoting. The classes in question should be loaded by the plugin class loader or above. Can we debug that somehow? E.g. can we see which objects are transferred?

          apetres Petres Andras added a comment -

          Oleg Nenashev to me it seems that ExportTable entries may help, since it stores data about objects sent over remote, right?

          oleg_nenashev Oleg Nenashev added a comment -

          > Can we debug that somehow? E.g. can we see which objects are transferred?

          > to me it seems that ExportTable entries may help, since it stores data about objects sent over remote, right?

          It is. You can dump the Export Table at any moment, but some extra export table logging (and even "exportTable.log") would be useful.

           

          apetres Petres Andras made changes -
          Attachment ExportTable.dump.txt [ 39574 ]
          apetres Petres Andras added a comment -

          I did the dump using ExportTable.dump() and attached it.
          What other kind of "extra export table logging" would you need?

          daspilker Daniel Spilker added a comment -

          Hm, the dump does not contain any Groovy or Job DSL classes. I'm not sure how to debug this.

          oleg_nenashev Oleg Nenashev added a comment -

          If this is a real case, I am pretty sure it will break with JEP-200 on Jenkins 2.102+
          Petres Andras have you already upgraded just in case?

          apetres Petres Andras added a comment -

          I upgraded to 2.106 and did some dumps, no ExportTable or remoting-related stuff is retaining the classloaders anymore.

          daspilker Daniel Spilker added a comment -

          Petres Andras Can we close the issue?

          apetres Petres Andras added a comment -

          Sure!

          daspilker Daniel Spilker made changes -
          Assignee Oleg Nenashev [ oleg_nenashev ] Daniel Spilker [ daspilker ]
          Resolution Cannot Reproduce [ 5 ]
          Status Open [ 1 ] Resolved [ 5 ]
          daspilker Daniel Spilker made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            Assignee:
            daspilker Daniel Spilker
            Reporter:
            apetres Petres Andras
            Votes:
            0
            Watchers:
            4
