• Type: Bug
    • Resolution: Incomplete
    • Priority: Critical
    • Component: core
    • Linux server with 24 CPUs and 64GB RAM
      Jenkins version LTS 1.509.4/1.532.1 on Jetty
      Memory allocated for the Jenkins/Jetty process: 42GB (typical startup flags are sketched below)
      Environment: Jenkins with 600 highly active jobs + 40 slave machines (Linux and Windows)
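      For reference, a heap of this size is normally configured through the JVM startup options of the Jetty/Jenkins process. The flags below are only a sketch of a typical setup for HotSpot JVMs of this era, not the exact flags of this installation (the variable name varies by init script, and the log path is illustrative):

          JAVA_OPTIONS="-Xmx42g -Xms42g \
                        -verbose:gc -Xloggc:/var/log/jenkins/gc.log \
                        -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

      GC logging of this kind is what produces a log like the attached gc.log.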

      After upgrading Jenkins from LTS 1.509.3 to LTS 1.509.4, I noticed that over time (about 24 hours) Jenkins becomes very slow.
      It turns out that Jenkins (running under Jetty) slowly "eats" the server memory. It takes about 24 hours to consume all the memory allocated to Jenkins (42GB). See the attached snapshots for examples.

      TEST #1 on LTS 1.509.4:
      1. Machine with Jetty up after restart
      2. Jenkins used - after 1 hour: 22GB
      3. Jenkins used - after 12 hours: 27GB
      4. Jenkins used - after 20 hours: 35GB -> Memory leaks between 10:00-10:20; as you can see, even after a GC Java still considers the memory in use and fails to clean it all up as it should.
      5. Jenkins used - after 23 hours: 39GB -> Very slow response and the heap is almost 100% full (a command-line way to watch the same trend is sketched after this list).
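      For reference, the same heap trend can be watched from the command line with jstat (a sketch, assuming a standard HotSpot JDK; $JENKINS_PID is assumed to hold the PID of the Jenkins/Jetty process):

          # print heap-region utilization (%) and GC counts every 60 seconds
          jstat -gcutil $JENKINS_PID 60000

      An old-generation column (O) that keeps growing and never drops after full GCs is the same symptom shown in the attached snapshots.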

      TEST #2 on LTS 1.509.4:
      I tried triggering a manual GC; it doesn't help!
      (See attached file: "Monitor_Memory_Over_Time_Manual_GC". A way to trigger such a GC externally is sketched below.)
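      For reference, a full GC can be requested from outside the JVM (a sketch, assuming a JDK 7+ HotSpot installation; $JENKINS_PID is the Jenkins process ID):

          # ask the JVM to run System.gc()
          jcmd $JENKINS_PID GC.run

          # alternatively, a live-object histogram forces a full GC as a side effect
          jmap -histo:live $JENKINS_PID > /dev/null

      If old-generation usage stays high even after this, the retained objects are still reachable, which points to a genuine leak rather than lazy collection.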

      TEST #3 on LTS 1.509.3:
      Unfortunately I downgraded to LTS 1.509.3 because of the memory leak; for me this is a blocker issue!

      Please note that on LTS 1.509.3 Jenkins works stably even under heavy load, without any memory leak (see attached files: "Good_GC_A1.509.3" and "Good_GC_B1.509.3"). Unfortunately there is a BIG unsolved bug in that version: I can't rename jobs (a deadlock, which is fixed in the next versions LTS 1.509.4/1.532.1 that I can't use because of the memory leak).

      TEST #4 with LTS 1.532.1:
      Same issue! Jenkins gets stuck at 100% memory usage after only 12 hours!

      Thank You,
      Ronen.

        1. Good_GC_B1.509.3.jpg (322 kB)
        2. Good_GC_A1.509.3.JPG (268 kB)
        3. gc.log (15 kB)
        4. Monitor_Memory_Over_Time_Manual_GC.jpg (572 kB)
        5. Monitor_After_12_Hours.jpg (224 kB)
        6. Monitor_After_1_Hour.jpg (238 kB)
        7. Monitor_After_20_Hours.jpg (321 kB)
        8. Monitor_After_23_Hours.JPG (253 kB)

          [JENKINS-20620] Memory Leak on Jenkins LTS 1.509.4/1.532.1

          Ely Fuchs added a comment -

          The same problem happens with 1.540 and 1.541 on a Red Hat Linux platform.


          Ronen Peleg added a comment -

          The same problem happens with LTS 1.532.1!


          Tomer Pengo added a comment -

          We see the same memory leak on our system with LTS 1.509.4.
          After 8 hours Jenkins becomes very slow and then gets stuck; I have to restart Jenkins every day!


          vjuranek added a comment -

          Hi,
          any chance to see a heap dump (preferably a small one), or, as a first hint (as I probably won't be able to parse a 40GB heap dump on my machine anyway), the output of

          jmap -histo:live $JENKINS_PID | head -n 30

          ?
          Thanks
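          For reference, a binary heap dump of a live process can be captured with jmap (a sketch, assuming a HotSpot JDK; the output path is illustrative):

              jmap -dump:live,format=b,file=/tmp/jenkins-heap.hprof $JENKINS_PID

          The resulting .hprof file can be opened in Eclipse MAT or VisualVM; with a 42GB heap the dump will be of a similar size, so the histogram above is the lighter-weight first step.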


          Ronen Peleg added a comment -

          Sorry "vjuranek" already returns to the old and stable LTS version: 1.509.3 - The version LTS 1.509.3 come with a lots known bugs but with free of memory leaks.

          Any Jenkins system with more than 50 jobs can see very fast the memory leaks on versions LTS 1.509.4 or 1.532.1, I don't understand why Jenkins owners ignore this very blocked issue... maybe Jenkins support team need a Load Software to find their own bugs and issues.


          vjuranek added a comment -

          It seems that JoJ [1] runs without any problems, so this memory leak may appear only in some specific configurations (or was somehow fixed in 1.539?). Did you only upgrade Jenkins core, or did you also upgrade any plugins (so we can exclude a memory leak in the plugins)? By a system with 50 jobs, do you mean 50 jobs present, or 50 jobs running in parallel most of the time?
          Thanks

          Btw: there is no Jenkins support team; it's a community project based on volunteers. If you want a support team, you should check CloudBees or some other provider that offers support for Jenkins.

          [1] https://ci.jenkins-ci.org/


          Ronen Peleg added a comment -

          Thank you for your help.

          First, with all respect to JoJ, it doesn't have the system load that we are facing here, and as you can see it runs a snapshot version. Anyway, I didn't try the release version 1.539... BTW: in order to find memory leaks like this you need load-testing software such as LoadRunner.

          I tried the new LTS versions both with and without updating the plugins; it's still the same issue, the same bug, and as I wrote, this bug doesn't happen on the previous version at all.

          Of course it's a community, open-source project and all of that... but don't forget there is also an Enterprise edition that includes a "Jenkins support team", and I just wonder why they are not solving this super-critical issue, but I guess that's another story...


          Ronen Peleg added a comment -

          Update: I checked release version 1.539 (the one running on JoJ). This version is free of the memory leak, but it suffers from other bugs, such as problems when using multi-configuration projects (Matrix jobs) together with builds of a concurrently executable job.


          Ronen Peleg added a comment -

          Update: I also checked the 1.54X versions (1.541, 1.542, 1.543, 1.544), and the memory leak is present on all of them as well!


          Oleg Nenashev added a comment -

          Could you collect a heap dump or any other memory statistics of your system?
          We use 1.509.4 with quite a similar configuration, but we have not experienced such memory leaks (the uptime is close to 3 months). There are many other issues, but it seems that the memory is OK...


          Ronen Peleg added a comment -

          Thanks Oleg.

          I can't dump it because it's too big (40GB), and for some reason jmap is unable to connect to the Jenkins process. Anyway, I downgraded the version to 1.537...

          Oleg, if you have a system with 600 highly active jobs + 40 slave machines online and you don't have this issue, it's very strange, because I tested these versions on several different machines and each time the problem reproduced (after 12-24 hours).
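          For what it's worth, jmap usually fails to attach when it is run as a different user than the target JVM. A sketch of the usual workaround (assuming the Jenkins process runs under a dedicated "jenkins" user; the user name is an assumption):

              # run the JDK tools as the same user that owns the Jenkins process
              sudo -u jenkins jmap -histo:live $JENKINS_PID | head -n 30

              # jcmd (JDK 7+) is often a more reliable way to attach than jmap
              sudo -u jenkins jcmd $JENKINS_PID GC.class_histogram | head -n 30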


          Oleg Nenashev added a comment -

          Probably we have different job contents/plugins (e.g. we don't use the Maven plugin; xUnit components are not popular here due to integration with external systems).
          Could you provide a list of your plugins? Since we know which versions are stable/unstable, we can try to find an interoperability issue if one exists.


          Ronen Peleg added a comment -

          Oleg, I sent you an email with our Jenkins plugin list. Thank you.


          Nickolay Rumyantsev added a comment -

          BTW, Ronen, do you use System Groovy script build steps in your jobs?


          Ronen Peleg added a comment -

          Hi Nickolay, yes, we have Groovy scripts in our jobs.


          Ronen Peleg added a comment -

          Update:
          The solution was to delete some slave machines from the Jenkins nodes.
          It turns out that Jenkins can't handle more than 100 slave machines.
          Currently (after cleanup) we have 70 slave machines and no memory leak!

          BTW: the memory leak occurs only on a Jenkins master running on a Linux OS with more than 100 slave machines; on Windows this issue doesn't exist!


          Oleg Nenashev added a comment -

          I've tested Jenkins 1.509.4 (patched with remoting-2.36) on RHEL 6.4 with about 150 slaves.
          There's no memory leak after 1 week. The test installation just builds several Jenkins plugins, so there's no extreme load.

          Probably the error is in the communication layer. I'll try the remoting version from 1.532.1 with a bigger workload.


          Kohsuke Kawaguchi added a comment -

          We need more information to be able to solve problems like this. Please see https://wiki.jenkins-ci.org/display/JENKINS/I%27m+getting+OutOfMemoryError for how to get the details we need to be able to work on problems like this.

          I'm not doubting that you are seeing the problem, and for that I am sorry. Please get us the details we need so that we can fix the problem.

          If you cannot post a heap dump, please get at least the histogram summary.
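          For reference, a heap dump on out-of-memory can be produced automatically with standard HotSpot options, and a class histogram is small enough to attach to the issue (a sketch; the paths and file names are illustrative):

              # JVM startup flags: write a heap dump when the JVM runs out of memory
              -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/jenkins/

              # or capture just the histogram summary from the running process
              jmap -histo:live $JENKINS_PID > jenkins-histogram.txt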


          Ronen Peleg added a comment - edited

          @Oleg Nenashev, did you try it with 1200 active jobs? Anyway, this is what solved my problem. I guess the issue appears with 100+ slave machines connected to a heavily loaded Jenkins.

          @Kohsuke Kawaguchi, because I have 64GB RAM I can't do it; I can't save a 64GB dump on my HDD, and anyway my problem is already solved, so it's safe to close it.


            Assignee: Oleg Nenashev (oleg_nenashev)
            Reporter: Ronen Peleg (ronenpg)
            Votes: 21
            Watchers: 15
