
Use a more sensible default GC algorithm for Jenkins

    • Improvement
    • Resolution: Unresolved
    • Minor
    • packaging
    • None

      Currently we run with the default Java GC algorithm (ParallelGC), which is known to produce very long pauses when full GC cycles run.

      ConcurrentMarkSweep or G1 would greatly reduce the maximum pause time and improve user experience and stability. The rationale is similar to why Oracle is making G1 the default in Java 9.

      This is especially important for large heaps, where multi-second full GC pauses become frequent and painful.

          [JENKINS-38802] Use a more sensible default GC algorithm for Jenkins

          Sam Van Oort added a comment -

          batmat has commented that it is worth adding GC logging by default when we do this. I think that is an excellent idea.

          Alexander A added a comment -

          My two cents:

          • ParallelGC is optimised for good throughput. It assumes that you can clean everything in YoungGen before a lot of objects reach OldGen. If that's not the case, it may make sense to increase the young generation (YG).
          • If you have a big heap (~10 GB or more), you need to find JVM parameters tuned for your own installation anyway.
          • It would be quite interesting to read an article on how you tested the different GCs.

          Maybe it's better to create a Confluence page with some tips and how-tos for advanced users.

          Sam Van Oort added a comment -

          batmat jimilian and company – I'm building the list of proposed settings here: https://gist.github.com/svanoort/66a766ea68781140b108f465be45ff00

          It draws on GC logs from real-world Jenkins users running large installations, and it is still being refined a bit.

          Comments:

          > ParallelGC is optimised for good throughput, it assumes that you can clean everything in YoungGen before a lot of objects would reach OldGen

          This assumption holds – I've got GC logs and analysis behind this across a variety of long-running systems. Young-gen GC clears out the vast majority of garbage; the stable resident set and long-lived object set are actually pretty small.

          The problem here is that with ParallelGC and >2 GB heaps the pauses get unmanageable.

          Sam Van Oort added a comment - - edited

          I'm in the process of vetting modified settings against real-world uses to finalize the recommendations here.

          It may be worth enabling some of the low-pause GC settings in the main defaults (e.g. ExplicitGCInvokesConcurrentAndUnloadsClasses and ParallelRefProcEnabled) even when the heap is small, provided they won't interfere with the default ParallelGC on standard instances with <2 GB heaps. TBH, most users should probably be running with less heap, since the Jenkins resident set is pretty small.

          I'll probably split these up in the packages like so:

          1. Base defaults (GC logging, basic settings)
          2. G1GC defaults, declared as an env variable
          3. CMS defaults, declared as an env variable
          4. Java defaults set to something like "$BASE_DEFAULTS"; users on high-mem systems can then add "$G1_DEFAULTS" or "$CMS_DEFAULTS" to this (and customize either).
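
A minimal sketch of how such a split could be composed, written in Python purely for illustration. The group names mirror the variables listed above. The specific flags are assumptions drawn from standard Java 8-era HotSpot options, not the contents of the gist, and the log path is hypothetical.

```python
# Illustrative option groups only; the flag selection is an assumption,
# not the finalized recommendation discussed in this issue.
BASE_DEFAULTS = [
    "-verbose:gc",
    "-Xloggc:/var/log/jenkins/gc.log",  # hypothetical log location
    "-XX:+PrintGCDetails",
    "-XX:+PrintGCDateStamps",
]

G1_DEFAULTS = [
    "-XX:+UseG1GC",
    "-XX:+ParallelRefProcEnabled",
]

CMS_DEFAULTS = [
    "-XX:+UseConcMarkSweepGC",
    "-XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses",
    "-XX:+ParallelRefProcEnabled",
]


def java_defaults(collector=None):
    """Return a JVM options string: base defaults plus one optional collector group.

    collector is None (keep the JVM's default collector), "g1", or "cms".
    """
    groups = {None: [], "g1": G1_DEFAULTS, "cms": CMS_DEFAULTS}
    if collector not in groups:
        raise ValueError("unknown collector: %r" % (collector,))
    return " ".join(BASE_DEFAULTS + groups[collector])


if __name__ == "__main__":
    # A high-memory installation opting in to the G1 group.
    print(java_defaults("g1"))
```

Keeping the collector groups separate from the base group means a small instance still gets GC logging without changing collector, while a high-memory instance opts in to exactly one low-pause group.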

          Sam Van Oort added a comment -

          In local testing, G1 seems to work rather well even on fairly small heap sizes. I think the more critical aspect is having a decent number of cores to throw at the problem. So, I think my base rule will be "G1 if Java > 7 && cores > 2"
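
The base rule above can be sketched as a small check, shown here in Python for illustration only. The function names are hypothetical, and the core count comes from `os.cpu_count()`, which may differ from what the JVM itself reports (for example inside a container).

```python
import os


def parse_java_major(spec_version):
    """Map a java.specification.version string to a major version.

    Handles both the legacy form ("1.8" means Java 8) and the newer form ("9", "11").
    """
    parts = spec_version.split(".")
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


def should_use_g1(java_major, cores):
    """Apply the rule from the comment: G1 if Java > 7 and cores > 2."""
    return java_major > 7 and cores > 2


def detected_cores():
    """Best-effort core count for the current host; falls back to 1 if unknown."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    # "1.8" is an example version string, not one read from a real JVM.
    print(should_use_g1(parse_java_major("1.8"), detected_cores()))
```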

          Baptiste Mathus added a comment -

          I guess that makes sense. G1 for Java >= 8 seems logical to me in any case. I really don't think anyone expects a Jenkins master UI to freeze for many seconds, and who would prefer that to losing a wee bit of throughput?

          Also, svanoort, I remember there was a PR about this in addition to the gist above. Can you add the link here? Thanks!

            Assignee: Unassigned
            Reporter: Sam Van Oort (svanoort)
            Votes: 2
            Watchers: 7
