JENKINS-33809

Find out what's wrong with the tests on the 2.0 branch requiring so much RAM

      Tests on the 2.0 branch require crazy amounts of RAM or that the VM be killed after every test. This appears to be a regression from 1.x.

      Running the full test suite (on 2.0, Maven 3.3.3 and Java 8):

      Surefire forked booter 1:

      Surefire forked booter 2 (jconsole disconnected for about 4 minutes in my local environment, but nothing remarkable happened during that gap):

      Maven launcher process:

      Running the full test suite (on 2.0, Maven 3.3.3 and Java 7 and -Xmx600m - lowered from the current value of -Xmx1g):

      Surefire forked booter 1:

      Surefire forked booter 2:

      Maven launcher process:

      Running the full test suite (on 1.x, Maven 3.3.3 and Java 7 and -Xmx256):

      Surefire forked booter 1:

      Surefire forked booter 2:

      Maven launcher process:

        Attachments:
        1. forked-booter-1.png (354 kB)
        2. forked-booter-2.png (331 kB)
        3. java7-1.x-maven-launcher.png (20 kB)
        4. java7-1.x-surefire-booter1.png (56 kB)
        5. java7-1.x-surefire-booter2.png (50 kB)
        6. java7-2.0-maven-launcher.png (21 kB)
        7. java7-2.0-surefire-booter1.png (70 kB)
        8. java7-2.0-surefire-booter2.png (75 kB)
        9. jenkins1-maven-launcher.png (253 kB)
        10. jenkins1-surefire-booter.png (261 kB)
        11. jenkins2-maven-launcher.png (230 kB)
        12. jenkins2-surefire-booter.png (259 kB)
        13. maven-laucher.png (152 kB)


          Antonio Muñiz added a comment -

          Description updated with reports of a full run on JDK 7. Memory usage does not show anything abnormal. The forked surefire booters use 400-600 MB at most, and the Maven launcher process does not go above 300 MB.

          I still think -Xmx1g is not really required.

          Maybe we could activate -XX:+HeapDumpOnOutOfMemoryError on both agent and master JVMs in https://ci.jenkins-ci.org so we get some hint the next time it happens.

          I don't see anything else we can do here.
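The heap dump flag mentioned in the comment above is normally passed on the JVM command line. As a minimal sketch, assuming a HotSpot-based JVM (where the flag is a manageable option), it can also be inspected and switched on from inside a running process. The class name `HeapDumpFlag` is hypothetical and not part of Jenkins.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Jenkins. Assumes a HotSpot-based JVM;
// other JVMs may not expose HotSpotDiagnosticMXBean at all.
final class HeapDumpFlag {
    private static final String FLAG = "HeapDumpOnOutOfMemoryError";

    private static HotSpotDiagnosticMXBean bean() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    /** Reports whether the running JVM would write a heap dump on OOM. */
    static boolean isEnabled() {
        return Boolean.parseBoolean(bean().getVMOption(FLAG).getValue());
    }

    /** Turns the flag on at runtime without restarting the JVM. */
    static void enable() {
        bean().setVMOption(FLAG, "true");
    }

    public static void main(String[] args) {
        System.out.println(FLAG + " before: " + isEnabled());
        enable();
        System.out.println(FLAG + " after: " + isEnabled());
    }
}
```

For long-lived CI agents, passing the flag at startup is still the simpler route; the runtime toggle is mostly useful for checking what an already running JVM is configured to do.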

          Antonio Muñiz added a comment -

          In the last run I set -Xmx600m (it was the full test suite on Jenkins 2.0 and JDK 7, which seems to be the problematic configuration).
          I didn't get any OOM (monitoring results added to the description).
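The monitoring results referred to above were collected with jconsole, which reads the JVM's standard management beans. As a rough sketch of the same kind of measurement taken from inside a process, the snippet below reads current and maximum heap size through `MemoryMXBean`. The class name `HeapSnapshot` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reads heap figures from the standard MemoryMXBean.
final class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    private static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** Heap currently in use, in megabytes. */
    static long usedMb() {
        return heap().getUsed() / MB;
    }

    /** Maximum heap (roughly the -Xmx value) in megabytes, or -1 if undefined. */
    static long maxMb() {
        long max = heap().getMax();
        return max < 0 ? -1 : max / MB;
    }

    public static void main(String[] args) {
        System.out.println("heap used: " + usedMb() + " MB, max: " + maxMb() + " MB");
    }
}
```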

          Daniel Beck added a comment -

          Behavior looks absolutely sane. Maybe something environment dependent?

          Antonio Muñiz added a comment - edited

          Today I was looking at OldDataMonitorTest for an unrelated reason and noticed this call: MemoryAssert.assertGC(ref). This call fills all the available memory until it triggers an OOM, which is then caught and the memory freed so that the JVM keeps working. This means the test will always consume all available memory at some point.

          In Java 7, an OOM can happen for two main reasons: the JVM has no more virtual memory available, or the OS cannot give the JVM more physical memory (even though the JVM has not reached its maximum allowed amount of virtual memory).

          When running the full test suite there are 3 Java processes running, currently with -Xmx800, -Xmx1g and -Xmx1g. Given that at some point one of the processes will consume 1 GB, perhaps the sum with the other two reaches the maximum amount of physical memory (it really depends on which tests are running concurrently at that point), which would explain why the OOM only happens sometimes.

          danielbeck How much physical memory does celery have?
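The memory-filling behavior described in the comment above can be illustrated with a small sketch. This is not the actual MemoryAssert.assertGC implementation: the class `GcProbe`, the method `isCollectable`, and the ballast cap are all hypothetical. Unlike the behavior described above, the sketch stops allocating once a caller-supplied limit is reached, so it does not have to exhaust the heap, and it then relies on `System.gc()` (a hint that HotSpot honors by default) for the final check.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "fill memory, then check a weak reference"
// idea. It is NOT the Jenkins MemoryAssert implementation.
final class GcProbe {
    private static final int CHUNK = 1024 * 1024; // 1 MB per allocation

    /**
     * Applies memory pressure (up to maxBallastMb megabytes, or until an
     * OutOfMemoryError), then reports whether the referent was collected.
     */
    static boolean isCollectable(WeakReference<?> ref, int maxBallastMb) {
        List<byte[]> ballast = new ArrayList<>();
        try {
            for (int i = 0; i < maxBallastMb && ref.get() != null; i++) {
                ballast.add(new byte[CHUNK]);
            }
        } catch (OutOfMemoryError e) {
            // Heap exhausted: this is the point the uncapped variant always reaches.
        }
        ballast.clear();  // release the ballast so the JVM can keep working
        System.gc();      // request a collection before the final check
        return ref.get() == null;
    }

    public static void main(String[] args) {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.out.println("collected: " + isCollectable(ref, 64));
    }
}
```

The cap is the design choice that matters here: a variant without it will, by construction, grow the heap to its -Xmx limit every time it runs, which is the effect the comment points out.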

          Daniel Beck added a comment -

          amuniz I have 16GB on my laptop that also had problems with tests. Celery has 8GB.

          Antonio Muñiz added a comment -

          danielbeck Ok, so my theory is... wrong
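The arithmetic behind dropping the theory can be sketched as follows. The snippet assumes that the "-Xmx800" quoted earlier means 800 MB (the unit is missing in the comment), and it only sums maximum heap sizes; real JVM processes also use non-heap memory, so this is a lower bound on the worst case rather than an exact footprint. The class name `HeapBudget` is hypothetical.

```java
// Hypothetical helper: sums -Xmx style heap limits in megabytes.
final class HeapBudget {

    /** Parses a size such as "800m" or "1g" into megabytes. */
    static long toMb(String size) {
        String s = size.trim().toLowerCase();
        char unit = s.charAt(s.length() - 1);
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (unit) {
            case 'g': return value * 1024;
            case 'm': return value;
            default: throw new IllegalArgumentException("unsupported unit in: " + size);
        }
    }

    /** Sum of all given heap limits, in megabytes. */
    static long totalMb(String... sizes) {
        long total = 0;
        for (String s : sizes) {
            total += toMb(s);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumption: "-Xmx800" in the thread means 800m.
        long heaps = totalMb("800m", "1g", "1g");   // 2848 MB
        long celery = toMb("8g");                   // 8192 MB
        System.out.println(heaps + " MB of max heap vs " + celery + " MB physical");
    }
}
```

Even with all three heaps at their limits, the total stays well under the 8 GB reported for celery, which is consistent with the theory being abandoned.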

          Daniel Beck added a comment -

          amuniz Do we need to reopen this?

          Antonio Muñiz added a comment -

          danielbeck I don't think so. I could not reproduce the OOM locally (after many full runs). The issues we saw yesterday in https://ci.jenkins-ci.org did not reproduce after cleaning up all zombie processes on the celery node.

          So there is not much more to do: just merge https://github.com/jenkinsci/jenkins/pull/2220 and wait for the OOM to happen again (or perhaps it will not happen anymore).

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Beck
          Path: test/pom.xml
          http://jenkins-ci.org/commit/jenkins/0d0314ddee72b227a770d294549d12ded3b3fbac
          Log: JENKINS-33809 Don't reuse forks

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Beck
          Path: test/pom.xml
          http://jenkins-ci.org/commit/jenkins/574f83a962a62021cc60b6af1e3de5c0f1f008b8
          Log: Merge pull request #2264 from daniel-beck/reuseForks=false

          JENKINS-33809 Don't reuse forks

          Compare: https://github.com/jenkinsci/jenkins/compare/4f51944cf1f9...574f83a962a6

            amuniz Antonio Muñiz
            danielbeck Daniel Beck