JENKINS-22170: Jenkins process hangs using almost all CPU, much memory and UI unresponsive

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: core
    • None
    • Jenkins 1.554, Ubuntu 12.04.4 LTS 64 bits, 16 GB memory, JDK: OpenJDK Runtime Environment (IcedTea6 1.13.1) (6b30-1.13.1-1ubuntu2~0.12.04.1), OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode), installed using package supplied by jenkins-ci.org

      (I picked a component that seemed relevant, but the majority of them really are not descriptive at all, yet it is required I select one.)

      We are currently running Jenkins 1.554 and notice that, on a daily basis, the server starts to hang. The web UI becomes unresponsive and top shows the process in this state:

       PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
      5964 jenkins   20   0 8414m 4.9g  14m S  358 31.1   2590:29 java
      

      I created a thread dump with jstack -F; it is attached to this issue.
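
A forced dump can be long, so a first pass often just counts threads by state. The following is a minimal Python sketch of that idea, not an analysis of the attached dump. It assumes the `Thread <id>: (state = ...)` header lines that `jstack -F` typically prints; the sample text and the function name `count_thread_states` are made up for illustration.

```python
import re
from collections import Counter

# Header line format assumed for a forced dump, for example:
#   Thread 6012: (state = BLOCKED)
THREAD_HEADER = re.compile(r"^Thread (\d+): \(state = (\w+)\)")


def count_thread_states(dump_text):
    """Return a Counter mapping thread state to the number of threads in it."""
    states = Counter()
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line.strip())
        if match:
            states[match.group(2)] += 1
    return states


# Hypothetical excerpt, NOT taken from the dump attached to this issue.
sample = """\
Thread 6012: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)

Thread 6013: (state = IN_NATIVE)
 - java.net.SocketInputStream.read(byte[]) @bci=0 (Interpreted frame)

Thread 6014: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
"""

if __name__ == "__main__":
    # Print the states from most to least common.
    for state, count in count_thread_states(sample).most_common():
        print(state, count)
```

A large number of threads stuck in one state, or many stacks ending in the same frame, is usually the place to start reading the full dump.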

      The only recourse I have is to forcibly kill -9 the PID and restart Jenkins. 1.551 and I think 1.552 did not have this problem. I am not sure if 1.553 already exhibited this problem or if it only started with 1.554.

      The jobs it is building are mostly Java projects with a POM file structure, built using the Maven and Mercurial plugins. Artifacts get uploaded to a Nexus repository. The disk has more than enough free space (tens of GB at the minimum).

      Normal Jenkins process on this machine:

       PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
     30082 jenkins   20   0 6719m 1.3g  22m S     0  8.5   4:48.78 java

      Memory (16 GB):

      Mem: 16435924k total, 15182196k used, 1253728k free, 1466652k buffers


          Michiel Hendriks added a comment (edited)

          We recently updated to 1.555 and are also suffering from a severe memory leak. Within a day it burned through 4 GB of heap.

          Analyzing a heap dump with MAT gives me this info:

          Class Name                                                        |   Objects | Shallow Heap |    Retained Heap
          ----------------------------------------------------------------------------------------------------------------
          sun.security.ssl.SSLSocketImpl                                    |     9,662 |    1,468,624 | >= 1,173,575,432
          java.lang.ref.Finalizer                                           | 4,384,964 |  175,398,560 |   >= 702,958,432
          java.lang.String                                                  | 4,933,784 |  157,881,088 |   >= 690,034,640
          sun.security.ssl.SSLContextImpl                                   |     9,426 |      377,040 |   >= 690,027,624
          org.tmatesoft.svn.core.internal.wc.DefaultSVNAuthenticationManager|     7,424 |      475,136 |   >= 670,023,712
          java.lang.String[]                                                |    85,446 |   20,335,312 |   >= 651,775,152
          org.tmatesoft.svn.core.internal.wc.SVNCompositeConfigFile         |    14,850 |      356,400 |   >= 651,686,160
          org.tmatesoft.svn.core.internal.wc.SVNConfigFile                  |    29,700 |      950,400 |   >= 651,329,760
          byte[]                                                            |   312,123 |  618,359,960 |   >= 618,359,960
          char[]                                                            | 4,967,656 |  541,210,488 |   >= 541,210,488
          ----------------------------------------------------------------------------------------------------------------
          


          Michiel Hendriks added a comment

          I've now reverted back to 1.553 to see if that one can run for longer than a day.

          Michiel Hendriks added a comment (edited)

          So far 1.553 is running without issues. I've also started logging Old Gen usage, and it has barely exceeded 10% (-Xmx4G).
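
The comment does not say how Old Gen usage was logged. One possible approach, sketched here under that caveat, is to poll `jstat -gcutil <pid>` and record the `O` column (old space utilisation, in percent). The function names, the sample output values and the polling interval are illustrative assumptions.

```python
import subprocess
import time


def old_gen_percent(jstat_output):
    """Extract the Old Gen utilisation (column 'O') from `jstat -gcutil` output."""
    rows = [line.split() for line in jstat_output.splitlines() if line.strip()]
    header, values = rows[0], rows[-1]
    # Some locales print a decimal comma; normalise before converting.
    return float(values[header.index("O")].replace(",", "."))


def read_jstat(pid):
    """Run `jstat -gcutil <pid>` once and return its text output."""
    result = subprocess.run(
        ["jstat", "-gcutil", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def log_old_gen(pid, interval_seconds=60):
    """Print a timestamped Old Gen percentage until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, old_gen_percent(read_jstat(pid)))
        time.sleep(interval_seconds)


# Hypothetical jstat output with made-up values, used only to show the parsing.
SAMPLE = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  63.20   9.87  58.31    412    6.210     3    1.020    7.230
"""

if __name__ == "__main__":
    print(old_gen_percent(SAMPLE))
```

Looking the column up by its header name rather than by position keeps the parser working on JVM versions whose `jstat` prints a different set of columns.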

          Michiel Hendriks added a comment

          This memory leak is caused by a bug in OpenJDK 1.6.0 build 30: https://java.net/jira/browse/OPENJDK6-29
          Make sure you upgrade to build 31 or later.
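
To check a machine against this advice, the build number can be read from the package version string, such as the `6b30-1.13.1-1ubuntu2~0.12.04.1` reported in the environment above. The following is a minimal sketch; the function names are hypothetical, and it flags only build 30 because that is the only build the comment names as affected.

```python
import re


def openjdk6_build(version_string):
    """Return the OpenJDK 6 build number from a string like '6b30-1.13.1-...'."""
    match = re.search(r"\b6b(\d+)", version_string)
    return int(match.group(1)) if match else None


def has_known_leak(version_string):
    """True only for build 30, the build named in the comment above.

    Earlier builds are not covered by the comment, so they are not flagged.
    """
    return openjdk6_build(version_string) == 30


if __name__ == "__main__":
    reported = "6b30-1.13.1-1ubuntu2~0.12.04.1"  # from the issue's environment field
    print(openjdk6_build(reported), has_known_leak(reported))
```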

            Assignee: Unassigned
            Reporter: Jeroen Ruigrok van der Werven (asmodai)
            Votes: 0
            Watchers: 2
