Jenkins crashing due to out of memory when rebuilding jobs

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • build-pipeline-plugin, core
    • None
    • linux

      After an upgrade to version 1.546, Jenkins became very unstable, crashing 2-4 times per day. From what I can tell, the cause is missing junitResult.xml files when Jenkins tries to load the test trend.

      Attached is a snippet from the log showing the FileNotFoundException for the junitResult.xml file, followed by a StackOverflowError and then an OutOfMemoryError.

      Also attached is a screenshot from a heap analysis showing that the request handler thread building the test trend is holding a considerable amount of the heap. Note that this heap analysis is from a different crash than the log snippet. I did not analyze the heap from the crash shown in the log, but I am sure it would show the test trend for the same job that threw the StackOverflowError.
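
      To check whether a job is affected by missing junitResult.xml files, something like the following sketch may help. It assumes the common on-disk layout in which each build has its own directory under the job's builds directory and the JUnit results sit in that directory as junitResult.xml; the class and method names are made up for illustration. A missing file is only suspicious for builds that were expected to publish test results.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class MissingJunitResults {

    /** Returns the build directories under buildsDir that have no junitResult.xml. */
    static List<Path> findBuildsMissingResults(Path buildsDir) throws IOException {
        List<Path> missing = new ArrayList<>();
        try (DirectoryStream<Path> builds = Files.newDirectoryStream(buildsDir)) {
            for (Path build : builds) {
                // Skip plain files and symlinks (some Jenkins versions keep
                // symlink aliases next to the real build directories).
                if (!Files.isDirectory(build, LinkOption.NOFOLLOW_LINKS)) {
                    continue;
                }
                if (!Files.exists(build.resolve("junitResult.xml"))) {
                    missing.add(build);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be a path such as $JENKINS_HOME/jobs/<job>/builds
        for (Path build : findBuildsMissingResults(Paths.get(args[0]))) {
            System.out.println("no junitResult.xml in " + build);
        }
    }
}
```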

          [JENKINS-21422] Jenkins crashing due to out of memory when rebuilding jobs

          Clint Parham added a comment -

          I'm also running into "OutOfMemoryError: Java heap space" errors using the Build Pipeline View. Jenkins was fine running our jobs using ~130MB of heap. But since adding the Build Pipeline plugin we see heap memory spike to over 1.5GB when opening a single job page belonging to a pipeline. As soon as we disabled the Pipeline plugin, we could open the same job page and saw no increase in heap usage.

          Running Jenkins 1.602 and Build Pipeline 1.4.7.

          Partial stacktrace:
          Jun 2, 2015 1:39:25 PM org.eclipse.jetty.util.log.JavaUtilLog warn
          WARNING: Error while serving http://192.168.2.85:8081/job/Pipeline_MAT_Build/test/trendMap
          java.lang.reflect.InvocationTargetException
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
          at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
          ...
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
          at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)
          Caused by: java.lang.OutOfMemoryError: Java heap space

          Daniel Beck added a comment -

          My guess would be the excessive serialization in Build Pipeline's build cause. The affected job's build.xml files would be interesting.
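
          If excessive serialization is the cause, unusually large build.xml files would be one visible symptom. A rough sketch for finding the largest ones in a job follows; it assumes each build directory under the job's builds directory holds its own build.xml, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BuildXmlSizes {

    /** Size of a file in bytes, with IOException wrapped so it can be used in a comparator. */
    static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns up to 'limit' build.xml files under buildsDir, largest first. */
    static List<Path> largestBuildXmlFiles(Path buildsDir, int limit) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> builds = Files.newDirectoryStream(buildsDir)) {
            for (Path build : builds) {
                Path xml = build.resolve("build.xml");
                // Ignore symlinked aliases so each build is counted once.
                if (Files.isDirectory(build, LinkOption.NOFOLLOW_LINKS) && Files.isRegularFile(xml)) {
                    files.add(xml);
                }
            }
        }
        files.sort(Comparator.<Path>comparingLong(BuildXmlSizes::sizeOf).reversed());
        return new ArrayList<>(files.subList(0, Math.min(limit, files.size())));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be a path such as $JENKINS_HOME/jobs/<job>/builds
        for (Path xml : largestBuildXmlFiles(Paths.get(args[0]), 10)) {
            System.out.println(sizeOf(xml) + " bytes  " + xml);
        }
    }
}
```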

          Henrik Kirk added a comment -

          Have the same problem. Attached is the build.xml file. Hope it helps in some way.

          <project>
            <actions/>
            <description/>
            <keepDependencies>false</keepDependencies>
            <properties>
              <com.chikli.hudson.plugin.naginator.NaginatorOptOutProperty plugin="naginator@1.16">
                <optOut>false</optOut>
              </com.chikli.hudson.plugin.naginator.NaginatorOptOutProperty>
              <com.gmail.ikeike443.PlayAutoTestJobProperty plugin="play-autotest-plugin@0.0.12"/>
              <jenkins.plugins.slack.SlackNotifier_-SlackJobProperty plugin="slack@1.8">
                <teamDomain>busywait</teamDomain>
                <token>p9kF0Mugbid5lGNqossRRoIm</token>
                <room/>
                <startNotification>false</startNotification>
                <notifySuccess>false</notifySuccess>
                <notifyAborted>false</notifyAborted>
                <notifyNotBuilt>false</notifyNotBuilt>
                <notifyUnstable>false</notifyUnstable>
                <notifyFailure>true</notifyFailure>
                <notifyBackToNormal>true</notifyBackToNormal>
                <notifyRepeatedFailure>false</notifyRepeatedFailure>
                <includeTestSummary>true</includeTestSummary>
                <showCommitList>false</showCommitList>
                <includeCustomMessage>false</includeCustomMessage>
                <customMessage/>
              </jenkins.plugins.slack.SlackNotifier_-SlackJobProperty>
            </properties>
            <scm class="hudson.plugins.git.GitSCM" plugin="git@2.4.0">
              <configVersion>2</configVersion>
              <userRemoteConfigs>
                <hudson.plugins.git.UserRemoteConfig>
                  <url>git@10.0.0.1:henrik/project.git</url>
                  <credentialsId>1013fa52-89a7-42d3-8007-ff92c65fb56a</credentialsId>
                </hudson.plugins.git.UserRemoteConfig>
              </userRemoteConfigs>
              <branches>
                <hudson.plugins.git.BranchSpec>
                  <name>*/master</name>
                </hudson.plugins.git.BranchSpec>
              </branches>
              <doGenerateSubmoduleConfigurations>false</doGenerateSubmoduleConfigurations>
              <submoduleCfg class="list"/>
              <extensions/>
            </scm>
            <canRoam>true</canRoam>
            <disabled>false</disabled>
            <blockBuildWhenDownstreamBuilding>false</blockBuildWhenDownstreamBuilding>
            <blockBuildWhenUpstreamBuilding>false</blockBuildWhenUpstreamBuilding>
            <triggers>
              <hudson.triggers.SCMTrigger>
                <spec>H/5 * * * *</spec>
                <ignorePostCommitHooks>false</ignorePostCommitHooks>
              </hudson.triggers.SCMTrigger>
            </triggers>
            <concurrentBuild>false</concurrentBuild>
            <builders>
              <hudson.tasks.Shell>
                <command>
                cd web; /opt/activator-1.3.6-minimal/activator test;
                </command>
              </hudson.tasks.Shell>
            </builders>
            <publishers>
              <hudson.tasks.junit.JUnitResultArchiver plugin="junit@1.9">
                <testResults>web/target/test-reports/*.xml</testResults>
                <keepLongStdio>false</keepLongStdio>
                <healthScaleFactor>1.0</healthScaleFactor>
              </hudson.tasks.junit.JUnitResultArchiver>
              <com.chikli.hudson.plugin.naginator.NaginatorPublisher plugin="naginator@1.16">
                <regexpForRerun/>
                <rerunIfUnstable>true</rerunIfUnstable>
                <rerunMatrixPart>false</rerunMatrixPart>
                <checkRegexp>false</checkRegexp>
                <regexpForMatrixParent>false</regexpForMatrixParent>
                <delay class="com.chikli.hudson.plugin.naginator.FixedDelay">
                  <delay>30</delay>
                </delay>
                <maxSchedule>1</maxSchedule>
              </com.chikli.hudson.plugin.naginator.NaginatorPublisher>
            </publishers>
            <buildWrappers>
              <hudson.plugins.build__timeout.BuildTimeoutWrapper plugin="build-timeout@1.15">
                <strategy class="hudson.plugins.build_timeout.impl.AbsoluteTimeOutStrategy">
                  <timeoutMinutes>10</timeoutMinutes>
                </strategy>
                <operationList>
                  <hudson.plugins.build__timeout.operations.FailOperation/>
                </operationList>
              </hudson.plugins.build__timeout.BuildTimeoutWrapper>
            </buildWrappers>
          </project>

          I'm 99% sure it only happens when rebuilding after a failed build.

          Henrik Kirk added a comment -

          This also deletes the "Test Trends"

          Matthew Weiss added a comment -

          Hi, is there any update on this ticket? I am running into what I believe is the same issue with Jenkins 1.642.2 and the Multijob plugin 1.20. After rebuilding, the Jenkins instance eventually crashes and stays down until I restart it; after the restart it eventually breaks again if I run the Multijob several times.

          Daniel Beck added a comment -

          henrikkirk That's a config.xml, not a build.xml.

          Marc Popp added a comment -

          As a temporary workaround, we increased the JVM's memory to 2G and no longer run into the problem very often.
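
          Assuming the 2G refers to the maximum heap, the usual way to set it is the standard JVM option -Xmx2g (for example "java -Xmx2g -jar jenkins.war"; where the option goes depends on how Jenkins is installed). A small sketch to confirm that a JVM started with given options really sees the expected ceiling (the class name is made up for illustration):

```java
public class HeapLimit {

    /** Maximum heap the running JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run with the same options as Jenkins, e.g.: java -Xmx2g HeapLimit
        // The reported value may be somewhat lower than the -Xmx setting,
        // depending on the garbage collector in use.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```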

          Dan Alvizu added a comment -

          I do not have an update on this. Are you sure this is the correct ticket, Matthew? This one is specific to the Build Pipeline plugin.

          Matthew Weiss added a comment -

          dalvizu sorry, to be honest I'm not 100% sure, but it seems eerily similar. I'm going to watch my master's memory as the job runs tonight and see if it's a similar memory problem.

          Dan Alvizu added a comment -

          The issue in this ticket happens if you keep a build pipeline window open for a long period while the status is refreshed: previous snippets are serialized to the session and can't be freed, eventually causing an OOM. There isn't an easy fix. Either finding a way to free them (which is deep in Stapler, I believe) or using a different UI technology would work, but neither is a quick or easy option.
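
          To illustrate the general pattern only: if every refresh adds a snippet to a per-session store and nothing is ever removed, the store grows until the heap is exhausted, whereas a bounded store evicts old entries. The sketch below is purely hypothetical (the names are made up, and it is not how Stapler or the Build Pipeline plugin are implemented); it shows a size-bounded map built on java.util.LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedSnippetStore {

    /** Returns a map that keeps at most maxEntries entries, evicting the oldest insertion first. */
    static <K, V> Map<K, V> boundedMap(final int maxEntries) {
        return new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by LinkedHashMap after each insertion.
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, String> snippets = boundedMap(3);
        // Simulate a page that is refreshed 1000 times in one session.
        for (int refresh = 1; refresh <= 1000; refresh++) {
            snippets.put(refresh, "<div>status " + refresh + "</div>");
        }
        System.out.println(snippets.keySet()); // prints [998, 999, 1000]
    }
}
```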

            dalvizu Dan Alvizu
            jocarli John Carlile