- Type: Bug
- Resolution: Incomplete
- Priority: Blocker
Debian, 64-bit, 16 GB RAM, 8 processors.
Jenkins running in Jetty with JAVA_OPTIONS="-Xmx640m -XX:MaxPermSize=128m"
~250-300 jobs
The other day I upgraded from Jenkins 1.413 to 1.446. I migrated the data that Jenkins reported as being in the old format and upgraded the plugins accordingly. All seemed to work fine. Then, all of a sudden, within an hour or so we hit:
Exception in thread "Ping thread for channel hudson.remoting.Channel@5d1dce7a:bud" java.lang.OutOfMemoryError: Java heap space
Exception in thread "Thread-109599" java.lang.OutOfMemoryError: Java heap space
Jan 8, 2012 12:10:19 PM hudson.triggers.SafeTimerTask run
SEVERE: Timer task hudson.diagnosis.MemoryUsageMonitor@3b9e30a1 failed
java.lang.OutOfMemoryError: Java heap space
        at hudson.model.TimeSeries.update(TimeSeries.java:74)
        at hudson.model.MultiStageTimeSeries.update(MultiStageTimeSeries.java:118)
        at hudson.diagnosis.MemoryUsageMonitor$MemoryGroup.update(MemoryUsageMonitor.java:95)
        at hudson.diagnosis.MemoryUsageMonitor$MemoryGroup.access$100(MemoryUsageMonitor.java:54)
        at hudson.diagnosis.MemoryUsageMonitor.doRun(MemoryUsageMonitor.java:123)
        at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
2012-01-08 12:25:06.254:WARN::Problem scavenging sessions
java.lang.OutOfMemoryError: Java heap space
We have the following setup:
- Linux master (6 executors)
- 2 linux nodes (4 executors + 8 executors)
- 2 windows nodes (both running 4 executors)
We have 250-300 jobs; about 90% of them also produce Sonar reports.
This completely crashes Jenkins, and it is unreachable after the OOME/PermGen errors. Jenkins is running inside Jetty.
After hitting this problem, I tried increasing Jetty's memory settings (it had been working fine without any explicit memory settings back in 1.413) to:
JAVA_OPTIONS="-Xmx640m -XX:MaxPermSize=128m"
This seemed to work, until it became apparent that it only slowed the recurrence of the problem.
So I started jstatd, connected via VisualVM, and saw:
- PermGen: 128m max / 90m used
- OldGen: 426.6m max / 426.6m used (full)
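For anyone wanting to reproduce the measurement, a minimal sketch of such a jstatd/VisualVM setup, assuming a JDK 6/7-era jstatd (the policy file name jstatd.all.policy and its location are only examples):

# contents of jstatd.all.policy (example name) - jstatd refuses to start without a policy
# grant codebase "file:${java.home}/../lib/tools.jar" {
#     permission java.security.AllPermission;
# };

# start the RMI daemon on the Jenkins master, then add the host in VisualVM
# to see the heap/PermGen graphs remotely
jstatd -J-Djava.security.policy=jstatd.all.policy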
Please fix this, as it's a blocker for us and I now have to resort to rolling back to our last known good version, i.e. 1.413.
Thanks in advance for looking into this!
The GC stats show that in the second case PermGen didn't seem to be the problem, but OldGen was. Did you also try to increase the OldGen heap space via -Xmx?
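For example, something along these lines in the Jetty start configuration (the 1024m value is only an illustration and should be sized to what the OldGen graph actually shows):

# illustrative only: raise the total heap so OldGen gets more room,
# while keeping the PermGen cap that already proved sufficient
JAVA_OPTIONS="-Xmx1024m -XX:MaxPermSize=128m"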
BTW: it's not uncommon for new versions with new functionality to use slightly more heap and perm space. As long as there's no real memory leak - i.e. memory usage continuously increasing - I wouldn't consider this a bug.
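One way to check that, assuming a HotSpot JVM: enable GC logging and watch whether the post-GC heap usage keeps climbing over several days (the flags are standard HotSpot options; the log path is just an example):

# appended to the existing JAVA_OPTIONS; /var/log/jenkins-gc.log is only an example path
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/jenkins-gc.log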
If you encounter such a problem again, please try to get a heap dump so we can track the problem down.
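Two common ways to capture one, as a sketch (the file paths and the <jenkins-pid> placeholder are only examples):

# option 1: have the JVM write a dump automatically on the next OutOfMemoryError
JAVA_OPTIONS="$JAVA_OPTIONS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp"

# option 2: take a dump on demand while the instance is still responsive
jmap -dump:format=b,file=/var/tmp/jenkins-heap.hprof <jenkins-pid>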