Type: Bug
Resolution: Unresolved
Priority: Major
Labels: None
Environment: RHEL 6, Jenkins 1.493
We have 200+ jobs polling Perforce from Jenkins, and we noticed that after a short period Jenkins would start throwing OutOfMemoryErrors in the log whenever it tried to poll Perforce. Within a few hours, Jenkins becomes unresponsive to HTTP requests. Here's the error logged by Jenkins (project name replaced with $PROJECT_NAME):
Apr 16, 2013 8:11:25 AM hudson.triggers.Trigger checkTriggers
WARNING: hudson.triggers.SCMTrigger.run() failed for $PROJECT_NAME
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:691)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:943)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1336)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
at hudson.util.SequentialExecutionQueue$QueueEntry.submit(SequentialExecutionQueue.java:108)
at hudson.util.SequentialExecutionQueue$QueueEntry.access$100(SequentialExecutionQueue.java:95)
at hudson.util.SequentialExecutionQueue.execute(SequentialExecutionQueue.java:66)
at hudson.triggers.SCMTrigger.run(SCMTrigger.java:128)
at hudson.triggers.SCMTrigger.run(SCMTrigger.java:102)
at hudson.triggers.Trigger.checkTriggers(Trigger.java:261)
at hudson.triggers.Trigger$Cron.doRun(Trigger.java:209)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
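For anyone hitting the same wall: below is a minimal sketch (my own illustration, not anything from Jenkins core or the plugin) of watching the live JVM thread count via the standard ThreadMXBean API, so the climb toward "unable to create new native thread" is visible while polling runs. The 500 threshold is an arbitrary illustrative value, not a real Jenkins setting.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Illustrative only: print live/peak JVM thread counts so the leak is
// visible before the OS refuses to create new native threads.
public class ThreadCountCheck {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();
        int peak = threads.getPeakThreadCount();
        System.out.println("live threads = " + live + ", peak = " + peak);
        // 500 is an arbitrary illustrative threshold, not a Jenkins limit.
        if (live > 500) {
            System.out.println("WARNING: thread count is unusually high");
        }
    }
}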
When we investigated further, we found that the Jenkins Perforce plugin had created a large number of 'p4 client -i' processes that were just sitting there; some had become defunct over time. Here's a sample of the output from a 'ps -ef | grep jenkins' command (the actual output contains about 590 processes); a sketch of how such defunct children arise follows the listing.
jenkins 7257 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7262 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7273 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7278 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7281 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7282 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7283 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7285 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7287 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7288 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7289 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7290 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7291 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7292 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7294 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7295 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7298 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7302 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7306 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7309 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7312 1 3 06:08 ? 00:04:54 /usr/bin/java -Dcom.sun.akuma.Daemon=daemonized -Djava.awt.headless=true -DJENKINS_HOME=/var/lib/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --daemon --httpPort=8080 --ajp13Port=8009 --debug=5 --handlerCountStartup=50 --handlerCountMax=100 --handlerCountMaxIdle=50 --accessLoggerClassName=winstone.accesslog.SimpleAccessLogger --simpleAccessLogger.format=combined --simpleAccessLogger.file=/var/log/jenkins/access_log
jenkins 7313 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7325 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7331 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7356 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7357 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7358 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7359 7312 0 07:21 ? 00:00:00 [p4] <defunct>
jenkins 7490 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
jenkins 7495 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
jenkins 7499 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
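To make the defunct entries above less mysterious, here is a minimal sketch of the pattern that produces <defunct> children under a Java parent. This is not the Perforce plugin's actual code, just my illustration, and it assumes the p4 binary is on PATH: a child that exits is only removed from the process table once the parent reaps it with waitFor(), and a child whose output is never drained can also block on a full pipe and linger as a live process, which matches the mix of defunct and live 'p4 -s client -i' entries we see.

import java.io.IOException;
import java.io.InputStream;

// Sketch only (not the plugin's code): how a Java parent leaks <defunct>
// children, and the pattern that avoids it.
public class ZombieSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("p4", "-s", "client", "-i");
        pb.redirectErrorStream(true);           // merge stderr into stdout
        Process p = pb.start();

        // Leak pattern (what the symptoms suggest): never read the child's
        // output and never call waitFor(). The child either blocks on a full
        // pipe / empty stdin, or exits unreaped and shows up as <defunct>.

        // Safe pattern: close stdin, drain stdout, then reap the child.
        p.getOutputStream().close();            // nothing to feed 'p4 client -i' in this sketch
        InputStream out = p.getInputStream();
        byte[] buf = new byte[4096];
        while (out.read(buf) != -1) {
            // discard; a real caller would parse the client spec output here
        }
        int exit = p.waitFor();                 // reaps the child; no zombie is left behind
        System.out.println("p4 exited with status " + exit);
    }
}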
As you can see from the --handlerCount* options in the java command line above, I suspected this might be due to the same issue as JENKINS-16429 and increased the handler counts, but that only allowed additional threads to be created before Jenkins started failing.