JENKINS-17630

java.lang.OutOfMemoryError: unable to create new native thread

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: p4-plugin
    • Labels: None
    • Environment: RHEL 6
      Jenkins 1.493

      We have more than 200 jobs polling Perforce from Jenkins, and we noticed that after a short period Jenkins would start throwing OutOfMemoryErrors in the log whenever it tried to poll Perforce. After a few hours, Jenkins stops responding to HTTP requests. Here's the error logged by Jenkins (project name replaced with $PROJECT_NAME):

      Apr 16, 2013 8:11:25 AM hudson.triggers.Trigger checkTriggers
      WARNING: hudson.triggers.SCMTrigger.run() failed for $PROJECT_NAME
      java.lang.OutOfMemoryError: unable to create new native thread
      at java.lang.Thread.start0(Native Method)
      at java.lang.Thread.start(Thread.java:691)
      at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:943)
      at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1336)
      at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
      at hudson.util.SequentialExecutionQueue$QueueEntry.submit(SequentialExecutionQueue.java:108)
      at hudson.util.SequentialExecutionQueue$QueueEntry.access$100(SequentialExecutionQueue.java:95)
      at hudson.util.SequentialExecutionQueue.execute(SequentialExecutionQueue.java:66)
      at hudson.triggers.SCMTrigger.run(SCMTrigger.java:128)
      at hudson.triggers.SCMTrigger.run(SCMTrigger.java:102)
      at hudson.triggers.Trigger.checkTriggers(Trigger.java:261)
      at hudson.triggers.Trigger$Cron.doRun(Trigger.java:209)
      at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
      at java.util.TimerThread.mainLoop(Timer.java:555)
      at java.util.TimerThread.run(Timer.java:505)
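
      This particular OutOfMemoryError usually indicates the JVM could not create another native thread (for example because of a per-user process limit), rather than heap exhaustion. A rough way to check is sketched below; PID 7312 is the Jenkins master process from the ps listing further down, so adjust for your environment:

      # Per-process limits for the Jenkins JVM; look at "Max processes"
      cat /proc/7312/limits

      # Number of live threads in the Jenkins JVM
      ps -o nlwp= -p 7312

      # Per-user process/thread limit for the jenkins account
      sudo -u jenkins sh -c 'ulimit -u'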

      When we investigated further, we found that the Jenkins Perforce plugin had created a large number of 'p4 client -i' processes that were just sitting there, and some had become defunct over time. Here's a sample of the output from 'ps -ef | grep jenkins'; the full output contains about 590 processes.

      jenkins 7257 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7262 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7273 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7278 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7281 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7282 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7283 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7285 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7287 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7288 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7289 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7290 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7291 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7292 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7294 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7295 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7298 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7302 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7306 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7309 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7312 1 3 06:08 ? 00:04:54 /usr/bin/java -Dcom.sun.akuma.Daemon=daemonized -Djava.awt.headless=true -DJENKINS_HOME=/var/lib/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --daemon --httpPort=8080 --ajp13Port=8009 --debug=5 --handlerCountStartup=50 --handlerCountMax=100 --handlerCountMaxIdle=50 --accessLoggerClassName=winstone.accesslog.SimpleAccessLogger --simpleAccessLogger.format=combined --simpleAccessLogger.file=/var/log/jenkins/access_log
      jenkins 7313 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7325 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7331 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7356 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7357 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7358 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7359 7312 0 07:21 ? 00:00:00 [p4] <defunct>
      jenkins 7490 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
      jenkins 7495 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
      jenkins 7499 7312 0 07:21 ? 00:00:00 /usr/local/sbin/p4 -s client -i
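
      A rough way to track how many of these pile up over time (again assuming the Jenkins master PID 7312 from the listing above):

      # Count defunct (zombie) children of the Jenkins master process
      ps --ppid 7312 -o stat= | grep -c '^Z'

      # Count 'p4 -s client -i' polling processes still running
      ps -ef | grep -c '[p]4 -s client -i'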

      As the java command line above shows, I suspected this might be the same issue as JENKINS-16429, so I increased the handler counts, but that only allowed more threads to be created before Jenkins started failing again.


          Rob Petti added a comment -

          Please provide your Jenkins Perforce Plugin version as well as your Java version.

          Does polling seem to actually work in your case?


          porterhouse91 added a comment -

          1.3.20

          This occurred in 1.3.21 as well. It appears to be a volume issue. I just set the SCM polling limit to 10 and the problem no longer seems to be happening. Previously, this behavior would emerge after a few hours and bring Jenkins to its knees. Since all of our 200+ jobs poll every minute, that is likely causing some contention.


          porterhouse91 added a comment -

          I was able to avoid the issue by throttling the number of concurrent SCM polling requests. I set the value down to 10 and now everything seems to be fine. This is likely still an issue with the plugin as the number of polling clients increases. I am happy to help test or resolve the issue.
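
          For reference, this appears to be the 'Max # of concurrent polling' setting under Manage Jenkins » Configure System. A quick way to check that the throttle is actually holding is to watch how many Perforce polling processes run at once; with the limit set to 10 it should stay at or below 10:

          # Watch the number of concurrent 'p4 -s client -i' processes
          watch -n 60 "ps -ef | grep -c '[p]4 -s client -i'"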


          Rob Petti added a comment -

          If you hammer Perforce too quickly, especially with spec updates, it tends to deadlock and hang. I don't think there is any way around this, unfortunately.


          Rob Petti added a comment -

          Not a blocker since there is a work-around.


          porterhouse91 added a comment -

          Agree with downgrading the ticket, and I also agree that this may not be avoidable. It was a good lesson learned on our end.


            Assignee: Unassigned
            Reporter: porterhouse91
            Votes: 0
            Watchers: 2
