• Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Blocker Blocker

      I get the following stack trace once a day:

      Exception:
      FATAL: unable to create new native thread
      java.lang.OutOfMemoryError: unable to create new native thread
          at java.lang.Thread.start0(Native Method)
          at java.lang.Thread.start(Thread.java:640)
          at hudson.Proc$LocalProc.<init>(Proc.java:264)
          at hudson.Proc$LocalProc.<init>(Proc.java:216)
          at hudson.Launcher$LocalLauncher.launch(Launcher.java:763)
          at hudson.Launcher$ProcStarter.start(Launcher.java:353)
          at hudson.plugins.perforce.HudsonP4DefaultExecutor.exec(HudsonP4DefaultExecutor.java:79)
          at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:321)
          at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:292)
          at com.tek42.perforce.parse.Workspaces.getWorkspace(Workspaces.java:61)
          at hudson.plugins.perforce.PerforceSCM.getPerforceWorkspace(PerforceSCM.java:1467)
          at hudson.plugins.perforce.PerforceSCM.getPerforceWorkspace(PerforceSCM.java:1428)
          at hudson.plugins.perforce.PerforceSCM.checkout(PerforceSCM.java:792)
          at hudson.model.AbstractProject.checkout(AbstractProject.java:1325)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:676)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:581)
          at hudson.model.Run.execute(Run.java:1516)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:236)
      

      The URL http:/jenkinsserver/threadDump shows that there are hundreds of ComThreads:

      ComThread for RequestHandlerThread[#301]
      "ComThread for RequestHandlerThread[#301]" Id=50942 Group=main RUNNABLE (in native)
          at com4j.Win32Lock.suspend0(Native Method)
          at com4j.Win32Lock.suspend(Win32Lock.java:37)
          at com4j.ComThread.run0(ComThread.java:146)
          at com4j.ComThread.run(ComThread.java:134)
      

      The Monitoring plugin (https://wiki.jenkins-ci.org/display/JENKINS/Monitoring) shows that about 40 threads are added every hour. I tried to kill the ComThreads with the plugin but this did not work.

      As far as I understand the threads belong to the Active Directory plugin but I do not understand why they are created / never finish.

      I have activated logging for the Active Directory plugin by enabling the following loggers (all with log level all):

      • hudson.plugins.active_directory
      • org.acegisecurity
      • com4j

      The log contains a lot of entries like this:

      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: SecurityContextHolder now cleared, as request processing completed
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.providers.anonymous.AnonymousProcessingFilter doFilter
      FINE: Populated SecurityContextHolder with anonymous token: 'org.acegisecurity.providers.anonymous.AnonymousAuthenticationToken@7ceb964f: Username: anonymous; Password: [PROTECTED]; Authenticated: true; Details: org.acegisecurity.ui.WebAuthenticationDetails@1de6: RemoteIpAddress: ##.##.##.##; SessionId: null; Granted Authorities: '
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter storeSecurityContextInSession
      FINE: The HttpSession is currently null, and the HttpSessionContextIntegrationFilter is prohibited from creating an HttpSession (because the allowSessionCreation property is false) - SecurityContext thus not stored for next request
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.basicauth.BasicProcessingFilter doFilter
      FINE: Authorization header: null
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.ExceptionTranslationFilter doFilter
      FINE: Chain processed normally
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: New SecurityContext instance will be associated with SecurityContextHolder
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter readSecurityContextFromSession
      FINE: No HttpSession currently exists
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: SecurityContextHolder now cleared, as request processing completed
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter storeSecurityContextInSession
      FINE: The HttpSession is currently null, and the HttpSessionContextIntegrationFilter is prohibited from creating an HttpSession (because the allowSessionCreation property is false) - SecurityContext thus not stored for next request
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.ExceptionTranslationFilter doFilter
      FINE: Chain processed normally
      

      A lot of traffic (about four requests per second) is caused by the desktop application Hudson Tray Tracker (https://code.google.com/p/hudson-tray-tracker/). This application does not log in (it uses anonymous access).

      I have also

      • enabled access logging, but found no unusual URL requests
      • connected with VisualVM (http://visualvm.java.net/) to the Jenkins JVM, but got no more informations
      • created a PowerShell script that logs the number of threads, handles, working set and the virtual size of the Java process, but it only shows that all numbers are constantly increasing:
      Date Time Pid Thread Handle
      Count
      WorkingSet VirtualSize
      2013-01-20 22:01:31 740 86 1380 474.285 MB 1,291.566 MB
      2013-01-20 23:01:31 740 111 1719 520.957 MB 1,309.094 MB
      2013-01-21 00:01:31 740 138 2094 576.602 MB 1,322.531 MB
      2013-01-21 01:01:30 740 161 2381 630.742 MB 1,334.031 MB
      2013-01-21 02:01:30 740 177 2563 635.363 MB 1,342.094 MB
      2013-01-21 03:01:30 740 203 2884 638.371 MB 1,355.031 MB
      2013-01-21 04:01:31 740 222 3192 660.984 MB 1,396.156 MB
      2013-01-21 05:01:30 740 243 3443 641.645 MB 1,406.609 MB
      2013-01-21 06:01:30 740 264 3719 648.918 MB 1,417.219 MB
      2013-01-21 07:01:31 740 294 4114 661.613 MB 1,432.156 MB
      2013-01-21 08:01:31 740 313 4357 647.117 MB 1,441.719 MB
      2013-01-21 09:01:30 740 344 4758 667.797 MB 1,457.219 MB
      2013-01-21 10:01:31 740 414 5701 742.559 MB 1,575.898 MB
      2013-01-21 11:01:31 740 502 6777 729.488 MB 1,620.813 MB
      2013-01-21 12:01:52 740 562 7533 740.715 MB 1,650.813 MB
      2013-01-21 13:01:31 740 616 8313 763.828 MB 1,677.922 MB

      The problem is related to system load: at weekend the numbers are increasing at a lower rate than at work days. Between Christmas and New Year it didn't happened at all as most developers have been on vacation.

      • created a Groovey script:
      def threads=com4j.ComThread.threads.toArray(new com4j.ComThread[com4j.ComThread.threads.size()]);
      def numberOfTerminatedPeerThreads=0
      
      println("The follwing ComThreads have a peer thread which has been terminated")
      for (t in threads) {
        if (t.getName().startsWith("ComThread for RequestHandlerThread[#") &&
            t.peer.getState().equals(Thread.State.TERMINATED) ) {
          println("ComThread: " + t.getName() + "  Peer: " + t.peer )
          numberOfTerminatedPeerThreads++
          //t.kill()
        }
      }
      
      println("\nnumber of terminated peer threads: " + numberOfTerminatedPeerThreads)
      

      result:

      The follwing ComThreads have a peer thread which has been terminated
      ComThread: ComThread for RequestHandlerThread[#357]  Peer: Thread[RequestHandlerThread[#357],5,]
      ComThread: ComThread for RequestHandlerThread[#258]  Peer: Thread[RequestHandlerThread[#258],5,]
      ...
      
      number of terminated peer threads: 506
      

          [JENKINS-16429] ComThread is leaking even when its peer is dead

          Bernhard Berbuir created issue -

          I have found a workaround: I have increased the maximum number of idle handler threads. I added the following parameters at the <arguments>-tag in the jenkins.xml file:
          --handlerCountStartup=50 --handlerCountMax=100 --handlerCountMaxIdle=50

          Now I do not have a single ComThread with a terminated peer thread and the JVM is running smoothly.

          I suspect that the continuous creating and destroying of threads caused the thread leak. The winstone servlet engine or the ComThread library seems not be able to handle such a scenario successfully.

          Bernhard Berbuir added a comment - I have found a workaround: I have increased the maximum number of idle handler threads. I added the following parameters at the <arguments> -tag in the jenkins.xml file: --handlerCountStartup=50 --handlerCountMax=100 --handlerCountMaxIdle=50 Now I do not have a single ComThread with a terminated peer thread and the JVM is running smoothly. I suspect that the continuous creating and destroying of threads caused the thread leak. The winstone servlet engine or the ComThread library seems not be able to handle such a scenario successfully.
          Bernhard Berbuir made changes -
          Labels Original: active_directory, outofmemoryerror threads windows New: active_directory, comthread outofmemoryerror threads windows winstone

          porterhouse91 added a comment -

          Berhard, how many Jenkins jobs do you have running? We started experiencing these issues due to an increased number of jobs (we think). Right now we are at ~200 jobs polling every minute.

          porterhouse91 added a comment - Berhard, how many Jenkins jobs do you have running? We started experiencing these issues due to an increased number of jobs (we think). Right now we are at ~200 jobs polling every minute.

          I have only a small number of jobs (18-24 jobs). I am using one master and do not have a slave.
          ==> all polling is done at the master

          I have set the option "Max # of concurrent polling" to 1 (Jenkins | Manage Jenkins | Configure System | SCM Polling)
          ==> Jenkins runs only one polling at a time (I have several jobs which are polling every minute).

          I assume that the Hudson Tray Tracker (HTT) is the problem as all ComThreads have a terminated RequestHandlerThread. HTT polls every 15 seconds for every job (its like a DDoS Tool )

          Bernhard Berbuir added a comment - I have only a small number of jobs (18-24 jobs). I am using one master and do not have a slave. ==> all polling is done at the master I have set the option "Max # of concurrent polling" to 1 (Jenkins | Manage Jenkins | Configure System | SCM Polling) ==> Jenkins runs only one polling at a time (I have several jobs which are polling every minute). I assume that the Hudson Tray Tracker (HTT) is the problem as all ComThreads have a terminated RequestHandlerThread. HTT polls every 15 seconds for every job (its like a DDoS Tool )
          Bernhard Berbuir made changes -
          Environment Original: Windows 7 Professional 64-bit with SP1
          Jenkins 1.498
          Active Directory plugin 1.30

          JVM: Java HotSpot(TM) Server VM (20.1-b02, mixed mode) (32-bit)
          Java: version 1.6.0_26, vendor Sun Microsystems Inc.
          JVM arguments:
          -Xrs
          -Xms512m
          -Xmx768m
          -XX:PermSize=64M
          -XX:MaxPermSize=128M
          -Xss256k
          -XX:+HeapDumpOnOutOfMemoryError
          -XX:HeapDumpPath=F:\JENKINS_HOME\OOM.hprof
          -Dcom.sun.management.jmxremote.port=8010
          -Dcom.sun.management.jmxremote.ssl=false
          -Dcom.sun.management.jmxremote.authenticate=true
          -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle
          New: Windows 7 Professional 64-bit with SP1
          Jenkins 1.498
          Active Directory plugin 1.30
          one master, no slaves

          JVM: Java HotSpot(TM) Server VM (20.1-b02, mixed mode) (32-bit)
          Java: version 1.6.0_26, vendor Sun Microsystems Inc.
          JVM arguments:
          -Xrs
          -Xms512m
          -Xmx768m
          -XX:PermSize=64M
          -XX:MaxPermSize=128M
          -Xss256k
          -XX:+HeapDumpOnOutOfMemoryError
          -XX:HeapDumpPath=F:\JENKINS_HOME\OOM.hprof
          -Dcom.sun.management.jmxremote.port=8010
          -Dcom.sun.management.jmxremote.ssl=false
          -Dcom.sun.management.jmxremote.authenticate=true
          -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle
          Bernhard Berbuir made changes -
          Environment Original: Windows 7 Professional 64-bit with SP1
          Jenkins 1.498
          Active Directory plugin 1.30
          one master, no slaves

          JVM: Java HotSpot(TM) Server VM (20.1-b02, mixed mode) (32-bit)
          Java: version 1.6.0_26, vendor Sun Microsystems Inc.
          JVM arguments:
          -Xrs
          -Xms512m
          -Xmx768m
          -XX:PermSize=64M
          -XX:MaxPermSize=128M
          -Xss256k
          -XX:+HeapDumpOnOutOfMemoryError
          -XX:HeapDumpPath=F:\JENKINS_HOME\OOM.hprof
          -Dcom.sun.management.jmxremote.port=8010
          -Dcom.sun.management.jmxremote.ssl=false
          -Dcom.sun.management.jmxremote.authenticate=true
          -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle
          New: Windows 7 Professional 64-bit with SP1
          Jenkins 1.498
          Winstone Servlet Engine v0.9.10
          Active Directory plugin 1.30
          one master, no slaves

          JVM: Java HotSpot(TM) Server VM (20.1-b02, mixed mode) (32-bit)
          Java: version 1.6.0_26, vendor Sun Microsystems Inc.
          JVM arguments:
          -Xrs
          -Xms512m
          -Xmx768m
          -XX:PermSize=64M
          -XX:MaxPermSize=128M
          -Xss256k
          -XX:+HeapDumpOnOutOfMemoryError
          -XX:HeapDumpPath=F:\JENKINS_HOME\OOM.hprof
          -Dcom.sun.management.jmxremote.port=8010
          -Dcom.sun.management.jmxremote.ssl=false
          -Dcom.sun.management.jmxremote.authenticate=true
          -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle

          porterhouse91 added a comment -

          I am fairly certain that our issue is related to the Perforce plugin. I filed a ticket with the plugin providers: https://issues.jenkins-ci.org/browse/JENKINS-17630

          Do you happen to see a large number of active 'p4 client -i' processes just sitting around? This seems to be the core of our problem.

          porterhouse91 added a comment - I am fairly certain that our issue is related to the Perforce plugin. I filed a ticket with the plugin providers: https://issues.jenkins-ci.org/browse/JENKINS-17630 Do you happen to see a large number of active 'p4 client -i' processes just sitting around? This seems to be the core of our problem.

          You seem to have a different problem, because I never had problems with Perforce threads.

          Bernhard Berbuir added a comment - You seem to have a different problem, because I never had problems with Perforce threads.

          Reassining to the perforce plugin

          Kohsuke Kawaguchi added a comment - Reassining to the perforce plugin
          Kohsuke Kawaguchi made changes -
          Component/s New: perforce [ 15506 ]
          Component/s Original: active-directory [ 15526 ]

            kohsuke Kohsuke Kawaguchi
            bernhardb Bernhard Berbuir
            Votes:
            9 Vote for this issue
            Watchers:
            14 Start watching this issue

              Created:
              Updated:
              Resolved: