Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-65873

java.lang.OutOfMemoryError: unable to create new native thread

    XMLWordPrintable

Details

    Description

      We regularly see issues with the jenkins/inbound-agent in our Jenkins logs on Kubernetes. It seems to occur in around 1% of all jobs.

      The error message is below.

      Whilst the error message refers to java.lang.OutOfMemoryError: and unable to create new native thread we have checked the pods and nodes in the cluster and there is always sufficient memory or threads available at the time of the error.

      The specific versions for this specific error message are:

      jenkins/inbound-agent:4.3-4

      Jenkins 2.263.4

      However we have also seen this error occur with different versions of both the inbound-agent and Jenkins.

      Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from ip-100-64-244-120.eu-west-1.compute.internal/100.64.244.120:39138
      	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1800)
      	at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
      	at hudson.remoting.Channel.call(Channel.java:1001)
      	at hudson.FilePath.act(FilePath.java:1157)
      	at hudson.FilePath.act(FilePath.java:1146)
      	at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:121)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:904)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:835)
      	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1288)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
      	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:717)
      	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
      	at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
      	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
      	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
      	at jenkins.util.InterceptingScheduledExecutorService.schedule(InterceptingScheduledExecutorService.java:49)
      	at org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream.reschedule(DelayBufferedOutputStream.java:72)
      	at org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream.<init>(DelayBufferedOutputStream.java:68)
      	at org.jenkinsci.plugins.workflow.log.BufferedBuildListener$Replacement.readResolve(BufferedBuildListener.java:77)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1260)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2133)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:465)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:423)
      	at hudson.remoting.UserRequest.deserialize(UserRequest.java:290)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:189)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
      	at hudson.remoting.Request$2.run(Request.java:369)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:117)
      Caused: java.io.IOException: Remote call on JNLP4-connect connection from ip-100-64-244-120.eu-west-1.compute.internal/100.64.244.120:39138 failed
      	at hudson.remoting.Channel.call(Channel.java:1007)
      	at hudson.FilePath.act(FilePath.java:1157)
      	at hudson.FilePath.act(FilePath.java:1146)
      	at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:121)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:904)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:835)
      	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1288)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
      	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

      Attachments

        Issue Links

          Activity

            I could have, however since Jenkins is only starting to support java 17 in preview, I don't know what kind of surprises could arise from trying out a version that is not battle tested in our context. In any case I have built a jdk11 with the referenced commit cherry-picked and I'm currently running my reproduction harness to check whether the problem is gone.

            vlatombe Vincent Latombe added a comment - I could have, however since Jenkins is only starting to support java 17 in preview, I don't know what kind of surprises could arise from trying out a version that is not battle tested in our context. In any case I have built a jdk11 with the referenced commit cherry-picked and I'm currently running my reproduction harness to check whether the problem is gone.

            Cherry-picking https://github.com/openjdk/jdk/commit/e35005d5ce383ddd108096a3079b17cb0bcf76f1 on jdk11 and running the harness overnight shows a very significant reduction of the number of occurrences of the problem (0.04% instead of 0.2% over 5000-6000 builds)

            vlatombe Vincent Latombe added a comment - Cherry-picking https://github.com/openjdk/jdk/commit/e35005d5ce383ddd108096a3079b17cb0bcf76f1 on jdk11 and running the harness overnight shows a very significant reduction of the number of occurrences of the problem (0.04% instead of 0.2% over 5000-6000 builds)
            basil Basil Crow added a comment -

            jenkinsci/remoting#523 has been released in Remoting 4.14 and Jenkins 2.348. This helps alleviate the problem to some degree, but it does not eliminate the problem.

            Backporting JDK-8268773 / openjdk/jdk@e35005d5ce3 showed a significant reduction of the number of occurrences of the problem (0.04% instead of 0.2% over 5000-6000 builds). JDK-8268773 / openjdk/jdk@e35005d5ce3 has been backported to jdk11u-dev in JDK-8286753 / openjdk/jdk11u-dev#1074 and to jdk17u-dev in JDK-8286629 / openjdk/jdk17u-dev#390.

            basil Basil Crow added a comment - jenkinsci/remoting#523 has been released in Remoting 4.14 and Jenkins 2.348 . This helps alleviate the problem to some degree, but it does not eliminate the problem. Backporting JDK-8268773 / openjdk/jdk@ e35005d5ce3 showed a significant reduction of the number of occurrences of the problem (0.04% instead of 0.2% over 5000-6000 builds). JDK-8268773 / openjdk/jdk@ e35005d5ce3 has been backported to jdk11u-dev in JDK-8286753 / openjdk/jdk11u-dev#1074 and to jdk17u-dev in JDK-8286629 / openjdk/jdk17u-dev#390 .

            Thank you basil!

            vlatombe Vincent Latombe added a comment - Thank you basil !
            rhinoceros rhinoceros.xn added a comment - - edited

            After adding sleep(10) before git checkout this problem no longer occurs.

            Maybe sleep(10) before git or checkout step is a workaround.

            wasimj  dg424 

             

            sleep(10)
            checkout changelog: false, poll: false, scm: ........
            
            OR
            
            sleep(10)
            git branch: 'master', credentialsId: '******', url: 'git@git.yourcomampy.com:xx/zz.git'
            rhinoceros rhinoceros.xn added a comment - - edited After adding sleep(10) before git checkout this problem no longer occurs. Maybe sleep(10) before git or checkout step is a workaround. wasimj   dg424     sleep(10) checkout changelog: false , poll: false , scm: ........ OR sleep(10) git branch: 'master' , credentialsId: '******' , url: 'git@git.yourcomampy.com:xx/zz.git'

            People

              vlatombe Vincent Latombe
              wasimj Wasim
              Votes:
              9 Vote for this issue
              Watchers:
              24 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: