JENKINS-65873

java.lang.OutOfMemoryError: unable to create new native thread

      Description

      We regularly see errors from the jenkins/inbound-agent in our Jenkins logs on Kubernetes. They seem to occur in around 1% of all jobs.

      The error message is below.

      Although the error message is java.lang.OutOfMemoryError: unable to create new native thread, we have checked the pods and nodes in the cluster, and sufficient memory and threads are always available at the time of the error.
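
One common cause of "unable to create new native thread" is a limit on the number of threads or processes rather than a shortage of heap, so it can help to record the thread count and the applicable limits from inside the agent container at the time of failure. The following is a minimal Java sketch of such a probe. The class name ThreadLimitProbe is hypothetical, and the file paths are assumptions: they are the usual Linux locations for the cgroup v2 and v1 pids controller and the kernel-wide limits, and any of them may be absent on a given node.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ThreadLimitProbe {
    /** Returns the trimmed contents of a small file, or "n/a" if it cannot be read. */
    static String readOrNa(String file) {
        try {
            return new String(Files.readAllBytes(Paths.get(file)), StandardCharsets.UTF_8).trim();
        } catch (IOException | RuntimeException e) {
            return "n/a";
        }
    }

    /** Number of live threads in this JVM, as reported by the platform MXBean. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("live JVM threads: " + liveThreads());
        System.out.println("peak JVM threads: "
                + ManagementFactory.getThreadMXBean().getPeakThreadCount());

        // Assumed locations; which ones exist depends on the cgroup version and the node setup.
        String[] limitFiles = {
            "/sys/fs/cgroup/pids.max",          // cgroup v2 task limit for this container
            "/sys/fs/cgroup/pids.current",      // cgroup v2 current task count
            "/sys/fs/cgroup/pids/pids.max",     // cgroup v1 equivalent
            "/sys/fs/cgroup/pids/pids.current",
            "/proc/sys/kernel/threads-max",     // kernel-wide thread limit
            "/proc/sys/kernel/pid_max"          // kernel-wide PID limit
        };
        for (String f : limitFiles) {
            System.out.println(f + " = " + readOrNa(f));
        }
    }
}
```

If the current task count is close to the cgroup limit when the error occurs, the limit rather than memory would be the more likely culprit; this is a diagnostic aid only and does not establish the root cause of this issue.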

      The versions that produced this particular error message are:

      jenkins/inbound-agent:4.3-4

      Jenkins 2.263.4

      However, we have also seen this error with other versions of both the inbound-agent and Jenkins.

      Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from ip-100-64-244-120.eu-west-1.compute.internal/100.64.244.120:39138
      	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1800)
      	at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
      	at hudson.remoting.Channel.call(Channel.java:1001)
      	at hudson.FilePath.act(FilePath.java:1157)
      	at hudson.FilePath.act(FilePath.java:1146)
      	at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:121)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:904)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:835)
      	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1288)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
      	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:717)
      	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
      	at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
      	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
      	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
      	at jenkins.util.InterceptingScheduledExecutorService.schedule(InterceptingScheduledExecutorService.java:49)
      	at org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream.reschedule(DelayBufferedOutputStream.java:72)
      	at org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream.<init>(DelayBufferedOutputStream.java:68)
      	at org.jenkinsci.plugins.workflow.log.BufferedBuildListener$Replacement.readResolve(BufferedBuildListener.java:77)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1260)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2133)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2342)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2266)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2124)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:465)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:423)
      	at hudson.remoting.UserRequest.deserialize(UserRequest.java:290)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:189)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
      	at hudson.remoting.Request$2.run(Request.java:369)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:117)
      Caused: java.io.IOException: Remote call on JNLP4-connect connection from ip-100-64-244-120.eu-west-1.compute.internal/100.64.244.120:39138 failed
      	at hudson.remoting.Channel.call(Channel.java:1007)
      	at hudson.FilePath.act(FilePath.java:1157)
      	at hudson.FilePath.act(FilePath.java:1146)
      	at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:121)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:904)
      	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:835)
      	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1288)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
      	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
      	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

            Activity

            vlatombe Vincent Latombe added a comment -

            Donald Gobin, this is the agent side.

            dg424 Donald Gobin added a comment -

            Hi Vincent Latombe

            But I see the remoting stack is on both sides (remoting.jar is in the jenkins.war file as well), and the stack trace in my comment above shows classes (org.jenkinsci.plugins.workflow.log, jenkins.util.InterceptingScheduledExecutorService) that I cannot find in remoting.jar on the agent side. So I'm not sure where the OOM is happening. If your PR addresses only the agent side, does that mean the root cause of the exception is on the agent side but the error shows up on the server side? I'm confused.

            kon Kalle Niemitalo added a comment -

            I see org.jenkinsci.plugins.workflow.log classes in these files on an agent:

            • remoting/jarCache/06/D303140AA1A4E2367F9A63F58D3127.jar (workflow-api 1136.v7f5f1759dc16)
            • remoting/jarCache/AA/E8875DDC0E79929E944D30636208F6.jar (workflow-api 1108.v57edf648f5d4)
            • remoting/jarCache/EC/7A1A038FDCBC2456010A181E58E35B.jar (workflow-api 1122.v7a_916f363c86)

            I don't know whether those file names are hashes or just random. Anyway, it's conceivable that the agent could load org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream etc. from these files.
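
A search like the one described above can be sketched in Java as follows. This is an illustrative helper, not part of Jenkins: the class name JarCacheScan and the default directory remoting/jarCache (relative to the agent's working directory) are assumptions taken from the paths quoted in this comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JarCacheScan {
    /** Returns every .jar under root that has at least one entry starting with entryPrefix. */
    static List<Path> jarsContaining(Path root, String entryPrefix) throws IOException {
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(root)) {
            jars = walk.filter(p -> p.toString().endsWith(".jar"))
                       .sorted()
                       .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                // Jar entry names always use '/' as the separator, regardless of platform.
                if (jf.stream().anyMatch(e -> e.getName().startsWith(entryPrefix))) {
                    hits.add(jar);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "remoting/jarCache");
        String prefix = "org/jenkinsci/plugins/workflow/log/";
        for (Path jar : jarsContaining(root, prefix)) {
            System.out.println(jar);
        }
    }
}
```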

            kon Kalle Niemitalo added a comment -

            Oh, Checksum.java first computes a SHA-256 hash, then splits it into two 128-bit halves and XORs them together.

            $ sha256sum remoting/jarCache/06/D303140AA1A4E2367F9A63F58D3127.jar
            38165eeaa9e20f4a5bdced3d142660b13ec55dfea343aba86da3775ee1ab5196 *remoting/jarCache/06/D303140AA1A4E2367F9A63F58D3127.jar
            

            38165eeaa9e20f4a5bdced3d142660b1 xor 3ec55dfea343aba86da3775ee1ab5196 = 06D303140AA1A4E2367F9A63F58D3127
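
The arithmetic above can be reproduced with a short Java sketch. It mirrors only the calculation shown in this comment, not the actual Checksum.java source; the class and method names (JarCacheName, foldHex, cachePath) are hypothetical, and the two-character directory split is inferred from the file names listed earlier.

```java
public class JarCacheName {
    /** XORs the first 128 bits of a SHA-256 hex digest with the last 128 bits. */
    static String foldHex(String sha256Hex) {
        if (sha256Hex.length() != 64) {
            throw new IllegalArgumentException("expected 64 hex characters");
        }
        StringBuilder out = new StringBuilder(32);
        for (int i = 0; i < 16; i++) {
            int hi = Integer.parseInt(sha256Hex.substring(2 * i, 2 * i + 2), 16);
            int lo = Integer.parseInt(sha256Hex.substring(32 + 2 * i, 32 + 2 * i + 2), 16);
            out.append(String.format("%02X", hi ^ lo));
        }
        return out.toString();
    }

    /** First two hex characters become the directory, the remaining thirty the file name. */
    static String cachePath(String sha256Hex) {
        String folded = foldHex(sha256Hex);
        return folded.substring(0, 2) + "/" + folded.substring(2) + ".jar";
    }

    public static void main(String[] args) {
        String digest = "38165eeaa9e20f4a5bdced3d142660b13ec55dfea343aba86da3775ee1ab5196";
        // prints 06/D303140AA1A4E2367F9A63F58D3127.jar
        System.out.println(cachePath(digest));
    }
}
```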

            dg424 Donald Gobin added a comment -

            Hi Kalle Niemitalo,

            Thanks. I see it now. So, these classes are "shipped" to the agent at runtime. If I fire up the agent and do not start a job, the classes do not exist. Just trying to understand how the process works...
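
The stack trace in the description fits this picture: BufferedBuildListener$Replacement.readResolve is called from ObjectInputStream underneath hudson.remoting.UserRequest.deserialize, so the code that schedules the new thread runs in whichever JVM deserializes the request. The Java sketch below illustrates only the general serialization behaviour in a single JVM; the class ReadResolveDemo, its nested Replacement class, and the thread name "agent-side" are hypothetical stand-ins and not Jenkins code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReadResolveDemo {
    static class Replacement implements Serializable {
        private static final long serialVersionUID = 1L;
        /** Name of the thread on which readResolve last ran (static, so not serialized). */
        static volatile String resolvedOn;

        // Invoked by ObjectInputStream on the receiving side only.
        private Object readResolve() {
            resolvedOn = Thread.currentThread().getName();
            return this;
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Serializes on the calling thread, deserializes on a thread named "agent-side". */
    static String roundTrip() throws Exception {
        Replacement.resolvedOn = null;
        byte[] wire = serialize(new Replacement());
        if (Replacement.resolvedOn != null) {
            throw new IllegalStateException("readResolve must not run while serializing");
        }
        ExecutorService agent = Executors.newSingleThreadExecutor(r -> new Thread(r, "agent-side"));
        try {
            Callable<Object> task = () -> deserialize(wire);
            agent.submit(task).get();
        } finally {
            agent.shutdown();
        }
        return Replacement.resolvedOn;
    }

    public static void main(String[] args) throws Exception {
        // prints: readResolve ran on thread: agent-side
        System.out.println("readResolve ran on thread: " + roundTrip());
    }
}
```

Under this reading, plugin classes appearing in an agent-side stack trace is expected, because the agent loads them on demand in order to deserialize the objects it receives.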


              People

              Assignee:
              jthompson Jeff Thompson
              Reporter:
              wasimj Wasim
              Votes:
              6
              Watchers:
              20
