  1. Jenkins
  2. JENKINS-68199

java.lang.OutOfMemoryError: unable to create new native thread


Details

    Description

      Please see https://issues.jenkins.io/browse/JENKINS-65873?focusedCommentId=424033&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-424033 and that ticket in general. This issue only happens when there is a git checkout stage in the pipeline. One suggestion (https://github.com/jenkinsci/remoting/pull/505#issuecomment-1062281877) is to downgrade the git client plugin to 3.7.0. The suspicion seems to fall somewhere on the git side of things. Any ideas, suggestions, or things we can do to help debug?

          Activity

            markewaite Mark Waite added a comment -

            I must be missing the context. The git checkout for a job happens on an agent, not on the controller. Usually, the agent is only performing a single job before it exits. I assumed that the message about being unable to create a new native thread was being reported by the agent, not by the controller. Is the message happening on the agent or on the controller?

            My suggestion to disable the performance optimization was offered just in case the git plugin was ignoring the fact that you have not enabled JGit. Switching to JGit on the controller will switch from forking a separate git process to performing the git operations inside the Jenkins process (controller or agent, as required by the context). We found in performance testing that it was faster to use JGit with small repositories and to use CLI git with large repositories. Your mileage may vary (as it does in almost all performance-related topics).

            dg424 Donald Gobin added a comment - - edited

            The checkout is happening on both. There is a checkout on the master to get the Jenkinsfile pipeline definition, libs, etc. from the repo (it actually does a full checkout on both sides, see https://issues.jenkins.io/browse/JENKINS-64199). The checkout is then done again on the agent, where the pipeline can work with the repo contents.

            By the way, I linked this ticket to the other one, https://issues.jenkins.io/browse/JENKINS-65873, where the latest update seems to point to a kernel bug and a workaround in JDK 18 (we're on Jenkins LTS 2.303 right now). But again, we're only able to reproduce this problem when we have a git checkout stage in our pipeline. Without this stage, we do not see the error.

             


            vlatombe Vincent Latombe added a comment -

            See JENKINS-65873. The problem lies between the JDK and the Linux kernel. It's not a Java bug.
            dg424 Donald Gobin added a comment -

            vlatombe This could be part of the problem, but I'm not sure it's the whole story. Note that we had not seen this problem before the middle of last year, and we've been running Jenkins for much longer. If it was a Linux bug that was there all along, why didn't we see it earlier? And again, we cannot reproduce it without the git stage in the pipeline. So, two pieces of evidence: 1) it did not start to happen until the middle of last year, and 2) it does not happen unless there is a git checkout stage in the pipeline. markewaite The reason it was suggested to downgrade to the git plugin 3.7.0 is that this version predates the middle of last year, when our problem started to happen. I acknowledge that the plugin, in our case, just spawns the git executable and that this should not cause a leak, at least not on the Linux OS process side, but I don't know what else the plugin does internally.
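One way to test the leak hypothesis above is to compare thread counts before and after a run that includes the git checkout stage. The following is only a minimal sketch: the class name `ThreadCensus` and the name-grouping rule are hypothetical, and it reports on the JVM it runs in, so it would have to be adapted to run inside the controller or agent process, which is not shown here.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Jenkins or any plugin.
public class ThreadCensus {

    // Drops a trailing counter so "pool-1-thread-7" and "pool-1-thread-8" share one key.
    static String groupKey(String threadName) {
        return threadName.replaceAll("[\\s#-]*\\d+$", "");
    }

    // Counts how many thread names fall into each group.
    static Map<String, Integer> countByGroup(Collection<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            counts.merge(groupKey(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Names of all live threads in this JVM only.
        Collection<String> names = Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .collect(Collectors.toList());

        System.out.println("live threads: " + names.size());

        // Print the ten largest groups, biggest first.
        countByGroup(names).entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}
```

If one group keeps growing across builds that include the checkout stage and stays flat without it, that would point at threads accumulating inside the JVM. A flat count would instead be consistent with the limit being hit at the operating system level.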

            rhinoceros rhinoceros.xn added a comment -

            After adding sleep(10) before the git checkout, this problem no longer occurs.

            Adding sleep(10) before the git or checkout step may be a workaround.

            sleep(10)
            checkout changelog: false, poll: false, scm: ........
            
            OR
            
            sleep(10)
            git branch: 'master', credentialsId: '******', url: 'git@git.yourcompany.com:xx/zz.git'

            People

              Assignee: Unassigned
              Reporter: Donald Gobin (dg424)