SCMEvent threads waiting on rate limit when rate limit isn't close to being hit

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • GitHub Branch Source Plugin version 2.11.4

      Our organization is experiencing large delays when Jenkins processes webhooks (on the order of minutes up to hours). In a thread dump, every SCMEvent thread shows the following:

      "class org.jenkinsci.plugins.github_branch_source.PullRequestGHEventSubscriber$SCMHeadEventImpl Wed Apr 20 14:18:16 EDT 2022 / SCMEvent [#4]" Id=2608 Group=main TIMED_WAITING
      	at java.lang.Thread.sleep(Native Method)
      	at org.jenkinsci.plugins.github_branch_source.ApiRateLimitChecker$LocalChecker.waitUntilRateLimit(ApiRateLimitChecker.java:325)
      	at org.jenkinsci.plugins.github_branch_source.ApiRateLimitChecker$LocalChecker.checkRateLimit(ApiRateLimitChecker.java:261)
      	at org.jenkinsci.plugins.github_branch_source.ApiRateLimitChecker$RateLimitCheckerAdapter.checkRateLimit(ApiRateLimitChecker.java:242)
      	at org.kohsuke.github.GitHubRateLimitChecker.checkRateLimit(GitHubRateLimitChecker.java:128)
      	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:383)
      	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:355)
      	at org.kohsuke.github.Requester.fetch(Requester.java:76)
      	at org.kohsuke.github.GHRepository.read(GHRepository.java:132)
      	at org.kohsuke.github.GHPerson.getRepository(GHPerson.java:146)
      	at org.jenkinsci.plugins.github_branch_source.GitHubSCMNavigator.visitSource(GitHubSCMNavigator.java:1389)
      	at org.jenkinsci.plugins.github_branch_source.GitHubSCMNavigator.visitSources(GitHubSCMNavigator.java:926)
      	at jenkins.scm.api.SCMNavigator.visitSources(SCMNavigator.java:221)
      	at jenkins.branch.OrganizationFolder$SCMEventListenerImpl.onSCMHeadEvent(OrganizationFolder.java:1049)
      	at jenkins.scm.api.SCMHeadEvent$DispatcherImpl.fire(SCMHeadEvent.java:246)
      	at jenkins.scm.api.SCMHeadEvent$DispatcherImpl.fire(SCMHeadEvent.java:229)
      	at jenkins.scm.api.SCMEvent$Dispatcher.run(SCMEvent.java:505)
      	at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:750)
      
      	Number of locked synchronizers = 1
      	- java.util.concurrent.ThreadPoolExecutor$Worker@594bc15a
      

      However, the rate limit for the service account has not come close to 0. The minimum observed is 3000 out of 5000 remaining. We see this on our dashboards as well as when testing the connection from the Jenkins UI.

      We are using public GitHub. The rate limiting strategy is set to "Throttle at/near rate limit". It used to be set to "Normalize API requests", but this exacerbated the problem. 

      Notably, the following is seen in the github branch source logs:

      2022-04-20 20:09:02.440+0000 [id=247607]        INFO    o.j.p.g.ApiRateLimitChecker$RateLimitCheckerAdapter#checkRateLimit: LocalChecker for rate limit was not set for this thread. Configured using system settings.
      2022-04-20 20:09:02.512+0000 [id=247609]        INFO    o.j.p.g.ApiRateLimitChecker$RateLimitCheckerAdapter#checkRateLimit: LocalChecker for rate limit was not set for this thread. Configured using system settings.
      2022-04-20 20:09:02.512+0000 [id=247608]        INFO    o.j.p.g.ApiRateLimitChecker$RateLimitCheckerAdapter#checkRateLimit: LocalChecker for rate limit was not set for this thread. Configured using system settings. 

      The following is also seen every few seconds in the github branch source logs:

      2022-04-20 20:09:21.187+0000 [id=247232]        FINE    jenkins.scm.api.SCMSource#defaultListener: Connecting to https://api.github.com using REDACTED 

      Let me know if any other information would be helpful.

          Jesse Glick added a comment -

          FWIW in https://github.com/jenkinsci/github-branch-source-plugin/pull/313#discussion_r456647733 bitwiseman says

          There is literally never a reason to not check rate limits when interacting with github.com.

          Glenn Duffy added a comment - - edited

          jglick Thanks for the response. This is still an issue in our organization.

          Notably, we tried compiling a custom version of the github branch source plugin to comment out the sleep. However, another sleep manifested in another plugin (I believe it was the git plugin) with a similar rate limit.

          I agree that it is confusing for Never check rate limit not to work on public github.com. That defeats the purpose of that configuration setting in our environment.

          What is the next step for this issue? Would it be feasible to modify the implementation of Never check rate limit to work on github.com and for that to bypass any rate limit checks in any of these git plugins?

          Jesse Glick added a comment -

          modify the implementation of Never check rate limit to work on github.com

          I attempted this in https://github.com/jenkinsci/github-branch-source-plugin/pull/653. If you like you can try installing the resulting experimental build: https://repo.jenkins-ci.org/incrementals/org/jenkins-ci/plugins/github-branch-source/1703.vece6b_0c9c3ca/github-branch-source-1703.vece6b_0c9c3ca.hpi but beware source code comments such as

          GitHub treats clients that exceed their rate limit very harshly.

          I doubt my PR is the proper fix; it is more like a workaround pending investigation of the real cause. That cause is possibly what https://github.com/jenkinsci/github-branch-source-plugin/pull/654 addresses, though I doubt it: that fix would explain the "not set for this thread" warning, but the fallback still ought to behave as documented (not do anything until you are close to the limit) except in unlikely scenarios where you have multiple GitHub servers configured.

          another sleep manifested in another plugin (I believe it was the git plugin)

          github plugin perhaps? The git plugin knows nothing of REST API calls or rate limits.

          Glenn Duffy added a comment -

          Thanks jglick for the information.

          Yes, it was likely the github plugin. I'll post a thread dump if I can capture one.

          Glenn Duffy added a comment -

          I ended up implementing robhamilton's workaround of interrupting any SCMEvent threads in the TIMED_WAITING state and it's working well. Thanks Rob for the suggestion. We haven't seen the system get near the rate limit so far. I may implement an improvement that forgoes interrupting the threads if the remaining rate limit is below a certain amount. This would be a makeshift re-implementation of what Jenkins should be doing itself.

          I don't consider this workaround sufficient to close this Jira though. Jenkins should really be smarter about the current rate limit and not need to throttle unless it gets near the actual limit.
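
          For readers who want to try something similar, here is a rough sketch of the workaround described above, including the "skip when the remaining quota is low" improvement. It is not the code actually deployed here: the class name, method names, and the `minRemaining` threshold are all hypothetical, and the caller is assumed to obtain the remaining quota by some other means. It matches threads by the "SCMEvent" name and the `ApiRateLimitChecker` frame seen in the thread dump in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; none of these names come from Jenkins or its plugins.
final class RateLimitWaitInterrupter {

    // True when a thread looks like an SCMEvent worker sleeping inside the
    // plugin's rate limit checker, as in the thread dump from this issue.
    static boolean isStuckOnRateLimit(String threadName, Thread.State state,
                                      StackTraceElement[] stack) {
        if (!threadName.contains("SCMEvent") || state != Thread.State.TIMED_WAITING) {
            return false;
        }
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().contains("ApiRateLimitChecker")) {
                return true;
            }
        }
        return false;
    }

    // Interrupts matching threads, but only while the remaining quota
    // (supplied by the caller) is at or above minRemaining, so a genuinely
    // low quota is still left to throttle. Returns the interrupted threads.
    static List<Thread> interruptStuckThreads(int remaining, int minRemaining) {
        List<Thread> interrupted = new ArrayList<>();
        if (remaining < minRemaining) {
            return interrupted;
        }
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (isStuckOnRateLimit(thread.getName(), thread.getState(), entry.getValue())) {
                thread.interrupt();
                interrupted.add(thread);
            }
        }
        return interrupted;
    }
}
```

          Something like `RateLimitWaitInterrupter.interruptStuckThreads(3000, 1000)` would then be run periodically. Whether an interrupt cleanly aborts the wait depends on how the plugin handles `InterruptedException`, so treat this as a stopgap rather than a fix.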

          Sam Gleske added a comment -

          jglick quote

          FWIW in https://github.com/jenkinsci/github-branch-source-plugin/pull/313#discussion_r456647733 Liam Newman says

          There is literally never a reason to not check rate limits when interacting with github.com.

          I think bitwiseman is misguided on this one. GitHub support seems to disagree for GitHub Enterprise Cloud customers. They want to see API limits hit before being willing to raise limits for Apps. I filed a related ticket to stop doing this for all use cases (including github.com).

          Ref JENKINS-71849

          Liam Newman added a comment - - edited

          sag47 jglick

          I've commented on the referenced ticket. No, users must not be allowed to exceed rate limits on GitHub.com. 

          The existing implementation should allow customers to reach their rate limits AND not exceed them. If it doesn't, that's the bug to fix. 

           

          Rob Hamilton added a comment -

          This is definitely a bug for us: we have multiple orgs with huge limits that we never reach, and yet this implementation throttles. The problem appears to be with the "anonymous" user, which (going by the logs) the plugin seems to use as some sort of fallback. That user gets throttled, which then sends everything into a waiting state.

          Liam Newman added a comment -

          robhamilton 

          Could you provide more logs? 

          Nabeel added a comment -

          We are also affected by this issue with the Jenkins LTS version 2.462.3-lts-jdk17 and the below plugin versions:

          1. scm-api:696.v778d637b_a_762
          2. github-api:1.316-451.v15738eef3414
          3. github-branch-source:1741.va_3028eb_9fd21

          We are using multiple GitHub Apps for a large-sized organisation.

            Assignee: Unassigned
            Reporter: Glenn Duffy (gduffy)
            Votes: 2
            Watchers: 7