• Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

      I'm seeing delays of about an hour from when a multi branch event is created in GitHub to when Jenkins actually processes it.

      If I run the following in the script console:

      jenkins.scm.api.SCMEvent.executorService
      

      I see:

      java.util.concurrent.ScheduledThreadPoolExecutor@8b0291b[Running, pool size = 10, active threads = 10, queued tasks = 2401, completed tasks = 12838]
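
For readers less familiar with this output: the four counters map directly onto accessors of the executor. The following is a minimal, self-contained sketch (the class and method names are invented for illustration; this is not the Jenkins code) that saturates a small pool with blocking tasks and prints the same counters. With every worker blocked, the remaining tasks simply pile up in the queue, which is the pattern visible above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: shows how the counters in the executor's toString() map to its accessors. */
final class ExecutorBacklogDemo {

    /** Formats the same four counters that ThreadPoolExecutor.toString() reports. */
    static String describe(ScheduledThreadPoolExecutor pool) {
        return "pool size = " + pool.getPoolSize()
                + ", active threads = " + pool.getActiveCount()
                + ", queued tasks = " + pool.getQueue().size()
                + ", completed tasks = " + pool.getCompletedTaskCount();
    }

    /** Blocks every worker of a fixed-size pool and returns how many tasks are left waiting. */
    static int queuedWhenSaturated(int threads, int tasks) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(threads);
        CountDownLatch started = new CountDownLatch(threads);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    started.countDown();
                    await(release); // stand-in for a slow remote call
                });
            }
            await(started); // wait until every worker is occupied
            System.out.println(describe(pool));
            return pool.getQueue().size();
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();     // queued tasks still run, then the workers exit
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            if (!latch.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for latch");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        queuedWhenSaturated(2, 10);
    }
}
```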
      

      Relevant code appears to be around:
      https://github.com/jenkinsci/branch-api-plugin/blob/master/src/main/java/jenkins/branch/MultiBranchProject.java#L1179-L1198
      and

      https://github.com/jenkinsci/branch-api-plugin/blob/master/src/main/java/jenkins/branch/MultiBranchProject.java#L1385

      Running:

      Jenkins.get().getAllItems(jenkins.branch.MultiBranchProject.class).size()
      

      Gives:
      579 multi branch projects.

      From what I can see in the log in
      /var/jenkins_home/jenkins.branch.* logs

      Events seem to be processed in 0-1 seconds, but comparing the number of events processed over an hour shows we were only processing about 22 a minute, so some must be quite slow.
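
As a rough sanity check of those numbers, here is a small back-of-the-envelope calculation. It assumes that all 10 threads are busy the whole time and that each logged event corresponds to one queued task; neither assumption is confirmed in this issue, and the class and method names are made up for the example.

```java
import java.util.Locale;

/** Back-of-the-envelope estimate from the figures quoted in this issue. */
final class EventBacklogEstimate {

    /** Average wall-clock seconds each event occupies a thread, if all threads are always busy. */
    static double meanSecondsPerEvent(int busyThreads, double eventsPerMinute) {
        return busyThreads * 60.0 / eventsPerMinute;
    }

    /** Minutes needed to work through the backlog at the observed rate, ignoring new arrivals. */
    static double minutesToDrain(int queuedTasks, double eventsPerMinute) {
        return queuedTasks / eventsPerMinute;
    }

    public static void main(String[] args) {
        // 10 threads, 22 events per minute and 2401 queued tasks are the values reported above.
        System.out.printf(Locale.ROOT, "mean time per event: %.1f s%n", meanSecondsPerEvent(10, 22));
        System.out.printf(Locale.ROOT, "time to drain queue: %.0f min%n", minutesToDrain(2401, 22));
    }
}
```

With these inputs the estimate comes to roughly 27 seconds per event on average and about 109 minutes to drain 2401 queued tasks. An average that high, next to log entries showing 0-1 seconds, suggests a minority of events take far longer than the rest.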

      We only get to keep 15 minutes worth of logs as the file max size is set to 33 kilobytes for some reason:
      https://github.com/jenkinsci/branch-api-plugin/blob/master/src/main/java/jenkins/branch/MultiBranchProject.java#L1138-L1142

      Context:
      Large organisation with 1.7K repositories, and an organisation webhook pointing at Jenkins

      Few ideas so far:
      1. Throw more threads at it

      https://github.com/jenkinsci/scm-api-plugin/blob/master/src/main/java/jenkins/scm/api/SCMEvent.java#L215

      Is hardcoded to 10 threads at a time

      2. Keep logs for longer

      3. See if there's something specific that could be holding this up?

      4. Global configuration to filter projects out of webhook processing? We have lots that don't need to be processed and will never be matched
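
To make idea 1 concrete, here is a minimal sketch of a pool whose size comes from a system property instead of a hardcoded constant, plus a runtime resize. The property name `example.scm.event.threads` and the class are hypothetical, invented for this example; this is not an existing scm-api option, and the real plugin code may be structured differently.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;

/** Sketch of a configurable event pool; not the scm-api implementation. */
final class ConfigurableEventPool {

    /** Hypothetical property name, invented for this sketch. */
    static final String THREADS_PROPERTY = "example.scm.event.threads";

    /** Default matching the hardcoded value mentioned in this issue. */
    static final int DEFAULT_THREADS = 10;

    /** Reads the pool size from the system property, falling back to the default. */
    static int configuredThreads() {
        // Integer.getInteger returns the default when the property is unset or not a number.
        int requested = Integer.getInteger(THREADS_PROPERTY, DEFAULT_THREADS);
        return Math.max(1, requested);
    }

    static ScheduledThreadPoolExecutor newPool() {
        return new ScheduledThreadPoolExecutor(configuredThreads());
    }

    /** Changes the number of worker threads of a running pool. */
    static void resize(ScheduledThreadPoolExecutor pool, int threads) {
        pool.setCorePoolSize(threads);
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor pool = newPool();
        System.out.println("core pool size: " + pool.getCorePoolSize());
        resize(pool, 30);
        System.out.println("core pool size after resize: " + pool.getCorePoolSize());
        pool.shutdown();
    }
}
```

More threads only help if the remote side can absorb the extra concurrent requests, so a configurable value is probably safer than simply raising the constant.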

        1. threads.6.20220325093536.txt
          667 kB
        2. threads.6.20220325093615.txt
          642 kB
        3. threads.6.20220325093629.txt
          678 kB
        4. threads.6.20220325093550.txt
          670 kB
        5. threads.6.20220325093641.txt
          689 kB
        6. threads.6.20220325093602.txt
          642 kB
        7. threads.6.20220325093655.txt
          716 kB
        8. threads.6.20220325093732.txt
          698 kB
        9. threads.6.20220325093759.txt
          671 kB
        10. threads.6.20220325093745.txt
          662 kB
        11. threads.6.20220325093706.txt
          696 kB
        12. threads.6.20220325093720.txt
          704 kB
        13. over-1-second.txt
          21 kB
        14. over-10-seconds.txt
          7 kB
        15. over-10-seconds-29-3.txt
          19 kB
        16. events-trace.txt
          53 kB
        17. events-trace-90s.txt
          53 kB
        18. github-api-get-repos-sorted.txt
          0.4 kB

          [JENKINS-68116] Slow processing of multi branch events

          Tim Jacomb created issue -

          Tim Jacomb added a comment -

          jglick / teilo do you happen to have any ideas or suggestions?


          Jesse Glick added a comment -

          I am afraid I am not intimately familiar with SCMEvent.


          Tim Jacomb added a comment -

          From what I can tell from the thread dumps, all 10 threads are waiting for IO from GitHub.


          James Nord added a comment -

          I think this could be exactly what happens when the user API quota is exhausted or low.

          Without access to the indexing logs, it is hard to say for sure.
          Try switching to GitHub App authentication, which scales API limits with the number of repositories in the org.


          Tim Jacomb added a comment -

          GitHub App authentication is already enabled. I can upload the logs, but I didn't see anything useful there. It's been running for 1.5 hours with 30 threads and has nothing queued anymore.


          James Nord added a comment (edited) -

          In that case, off the top of my head, I am not sure.
          FWIW we have a similar-sized org (but more controllers); one controller has 120+ MBP jobs, and we have not observed any issues.

          The caching thread pool does not help here, as the CPU / alive time does not tell me whether a request has been stuck waiting for GH to respond for 30 seconds or 1 ms...
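
One way to get at that missing information, sketched here purely as an illustration, is to wrap each task so its start time is recorded while it runs; a diagnostic call can then report how long every in-flight task has been going. The class below is hypothetical and not part of Jenkins or scm-api; wiring it into the real dispatcher is left open.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper: records when each worker thread started its current task. */
final class InFlightTracker {

    private final Map<Thread, Long> startedAtNanos = new ConcurrentHashMap<>();
    private final LongSupplier nanoClock;

    /** The clock is injectable so the behaviour can be checked without sleeping. */
    InFlightTracker(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    InFlightTracker() {
        this(System::nanoTime);
    }

    /** Wraps a task so that its start time is visible for as long as it is running. */
    Runnable track(Runnable task) {
        return () -> {
            Thread self = Thread.currentThread();
            startedAtNanos.put(self, nanoClock.getAsLong());
            try {
                task.run();
            } finally {
                startedAtNanos.remove(self);
            }
        };
    }

    /** Returns thread name mapped to elapsed milliseconds for every task currently running. */
    Map<String, Long> inFlightMillis() {
        long now = nanoClock.getAsLong();
        Map<String, Long> elapsed = new TreeMap<>();
        startedAtNanos.forEach((thread, start) ->
                elapsed.put(thread.getName(), (now - start) / 1_000_000));
        return elapsed;
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker();
        // Inside the task, the tracker already lists the current thread as in flight.
        tracker.track(() -> System.out.println("in flight: " + tracker.inFlightMillis())).run();
        System.out.println("after completion: " + tracker.inFlightMillis());
    }
}
```

Tasks would be submitted as `pool.execute(tracker.track(task))`, and `inFlightMillis()` could be called from a script console or a periodic logger to tell a 30-second stall apart from a 1 ms request.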

          Have you only just started seeing this, is it sporadic? GitHub has been flaky recently - even if their status page says otherwise


          Tim Jacomb added a comment -

          > Have you only just started seeing this, is it sporadic? GitHub has been flaky recently - even if their status page says otherwise

          14th February was when it was first reported as far as I can tell.


            Assignee: Unassigned
            Reporter: Tim Jacomb (timja)
            Votes: 0
            Watchers: 5