JENKINS-69025

Massive build slowdown with scripted pipeline (and some plugins?) and many builds in queue


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • None
    • Jenkins 2.332.3
      Throttle plugin 2.8
      All other plugins at the latest available versions

       

      The builds slow down to a crawl. Every 'node(...) { }' statement takes multiple seconds to enter, and other operations also appear very slow. The UI itself stays fairly responsive until you try to change some configuration; then a single change (e.g. putting a node offline) can take several minutes.

      The thread dump shows that executors spend most of their time waiting for Queue.maintain to release the queue lock. Judging by how slow the builds are, I assume this happens on every step.

      "AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#425841]" Id=1116475 Group=main RUNNABLE
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1323)
      	at java.base@11.0.11/java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:738)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$TimingFlowNodeStorage.getNode(CpsFlowExecution.java:1841)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getNode(CpsFlowExecution.java:1174)
      	at hudson.plugins.throttleconcurrents.ThrottleJobProperty.getThrottledPipelineRunsForCategory(ThrottleJobProperty.java:340)
      	at hudson.plugins.throttleconcurrents.ThrottleQueueTaskDispatcher.throttleCheckForCategoriesOnNode(ThrottleQueueTaskDispatcher.java:132)
      	at hudson.plugins.throttleconcurrents.ThrottleQueueTaskDispatcher.canTakeImpl(ThrottleQueueTaskDispatcher.java:101)
      	at hudson.plugins.throttleconcurrents.ThrottleQueueTaskDispatcher.canTake(ThrottleQueueTaskDispatcher.java:62)
      	at hudson.model.queue.QueueTaskDispatcher.canTake(QueueTaskDispatcher.java:101)
      	at hudson.model.Queue$JobOffer.getCauseOfBlockage(Queue.java:276)
      	at hudson.model.Queue.maintain(Queue.java:1637)
      	at hudson.model.Queue$1.call(Queue.java:330)
      	at hudson.model.Queue$1.call(Queue.java:327)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:109)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:99)
      	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
      	at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:121)
      	at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
      
      	Number of locked synchronizers = 1
      	- java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26
      
       

       

      Executors wait for the Periodic Jenkins queue maintenance thread like this:

      "Executor #7 for Win-CrashReports-Node" Id=1116435 Group=main WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26 owned by "AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#425841]" Id=1116475
      	at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
      	-  waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26
      	at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240)
      	at java.base@11.0.11/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:267)
      	at hudson.model.Queue._withLock(Queue.java:1454)
      	at hudson.model.Queue.withLock(Queue.java:1312)
      	at hudson.model.Executor.run(Executor.java:352)
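
The pattern in this trace (one thread holding the queue lock while executors park behind it) can be reproduced in isolation. The following is a minimal sketch, not Jenkins code: the class name, method name and the plain ReentrantLock standing in for the Queue lock are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: a plain ReentrantLock stands in for the Jenkins Queue lock.
final class QueueLockContentionSketch {

    /** Returns how many "executor" threads were seen parked behind the lock holder. */
    static int waitersBehindMaintenance(int executors) {
        ReentrantLock queueLock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);     // signals that the lock is taken
        CountDownLatch release = new CountDownLatch(1);  // tells the holder to let go

        Thread maintenance = new Thread(() -> {
            queueLock.lock();
            try {
                held.countDown();
                release.await();   // stand-in for a long maintenance pass
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                queueLock.unlock();
            }
        }, "maintenance");

        int parked = 0;
        try {
            maintenance.start();
            held.await();
            for (int i = 0; i < executors; i++) {
                new Thread(() -> {
                    queueLock.lock();   // parks here while the maintenance thread holds the lock
                    queueLock.unlock();
                }, "executor-" + i).start();
            }
            // Poll until every executor is queued on the lock, or give up after about 5 seconds.
            for (int attempt = 0; attempt < 500 && parked < executors; attempt++) {
                parked = queueLock.getQueueLength();
                if (parked < executors) {
                    Thread.sleep(10);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();   // always let the holder finish so no thread is left parked
        }
        return parked;
    }

    public static void main(String[] args) {
        System.out.println("parked executors: " + waitersBehindMaintenance(6));
    }
}
```

However long the holder keeps the lock, every other thread that needs it makes no progress for that whole time, which would match node steps taking seconds to enter.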
      
       

       

      Timer threads also contend for the same Queue lock in parallel with the Periodic queue maintenance thread. This one is waiting to schedule a cron-triggered build:

      "jenkins.util.Timer [#7]" Id=57 Group=main WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26 owned by "AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#425841]" Id=1116475
      	at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
      	-  waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26
      	at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917)
      	at java.base@11.0.11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240)
      	at java.base@11.0.11/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:267)
      	at hudson.model.Queue.schedule2(Queue.java:570)
      	at jenkins.model.ParameterizedJobMixIn.scheduleBuild2(ParameterizedJobMixIn.java:159)
      	at jenkins.model.ParameterizedJobMixIn.scheduleBuild(ParameterizedJobMixIn.java:117)
      	at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.scheduleBuild(ParameterizedJobMixIn.java:391)
      	at hudson.triggers.TimerTrigger.run(TimerTrigger.java:67)
      	at hudson.triggers.Trigger.checkTriggers(Trigger.java:292)
      	at hudson.triggers.Trigger$Cron.doRun(Trigger.java:234)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
      	at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
      	at java.base@11.0.11/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      	at java.base@11.0.11/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      	at java.base@11.0.11/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      	at java.base@11.0.11/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base@11.0.11/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
      
      	Number of locked synchronizers = 1
      	- java.util.concurrent.ThreadPoolExecutor$Worker@4ff95144
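
To gauge how many threads are parked behind a given lock owner, the thread dump text can be tallied. The sketch below is a hypothetical helper (LockWaiterTally is not part of Jenkins); it assumes header lines of the form shown in the traces above, i.e. `WAITING on <lock> owned by "<thread name>"`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts waiting threads per lock owner in a thread dump.
final class LockWaiterTally {

    // Matches e.g.: WAITING on java.util...NonfairSync@6a4b9f26 owned by "Some thread"
    private static final Pattern OWNED_BY =
            Pattern.compile("(?:WAITING|BLOCKED) on (\\S+) owned by \"([^\"]+)\"");

    /** Maps each owning thread name to the number of threads waiting on it. */
    static Map<String, Integer> waitersByOwner(String dump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = OWNED_BY.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(2), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String dump =
            "\"Executor #7 for Win-CrashReports-Node\" Id=1116435 Group=main WAITING on "
          + "java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26 owned by "
          + "\"AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#425841]\" Id=1116475\n"
          + "\"jenkins.util.Timer [#7]\" Id=57 Group=main WAITING on "
          + "java.util.concurrent.locks.ReentrantLock$NonfairSync@6a4b9f26 owned by "
          + "\"AtmostOneTaskExecutor[Periodic Jenkins queue maintenance] [#425841]\" Id=1116475\n";
        waitersByOwner(dump).forEach((owner, n) -> System.out.println(n + " waiting on " + owner));
    }
}
```

Running this over a full dump would show whether the queue maintenance thread is the dominant lock owner or just one of several.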
      
      

       

      We use the throttle plugin in some areas of our scripted pipeline, and we run a lot of node steps. There are also many freestyle builds. In total, the queue contains 1200 items right now, and we have around 30 static build nodes with around 6 executors each. The Jenkins server has 8 CPU cores and 16 GB of RAM. The server itself stays very responsive, but builds and any node configuration operations are extremely slow, barely progressing at all. There are around 20 throttle categories, plus category overrides on 20 nodes for 3 categories each.

      I tried using the throttle plugin sparingly, but the scripted pipeline node step was still really slow, so I am not sure whether the problem lies in the plugin or in Jenkins core.

      UPDATE: I removed most of the plugins I could think of that might interfere with the queue (including Priority Sorter and Scoring Load Balancer) and stopped using the throttle plugin at some crucial points (we can't really abandon it), but performance still degrades as the queue grows. There seem to be periodic CPU usage spikes caused by queue maintenance, which block all builds while they last.
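
For a sense of scale, here is a back-of-the-envelope estimate built from the numbers above. It assumes the worst case, in which every queued item is checked against every executor in a single maintenance pass (the first trace shows Queue.maintain calling JobOffer.getCauseOfBlockage and then the throttle plugin's canTake). In practice only buildable items and idle executors take part, so the real count should be lower. The per-check cost used in main is a made-up figure, not a measurement.

```java
// Rough estimate only: the "every item against every executor" model is an assumption.
final class QueueCheckEstimate {

    /** Upper bound on dispatcher checks in one maintenance pass. */
    static long worstCaseChecks(long queuedItems, long nodes, long executorsPerNode) {
        long executors = nodes * executorsPerNode;
        return queuedItems * executors;
    }

    /** Time one pass would take if each check cost the given number of microseconds. */
    static double passSeconds(long checks, long microsPerCheck) {
        return checks * microsPerCheck / 1_000_000.0;
    }

    public static void main(String[] args) {
        long checks = worstCaseChecks(1200, 30, 6);   // 1200 items, 30 nodes, 6 executors each
        System.out.println("worst-case checks per pass: " + checks);   // 216000
        // 100 microseconds per check is a hypothetical cost chosen for illustration.
        System.out.println("seconds per pass at 100 us/check: " + passSeconds(checks, 100));
    }
}
```

Because the bound is the product of queue length and executor count, the cost of a pass grows in step with the queue, and the whole pass runs while holding the lock that executors and timers are waiting on.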

            Assignee: Unassigned
            Reporter: Edgars Batna (gl1koz3)
            Votes: 0
            Watchers: 1