We hit a problem on Jenkins ver. 2.204.2: the ping to a build agent connected via SSH started blocking other requests because the agent was slow to respond. Even the web interface timed out. We did not wait for the agent's ping response and restarted the master instead.
The problematic thread, which also showed a high CPU load:
"Executor #3 for tkles-jenci0013" #899694 daemon prio=5 os_prio=0 tid=0x000000000179c800 nid=0x4f9 runnable [0x00007f3b7a2b0000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Thread.setPriority0(Native Method)
    at java.lang.Thread.setPriority(Thread.java:1095)
    at java.lang.Thread.init(Thread.java:417)
    at java.lang.Thread.init(Thread.java:349)
    at java.lang.Thread.<init>(Thread.java:678)
    at java.util.concurrent.Executors$DefaultThreadFactory.newThread(Executors.java:613)
    at hudson.util.DaemonThreadFactory.newThread(DaemonThreadFactory.java:46)
    at hudson.util.ExceptionCatchingThreadFactory.newThread(ExceptionCatchingThreadFactory.java:50)
    at hudson.util.NamingThreadFactory.newThread(NamingThreadFactory.java:52)
    at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:619)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
    at java.util.concurrent.CompletableFuture.asyncSupplyStage(CompletableFuture.java:1604)
    at java.util.concurrent.CompletableFuture.supplyAsync(CompletableFuture.java:1830)
    at jenkins.metrics.impl.JenkinsMetricProviderImpl$ScheduledRate.onLeft(JenkinsMetricProviderImpl.java:919)
    at hudson.model.Queue$LeftItem.enter(Queue.java:2788)
    at hudson.model.Queue.onStartExecuting(Queue.java:1168)
    at hudson.model.Executor$1.call(Executor.java:359)
    at hudson.model.Executor$1.call(Executor.java:345)
    at hudson.model.Queue._withLock(Queue.java:1451)
    at hudson.model.Queue.withLock(Queue.java:1312)
    at hudson.model.Executor.run(Executor.java:345)

   Locked ownable synchronizers:
    - <0x000000025d637d48> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
At the same time, the waiting threads looked like this:
"jenkins.util.Timer [#6]" #117 daemon prio=5 os_prio=0 tid=0x00007f3b98002800 nid=0x12f1 waiting on condition [0x00007f3bea1e2000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000025d637d48> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at hudson.model.Queue.maintain(Queue.java:1474)
    at hudson.model.Queue$MaintainTask.doRun(Queue.java:2898)
    at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:70)
    at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
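For readers less used to thread dumps: both traces refer to the same lock address, <0x000000025d637d48>. The executor thread lists it under "Locked ownable synchronizers" and the timer thread is parked waiting for it. The pattern can be reproduced in isolation with the rough sketch below. This is not Jenkins code; the class, method, and thread names are made up for illustration, and the latch merely stands in for whatever slow work was running while the lock was held.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class QueueLockContentionSketch {
    // Hypothetical stand-in for the single ReentrantLock seen in both traces.
    private static final ReentrantLock queueLock = new ReentrantLock();

    /**
     * Returns true if the calling thread (playing the role of the timer thread)
     * cannot acquire the lock while another thread (playing the executor) holds it.
     */
    public static boolean maintainerBlockedWhileExecutorHoldsLock() throws InterruptedException {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread executor = new Thread(() -> {
            queueLock.lock();
            try {
                lockHeld.countDown();   // signal that the lock is now held
                release.await();        // simulate slow work done under the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                queueLock.unlock();
            }
        }, "executor-sketch");
        executor.setDaemon(true);
        executor.start();

        lockHeld.await();
        // A plain lock() would park here indefinitely, as in the second trace;
        // tryLock with a timeout lets the sketch observe the contention and move on.
        boolean acquired = queueLock.tryLock(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            queueLock.unlock();
        }

        release.countDown();            // let the simulated executor finish
        executor.join();
        return !acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("maintainer blocked: " + maintainerBlockedWhileExecutorHoldsLock());
    }
}
```

Every thread that needs the lock queues up behind the holder in the same way, which would be consistent with the web interface timing out as well.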
I think this is a general defect in how the master connects to the agent via SSH.