Bug
Resolution: Unresolved
Blocker
We are currently in the process of upgrading our Jenkins version from 2.387.3 to 2.414.3. Following the upgrade, Jenkins appears to be functioning normally without requiring any plugin updates.
As part of the upgrade process, we updated all possible plugins through the UI, which did not display any warnings. Notably, the update included the SnakeYAML plugin. However, upon further investigation, we discovered an issue with our existing Kubernetes plugins, specifically:
- Kubernetes: 3937.vd7b_82db_e347b_
- Kubernetes-cli: 1.12.0
- Kubernetes-client-api: 6.4.1-215.v2ed17097a_8e9
- Kubernetes-credentials: 0.10.0
To address this issue, and because we needed to upgrade the Kubernetes plugin anyway, we updated the SSH Credentials, Kubernetes Credentials, Kubernetes, and Kubernetes CLI plugins to the following versions:
- Kubernetes: 4054.v2da_8e2794884
- Kubernetes-cli: 1.12.1
- Kubernetes-client-api: 6.10.0-240.v57880ce8b_0b_2
- Kubernetes-credentials: 174.va_36e093562d9
I have attached the plugins.txt files from before and after the Jenkins and plugin update. We used the 2.414.3-lts-rhel-ubi9-jdk17 image for the Jenkins upgrade.
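To make the version changes easier to review, the two plugins.txt files can be compared with a short script. The following is only a minimal sketch: it assumes each file uses one `plugin-id:version` entry per line (the attachments are not reproduced here), and the function names `parse_plugins` and `diff_plugins` are hypothetical. The sample data contains only the four Kubernetes plugin versions listed above.

```python
# Assumption: plugins.txt holds one "plugin-id:version" entry per line.
BEFORE = """\
kubernetes:3937.vd7b_82db_e347b_
kubernetes-cli:1.12.0
kubernetes-client-api:6.4.1-215.v2ed17097a_8e9
kubernetes-credentials:0.10.0
"""

AFTER = """\
kubernetes:4054.v2da_8e2794884
kubernetes-cli:1.12.1
kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
kubernetes-credentials:174.va_36e093562d9
"""


def parse_plugins(text):
    """Parse 'plugin-id:version' lines, skipping blank lines and '#' comments."""
    plugins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition(":")
        plugins[name.strip().lower()] = version.strip()
    return plugins


def diff_plugins(before, after):
    """Return {plugin: (old, new)} for every plugin added, removed, or changed."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    changes = diff_plugins(parse_plugins(BEFORE), parse_plugins(AFTER))
    for name, (old, new) in changes.items():
        print(f"{name}: {old} -> {new}")
```

Running the same comparison on the full attached files would also show any plugins that changed as dependencies of the Kubernetes plugin update.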
After running 2-3 jobs, Jenkins becomes unresponsive and displays the following error logs:
2024-09-27 11:11:11.827+0000 [id=565] WARNING j.m.api.Metrics$HealthChecker#execute: Some health checks are reporting as unhealthy: [thread-deadlock : [jenkins.util.Timer 1 locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@1566b90c (owned by Computer.threadPoolForRemoting 13):
at java.base@17.0.8.1/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@17.0.8.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:938)
at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
at hudson.model.Queue.maintain(Queue.java:1481)
at hudson.model.Queue$MaintainTask.doRun(Queue.java:2919)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
at java.base@17.0.8.1/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base@17.0.8.1/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base@17.0.8.1/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
, Computer.threadPoolForRemoting 8 locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@1566b90c (owned by Computer.threadPoolForRemoting 13):
at java.base@17.0.8.1/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@17.0.8.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:938)
at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
at hudson.model.Queue._withLock(Queue.java:1456)
at hudson.model.Queue.withLock(Queue.java:1314)
at jenkins.model.Nodes.updateNode(Nodes.java:201)
at jenkins.model.Jenkins.updateNode(Jenkins.java:2252)
at hudson.model.Node.save(Node.java:143)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:247)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:297)
at hudson.slaves.SlaveComputer$$Lambda$999/0x00000008010d5a10.call(Unknown Source)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
at java.base@17.0.8.1/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
, Computer.threadPoolForRemoting 13 locked on org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher@517c803b (owned by Computer.threadPoolForRemoting 8):
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.isLaunchSupported(KubernetesLauncher.java:91)
at hudson.slaves.SlaveComputer.isLaunchSupported(SlaveComputer.java:247)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.check(OnceRetentionStrategy.java:81)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.check(OnceRetentionStrategy.java:46)
at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:960)
at hudson.model.Queue._withLock(Queue.java:1397)
at hudson.model.Queue.withLock(Queue.java:1271)
at hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:957)
at hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:147)
at hudson.model.AbstractCIBase$1.run(AbstractCIBase.java:255)
at hudson.model.Queue._withLock(Queue.java:1397)
at hudson.model.Queue.withLock(Queue.java:1271)
at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:238)
at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1693)
at jenkins.model.Nodes$5.run(Nodes.java:279)
at hudson.model.Queue._withLock(Queue.java:1397)
at hudson.model.Queue.withLock(Queue.java:1271)
at jenkins.model.Nodes.removeNode(Nodes.java:270)
at jenkins.model.Jenkins.removeNode(Jenkins.java:2238)
at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:91)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.lambda$done$5(OnceRetentionStrategy.java:142)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$$Lambda$1353/0x00000008013c5220.run(Unknown Source)
at hudson.model.Queue._withLock(Queue.java:1397)
at hudson.model.Queue.withLock(Queue.java:1271)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.lambda$done$6(OnceRetentionStrategy.java:137)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$$Lambda$1352/0x00000008013c4ff8.run(Unknown Source)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
at jenkins.util.ErrorLoggingExecutorService$$Lambda$742/0x0000000800e57420.run(Unknown Source)
at java.base@17.0.8.1/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base@17.0.8.1/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
2024-09-27 11:12:36.768+0000 [id=575] INFO h.TcpSlaveAgentListener$ConnectionHandler#run: Connection #13 from /127.0.0.1:56944 failed: null
2024-09-27 11:13:40.300+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 5 sec
2024-09-27 11:13:45.301+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 10 sec
2024-09-27 11:13:50.302+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 15 sec
2024-09-27 11:13:55.302+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 20 sec
2024-09-27 11:14:00.303+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 25 sec
2024-09-27 11:14:05.303+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 30 sec
2024-09-27 11:14:10.304+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 35 sec
2024-09-27 11:14:15.304+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 40 sec
2024-09-27 11:14:20.305+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 45 sec
2024-09-27 11:14:25.306+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 50 sec
2024-09-27 11:14:30.306+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 55 sec
2024-09-27 11:14:35.307+0000 [id=155] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[tfaudit/feature%2FTOOLS-3513-nexusupgradeTest/15:tfaudit/feature%2FTOOLS-3513-nexusupgradeTest #15] unresponsive for 1 min 0 sec
2024-09-27 11:14:36.768+0000 [id=587] INFO h.TcpSlaveAgentListener$ConnectionHandler#run: Connection #14 from /127.0.0.1:56958 failed: null
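To help with triage, the thread-deadlock message above can be reduced to a wait-for graph showing which thread waits on which lock and who owns it. The following is a minimal sketch, assuming the `<thread> locked on <lock> (owned by <owner>):` header format seen in the log. The names `parse_wait_graph` and `find_cycle` are hypothetical, and the sample contains only the three header lines with the stack frames omitted.

```python
import re

# Header lines of the health-check message, copied from the log above
# (stack frames omitted).
SAMPLE = r"""
[thread-deadlock : [jenkins.util.Timer 1 locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@1566b90c (owned by Computer.threadPoolForRemoting 13):
, Computer.threadPoolForRemoting 8 locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@1566b90c (owned by Computer.threadPoolForRemoting 13):
, Computer.threadPoolForRemoting 13 locked on org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher@517c803b (owned by Computer.threadPoolForRemoting 8):
"""

# Assumed header shape: "<waiter> locked on <lock> (owned by <owner>):"
HEADER = re.compile(r"([^\[\],:]+) locked on (\S+) \(owned by ([^)]+)\):\s*$")


def parse_wait_graph(log_text):
    """Map each blocked thread to (lock it waits for, thread owning that lock)."""
    graph = {}
    for line in log_text.splitlines():
        match = HEADER.search(line)
        if match:
            waiter, lock, owner = (part.strip() for part in match.groups())
            graph[waiter] = (lock, owner)
    return graph


def find_cycle(graph):
    """Return the threads forming a wait-for cycle, or [] if there is none."""
    for start in graph:
        seen = []
        thread = start
        # Follow "waits for the owner of" edges until we leave the graph or loop.
        while thread in graph and thread not in seen:
            seen.append(thread)
            thread = graph[thread][1]
        if thread in seen:
            return seen[seen.index(thread):]
    return []


if __name__ == "__main__":
    graph = parse_wait_graph(SAMPLE)
    for waiter, (lock, owner) in graph.items():
        print(f"{waiter} waits for {lock} held by {owner}")
    print("cycle:", find_cycle(graph))
```

As we read the trace, the cycle is between Computer.threadPoolForRemoting 8 and Computer.threadPoolForRemoting 13. Thread 8 is inside KubernetesLauncher.launch and waits for the Queue lock via Node.save. Thread 13 holds the Queue lock (from OnceRetentionStrategy.done through Queue.withLock) and waits for the KubernetesLauncher monitor in isLaunchSupported. jenkins.util.Timer 1 (Queue.maintain) is blocked behind the same Queue lock but is not part of the cycle.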
Kindly help us in resolving the issue as this is blocking our upgrade.