JENKINS-73345

Thread deadlock blocks all jobs from running


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • None

      We have a Jenkins instance deployed with our Kubernetes operator, version 8.0.0. Last night we redeployed the jenkins-operator with new values, supplied as configuration as code, that removed some tool configuration. Since then, jobs stop running and we see the following exception in the log:

      2024-06-21 21:35:45.647+0000 [id=194]	INFO	j.j.plugin.JenkinsJobManagement#createOrUpdateConfig: createOrUpdateConfig for cloudmc-automation-engine
      2024-06-21 21:35:45.703+0000 [id=194]	INFO	c.c.h.p.folder.AbstractFolder$3#call: Loading job cloudmc-automation-engine/PR-38 (14000.0%)
      2024-06-21 21:35:45.716+0000 [id=517]	INFO	o.j.p.g.webhook.WebhookManager$1#run: GitHub webhooks activated for job cloudmc-automation-engine with [GitHubRepositoryName[host=github.com,username=cloudops,repository=cloudmc-automation-engine]] (events: [PULL_REQUEST, PUSH])
      2024-06-21 21:36:00.795+0000 [id=29]	WARNING	c.c.h.p.f.c.PeriodicFolderTrigger#run: Queue refused to schedule org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject@1de454f2[plugins/cloudmc-macrometa-plugin]
      2024-06-21 21:36:08.914+0000 [id=534]	WARNING	j.m.api.Metrics$HealthChecker#execute: Some health checks are reporting as unhealthy: [thread-deadlock : [jenkins.util.Timer [#9] locked on org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher@72dd3b4c (owned by Computer.threadPoolForRemoting [#13]):
      	 at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.isLaunchSupported(KubernetesLauncher.java:91)
      	 at hudson.slaves.SlaveComputer.isLaunchSupported(SlaveComputer.java:247)
      	 at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.check(OnceRetentionStrategy.java:81)
      	 at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy.check(OnceRetentionStrategy.java:46)
      	 at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:71)
      	 at hudson.model.Queue._withLock(Queue.java:1397)
      	 at hudson.model.Queue.withLock(Queue.java:1271)
      	 at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:62)
      	 at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
      	 at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
      	 at java.base@17.0.8.1/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
      	 at java.base@17.0.8.1/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      	 at java.base@17.0.8.1/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	 at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
      , Computer.threadPoolForRemoting [#13] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@362255d0 (owned by jenkins.util.Timer [#9]):
      	 at java.base@17.0.8.1/jdk.internal.misc.Unsafe.park(Native Method)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:938)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
      	 at hudson.model.Queue._withLock(Queue.java:1456)
      	 at hudson.model.Queue.withLock(Queue.java:1314)
      	 at jenkins.model.Nodes.updateNode(Nodes.java:201)
      	 at jenkins.model.Jenkins.updateNode(Jenkins.java:2252)
      	 at hudson.model.Node.save(Node.java:143)
      	 at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:247)
      	 at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:297)
      	 at hudson.slaves.SlaveComputer$$Lambda$841/0x0000000800e87b00.call(Unknown Source)
      	 at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	 at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
      	 at java.base@17.0.8.1/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	 at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
      , jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@362255d0 (owned by jenkins.util.Timer [#9]):
      	 at java.base@17.0.8.1/jdk.internal.misc.Unsafe.park(Native Method)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:938)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
      	 at java.base@17.0.8.1/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
      	 at hudson.model.Queue.maintain(Queue.java:1481)
      	 at hudson.model.Queue$MaintainTask.doRun(Queue.java:2919)
      	 at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
      	 at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
      	 at java.base@17.0.8.1/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
      	 at java.base@17.0.8.1/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      	 at java.base@17.0.8.1/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	 at java.base@17.0.8.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	 at java.base@17.0.8.1/java.lang.Thread.run(Thread.java:833)
      ]] 

      We have to redeploy the Jenkins pod to get jobs running again, and the problem has kept recurring since the redeployment. Can you help us find a way to resolve this?
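
For readers of the trace: it appears to show a lock-order inversion. jenkins.util.Timer [#9] owns the queue's ReentrantLock and is blocked on the KubernetesLauncher monitor (in isLaunchSupported), while Computer.threadPoolForRemoting [#13] owns that monitor (inside launch) and is waiting for the same ReentrantLock through Queue.withLock. The following is a minimal, self-contained Java sketch of that pattern, not Jenkins or plugin code. The class, field, and method names are hypothetical stand-ins, and it assumes a JVM that supports monitoring of ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of the lock-order inversion seen in the trace.
class LockOrderInversionDemo {
    // Stand-in for the queue lock (a ReentrantLock in the trace).
    static final ReentrantLock queueLock = new ReentrantLock();
    // Stand-in for the launcher object whose monitor is contended.
    static final Object launcherMonitor = new Object();

    /** Starts two threads that take the locks in opposite order and
     *  returns how many deadlocked threads the JVM reports (0 on timeout). */
    static int provokeAndDetect() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Like Timer [#9]: queue lock first, then the launcher monitor.
        Thread timer = new Thread(() -> {
            queueLock.lock();
            try {
                meet(bothHoldFirst);
                synchronized (launcherMonitor) {
                    // not reached once the other thread holds the monitor
                }
            } finally {
                queueLock.unlock();
            }
        }, "demo-timer");

        // Like threadPoolForRemoting [#13]: launcher monitor first, then the queue lock.
        Thread launcher = new Thread(() -> {
            synchronized (launcherMonitor) {
                meet(bothHoldFirst);
                queueLock.lock();
                queueLock.unlock();
            }
        }, "demo-launcher");

        // Daemon threads so the stuck pair does not keep the JVM alive.
        timer.setDaemon(true);
        launcher.setDaemon(true);
        timer.start();
        launcher.start();

        // Poll the JVM's deadlock detector, which covers both monitors
        // and ownable synchronizers such as ReentrantLock.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            pause(50);
        }
        return 0;
    }

    // Signals that this thread holds its first lock, then waits for the other thread.
    static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + provokeAndDetect());
    }
}
```

In a sketch like this, the cycle goes away if both paths acquire the two locks in the same order, or if the second path releases the monitor before requesting the queue lock. Whether either change is appropriate for the plugin is not something the trace alone can confirm.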

            tomaszsek Tomasz Sęk
            fyrnaga Alain
            Votes: 0
            Watchers: 2
