-
Bug
-
Resolution: Fixed
-
Blocker
-
None
-
Jenkins 2.332.3 and 2.346.1
Amazon EC2 plugin Version 1.68
-
-
gradle:1.39.4
Over the last week, jobs that use an EC2 agent have periodically caused Jenkins to hang: the web GUI is still accessible, but it is impossible to cancel running jobs or start new ones, and we have to restart Jenkins. Everything worked fine before that; the problem first appeared last week. If an agent is launched manually via Manage Nodes and Clouds, Jenkins periodically returns a 504 after a while, but the EC2 instance starts and Jenkins does not hang. The hang also seems to happen more often with the Spot Configuration enabled.
Latest log:
022-06-23 19:27:05.227+0000 [id=5461] WARNING j.m.api.Metrics$HealthChecker#execute: Some health checks are reporting as unhealthy: [thread-deadlock : [jenkins.util.Timer [#2] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@5db1de9 (owned by Computer.threadPoolForRemoting [#647]): at java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method) at java.base@11.0.15/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240) at java.base@11.0.15/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:267) at hudson.model.Queue.maintain(Queue.java:1479) at hudson.model.Queue$MaintainTask.doRun(Queue.java:2899) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67) at java.base@11.0.15/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base@11.0.15/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) at java.base@11.0.15/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base@11.0.15/java.lang.Thread.run(Thread.java:829) , Computer.threadPoolForRemoting [#647] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@b1cf322 (owned by jenkins.util.Timer [#4]): at java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method) at 
java.base@11.0.15/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240) at java.base@11.0.15/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:267) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:694) at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:796) at hudson.plugins.ec2.util.MinimumInstanceChecker.lambda$null$11(MinimumInstanceChecker.java:114) at hudson.plugins.ec2.util.MinimumInstanceChecker$$Lambda$1030/0x0000000102a9bc40.accept(Unknown Source) at java.base@11.0.15/java.util.ArrayList.forEach(ArrayList.java:1541) at java.base@11.0.15/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1085) at hudson.plugins.ec2.util.MinimumInstanceChecker.lambda$checkForMinimumInstances$12(MinimumInstanceChecker.java:77) at hudson.plugins.ec2.util.MinimumInstanceChecker$$Lambda$257/0x0000000100417440.accept(Unknown Source) at java.base@11.0.15/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.base@11.0.15/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) at java.base@11.0.15/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) at java.base@11.0.15/java.util.Iterator.forEachRemaining(Iterator.java:133) at java.base@11.0.15/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.base@11.0.15/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) at java.base@11.0.15/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) at 
java.base@11.0.15/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.base@11.0.15/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.base@11.0.15/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.base@11.0.15/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) at hudson.plugins.ec2.util.MinimumInstanceChecker.checkForMinimumInstances(MinimumInstanceChecker.java:76) at hudson.plugins.ec2.SlaveTemplate$OnSaveListener.onChange(SlaveTemplate.java:1844) at hudson.model.listeners.SaveableListener.fireOnChange(SaveableListener.java:82) at jenkins.model.Jenkins.save(Jenkins.java:3545) at hudson.util.PersistedList.onModified(PersistedList.java:193) at hudson.util.PersistedList._onModified(PersistedList.java:224) at hudson.util.PersistedList.add(PersistedList.java:85) at hudson.plugins.gradle.injection.MavenOptsSetter.setMavenOpts(MavenOptsSetter.java:40) at hudson.plugins.gradle.injection.MavenOptsSetter.remove(MavenOptsSetter.java:31) at hudson.plugins.gradle.injection.MavenBuildScanInjection.removeMavenExtension(MavenBuildScanInjection.java:86) at hudson.plugins.gradle.injection.MavenBuildScanInjection.inject(MavenBuildScanInjection.java:58) at hudson.plugins.gradle.injection.BuildScanInjectionListener.lambda$inject$0(BuildScanInjectionListener.java:57) at hudson.plugins.gradle.injection.BuildScanInjectionListener$$Lambda$426/0x0000000100daf440.accept(Unknown Source) at java.base@11.0.15/java.util.Arrays$ArrayList.forEach(Arrays.java:4390) at hudson.plugins.gradle.injection.BuildScanInjectionListener.inject(BuildScanInjectionListener.java:57) at hudson.plugins.gradle.injection.BuildScanInjectionListener.onConfigurationChange(BuildScanInjectionListener.java:49) at hudson.model.AbstractCIBase$$Lambda$424/0x0000000100dafc40.accept(Unknown Source) at jenkins.util.Listeners.lambda$notify$0(Listeners.java:59) at 
jenkins.util.Listeners$$Lambda$425/0x0000000100daf040.run(Unknown Source) at jenkins.util.Listeners.notify(Listeners.java:70) at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:277) at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1670) at jenkins.model.Nodes$5.run(Nodes.java:279) at hudson.model.Queue._withLock(Queue.java:1395) at hudson.model.Queue.withLock(Queue.java:1269) at jenkins.model.Nodes.removeNode(Nodes.java:270) at jenkins.model.Jenkins.removeNode(Jenkins.java:2215) at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:91) at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1$1.run(OnceRetentionStrategy.java:128) at hudson.model.Queue._withLock(Queue.java:1395) at hudson.model.Queue.withLock(Queue.java:1269) at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1.run(OnceRetentionStrategy.java:123) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68) at java.base@11.0.15/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base@11.0.15/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base@11.0.15/java.lang.Thread.run(Thread.java:829) , jenkins.util.Timer [#4] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@5db1de9 (owned by Computer.threadPoolForRemoting [#647]): at java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method) at java.base@11.0.15/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) at 
java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917) at java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240) at java.base@11.0.15/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:267) at hudson.model.Queue._withLock(Queue.java:1454) at hudson.model.Queue.withLock(Queue.java:1312) at jenkins.model.Nodes.updateNode(Nodes.java:201) at jenkins.model.Jenkins.updateNode(Jenkins.java:2229) at hudson.model.Node.save(Node.java:143) at hudson.util.PersistedList.onModified(PersistedList.java:193) at hudson.util.PersistedList.replaceBy(PersistedList.java:99) at hudson.model.Slave.setNodeProperties(Slave.java:315) at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:163) at hudson.plugins.ec2.EC2OndemandSlave.<init>(EC2OndemandSlave.java:73) at hudson.plugins.ec2.util.EC2AgentFactoryImpl.createOnDemandAgent(EC2AgentFactoryImpl.java:15) at hudson.plugins.ec2.SlaveTemplate.newOndemandSlave(SlaveTemplate.java:1566) at hudson.plugins.ec2.SlaveTemplate.toSlaves(SlaveTemplate.java:1182) at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:1154) at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:1355) at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:889) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:714) at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:740) at com.cloudbees.jenkins.plugins.amazonecs.ECSProvisioningStrategy.apply(ECSProvisioningStrategy.java:65) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:326) at hudson.slaves.NodeProvisioner.access$1000(NodeProvisioner.java:71) at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:824) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92) at 
jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67) at java.base@11.0.15/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base@11.0.15/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) at java.base@11.0.15/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base@11.0.15/java.lang.Thread.run(Thread.java:829) ]]
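The health-check message above is hard to read as one line. A minimal Java sketch for pulling out the "who waits for whom" edges is shown here. It assumes thread names of the form `name [#N]`, as they appear in this log; the class name `DeadlockEdges` and the abbreviated sample string (stack frames replaced by `at ...`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DeadlockEdges {
    // Matches "<thread> locked on <lock> (owned by <thread>)" as printed in the log above.
    // Assumption: thread names look like "some.Name [#123]".
    private static final Pattern EDGE = Pattern.compile(
        "([\\w.$]+ \\[#\\d+\\]) locked on (\\S+) \\(owned by ([\\w.$]+ \\[#\\d+\\])\\)");

    // Abbreviated copy of the three entries in the log; stack frames are elided.
    static final String SAMPLE =
        "[thread-deadlock : [jenkins.util.Timer [#2] locked on "
        + "java.util.concurrent.locks.ReentrantLock$NonfairSync@5db1de9 "
        + "(owned by Computer.threadPoolForRemoting [#647]): at ... , "
        + "Computer.threadPoolForRemoting [#647] locked on "
        + "java.util.concurrent.locks.ReentrantLock$NonfairSync@b1cf322 "
        + "(owned by jenkins.util.Timer [#4]): at ... , "
        + "jenkins.util.Timer [#4] locked on "
        + "java.util.concurrent.locks.ReentrantLock$NonfairSync@5db1de9 "
        + "(owned by Computer.threadPoolForRemoting [#647]): at ... ]]";

    // Returns one human-readable line per waiting thread found in the message.
    public static List<String> edges(String message) {
        List<String> out = new ArrayList<>();
        Matcher m = EDGE.matcher(message);
        while (m.find()) {
            out.add(m.group(1) + " waits for " + m.group(3) + " (lock " + m.group(2) + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        edges(SAMPLE).forEach(System.out::println);
    }
}
```

Read this way, the log suggests a cycle between `Computer.threadPoolForRemoting [#647]` and `jenkins.util.Timer [#4]`, with `jenkins.util.Timer [#2]` queued behind the same lock that `#647` holds.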
I saw something similar recently, though in the context of Kubernetes rather than EC2. I think this is caused by a recent feature of the Gradle plugin: https://github.com/jenkinsci/gradle-plugin/commit/b4aa34b2c48d9d96e212c80d45508dc40c5a023f
The feature has apparently been disabled by default (https://github.com/jenkinsci/gradle-plugin/pull/162), but this is not released yet. Try disabling the Gradle plugin and see if that fixes it.
This new feature is disabled by default in the latest release of the Gradle plugin: https://github.com/jenkinsci/gradle-plugin/releases/tag/gradle-1.39.4 Maybe try upgrading this plugin.
cc wolfs
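For readers following the trace, the pattern it appears to show is a lock-order inversion: one thread holds the queue lock (via `Queue.withLock`) and ends up in `EC2Cloud.getNewOrExistingAvailableSlave`, while another thread is already inside that method and ends up in `Queue._withLock`. The sketch below is not Jenkins or plugin code. It uses two plain `ReentrantLock`s as hypothetical stand-ins (`queueLock`, `ec2Lock`) and made-up thread names to reproduce the same shape, then checks for it with `ThreadMXBean.findDeadlockedThreads()`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversion {
    // Starts a daemon thread that takes `first`, waits until the other thread
    // holds its own first lock, then tries to take `second`.
    static void start(String name, ReentrantLock first, ReentrantLock second,
                      CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            first.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                second.lock();   // blocks forever once both threads reach this line
                second.unlock();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay stuck
        t.start();
    }

    // Creates the inversion and returns how many deadlocked threads the JVM reports
    // (0 if none are reported within about five seconds).
    public static int deadlockedThreadCount() {
        ReentrantLock queueLock = new ReentrantLock(); // stand-in for the queue lock
        ReentrantLock ec2Lock = new ReentrantLock();   // stand-in for the EC2 cloud lock
        CountDownLatch latch = new CountDownLatch(2);
        start("node-removal", queueLock, ec2Lock, latch); // queue lock, then EC2 lock
        start("provisioner", ec2Lock, queueLock, latch);  // EC2 lock, then queue lock

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 50; i++) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

Neither lock order is wrong on its own; the hang only needs both paths to run at the same time. That would be consistent with the problem appearing intermittently, and with it going away once the listener that triggers a save while the queue lock is held is disabled.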