• Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Major Major
    • ec2-plugin
    • None
    • Jenkins ver. 2.138.2 + EC2 1.39

      Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?
       

      Starting health checks at Mon Oct 29 05:32:10 UTC 2018
      Health check results at Mon Oct 29 05:32:10 UTC 2018:
       * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z}
       * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z}
       * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z}
       * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]):
      	 at sun.misc.Unsafe.park(Native Method)
      	 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	 at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      	 at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      	 at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190)
      	 at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
      	 at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176)
      	 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      	 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	 at java.lang.Thread.run(Thread.java:748)
      , jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]):
      	 at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638)
      	 at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114)
      	 at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155)
      	 at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173)
      	 at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157)
      	 at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115)
      	 at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141)
      	 at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104)
      	 at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85)
      	 at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43)
      	 at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
      	 at hudson.model.Queue._withLock(Queue.java:1380)
      	 at hudson.model.Queue.withLock(Queue.java:1257)
      	 at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
      	 at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
      	 at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
      	 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	 at java.lang.Thread.run(Thread.java:748)
      , jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]):
      	 at sun.misc.Unsafe.park(Native Method)
      	 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	 at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      	 at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      	 at hudson.model.Queue._withLock(Queue.java:1437)
      	 at hudson.model.Queue.withLock(Queue.java:1300)
      	 at jenkins.model.Nodes.updateNode(Nodes.java:193)
      	 at jenkins.model.Jenkins.updateNode(Jenkins.java:2080)
      	 at hudson.model.Node.save(Node.java:140)
      	 at hudson.util.PersistedList.onModified(PersistedList.java:173)
      	 at hudson.util.PersistedList.replaceBy(PersistedList.java:85)
      	 at hudson.model.Slave.<init>(Slave.java:198)
      	 at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134)
      	 at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43)
      	 at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36)
      	 at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914)
      	 at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893)
      	 at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404)
      	 at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534)
      	 at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551)
      	 at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
      	 at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
      	 at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
      	 at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807)
      	 at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
      	 at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
      	 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	 at java.lang.Thread.run(Thread.java:748)
      ], timestamp=2018-10-29T05:32:10.085Z}
      

       

      (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288

          [JENKINS-54298] Jenkins crashes with Deadlock with EC2 Plugin

          Günter Grodotzki created issue -
          Günter Grodotzki made changes -
          Description Original: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?

           

          {{Starting health checks at Mon Oct 29 05:32:10 UTC 2018Starting health checks at Mon Oct 29 05:32:10 UTC 2018Health check results at Mon Oct 29 05:32:10 UTC 2018: * disk-space: Result\{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z} * plugins: Result\{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z} * temporary-space: Result\{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z} * thread-deadlock: Result\{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]): at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638) at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114) at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155) at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173) at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157) at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115) at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141) at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43) at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72) at hudson.model.Queue._withLock(Queue.java:1380) at hudson.model.Queue.withLock(Queue.java:1257) at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.model.Queue._withLock(Queue.java:1437) at hudson.model.Queue.withLock(Queue.java:1300) at jenkins.model.Nodes.updateNode(Nodes.java:193) at jenkins.model.Jenkins.updateNode(Jenkins.java:2080) at hudson.model.Node.save(Node.java:140) at hudson.util.PersistedList.onModified(PersistedList.java:173) at hudson.util.PersistedList.replaceBy(PersistedList.java:85) at hudson.model.Slave.<init>(Slave.java:198) at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36) at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914) at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893) at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534) at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551) at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)], timestamp=2018-10-29T05:32:10.085Z}}}
          New: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?

           
          {noformat}
          Starting health checks at Mon Oct 29 05:32:10 UTC 2018Starting health checks at Mon Oct 29 05:32:10 UTC 2018Health check results at Mon Oct 29 05:32:10 UTC 2018: * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z} * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z} * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z} * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]): at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638) at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114) at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155) at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173) at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157) at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115) at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141) at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43) at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72) at hudson.model.Queue._withLock(Queue.java:1380) at hudson.model.Queue.withLock(Queue.java:1257) at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.model.Queue._withLock(Queue.java:1437) at hudson.model.Queue.withLock(Queue.java:1300) at jenkins.model.Nodes.updateNode(Nodes.java:193) at jenkins.model.Jenkins.updateNode(Jenkins.java:2080) at hudson.model.Node.save(Node.java:140) at hudson.util.PersistedList.onModified(PersistedList.java:173) at hudson.util.PersistedList.replaceBy(PersistedList.java:85) at hudson.model.Slave.<init>(Slave.java:198) at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36) at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914) at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893) at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534) at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551) at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)], timestamp=2018-10-29T05:32:10.085Z{noformat}
           

          (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288)&#xA0;

          We have this deadlock issue with Jenkins 2.138.2 and downgrading Jenkins to version 2.138.1 has resolved it for us.

          Oliver Pereira added a comment - We have this deadlock issue with Jenkins 2.138.2 and downgrading Jenkins to version 2.138.1 has resolved it for us.

          Baptiste Mathus added a comment - - edited

          lifeofguenter could you possibly use git bisect to help us identify the exact commit where this issue was introduced?

          Given you've already identified the change was introduced between 2.138.1 and 2.138.2, this should should pretty easy to do if you already know how to reproduce/check for the presence of your issue.

          If you do not know how to do a git bisect, I'm ready to write a blog about it with pleasure.

          Baptiste Mathus added a comment - - edited lifeofguenter could you possibly use git bisect to help us identify the exact commit where this issue was introduced? Given you've already identified the change was introduced between 2.138.1 and 2.138.2, this should should pretty easy to do if you already know how to reproduce/check for the presence of your issue. If you do not know how to do a git bisect , I'm ready to write a blog about it with pleasure.
          Baptiste Mathus made changes -
          Description Original: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?

           
          {noformat}
          Starting health checks at Mon Oct 29 05:32:10 UTC 2018Starting health checks at Mon Oct 29 05:32:10 UTC 2018Health check results at Mon Oct 29 05:32:10 UTC 2018: * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z} * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z} * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z} * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]): at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638) at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114) at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155) at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173) at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157) at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115) at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141) at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43) at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72) at hudson.model.Queue._withLock(Queue.java:1380) at hudson.model.Queue.withLock(Queue.java:1257) at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748), jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]): at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.model.Queue._withLock(Queue.java:1437) at hudson.model.Queue.withLock(Queue.java:1300) at jenkins.model.Nodes.updateNode(Nodes.java:193) at jenkins.model.Jenkins.updateNode(Jenkins.java:2080) at hudson.model.Node.save(Node.java:140) at hudson.util.PersistedList.onModified(PersistedList.java:173) at hudson.util.PersistedList.replaceBy(PersistedList.java:85) at hudson.model.Slave.<init>(Slave.java:198) at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43) at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36) at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914) at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893) at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534) at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551) at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)], timestamp=2018-10-29T05:32:10.085Z{noformat}
           

          (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288)&#xA0;
          New: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?

           
          {noformat}
          Starting health checks at Mon Oct 29 05:32:10 UTC 2018
          Health check results at Mon Oct 29 05:32:10 UTC 2018:
           * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z}
           * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z}
           * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z}
           * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]):
          at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638)
          at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114)
          at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155)
          at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173)
          at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157)
          at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115)
          at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141)
          at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43)
          at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
          at hudson.model.Queue._withLock(Queue.java:1380)
          at hudson.model.Queue.withLock(Queue.java:1257)
          at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.model.Queue._withLock(Queue.java:1437)
          at hudson.model.Queue.withLock(Queue.java:1300)
          at jenkins.model.Nodes.updateNode(Nodes.java:193)
          at jenkins.model.Jenkins.updateNode(Jenkins.java:2080)
          at hudson.model.Node.save(Node.java:140)
          at hudson.util.PersistedList.onModified(PersistedList.java:173)
          at hudson.util.PersistedList.replaceBy(PersistedList.java:85)
          at hudson.model.Slave.<init>(Slave.java:198)
          at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36)
          at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914)
          at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893)
          at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404)
          at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534)
          at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551)
          at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          ], timestamp=2018-10-29T05:32:10.085Z}
          {noformat}
           

          (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288)&#xA0;
          Baptiste Mathus made changes -
          Description Original: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?

           
          {noformat}
          Starting health checks at Mon Oct 29 05:32:10 UTC 2018
          Health check results at Mon Oct 29 05:32:10 UTC 2018:
           * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z}
           * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z}
           * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z}
           * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]):
          at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638)
          at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114)
          at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155)
          at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173)
          at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157)
          at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115)
          at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141)
          at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43)
          at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
          at hudson.model.Queue._withLock(Queue.java:1380)
          at hudson.model.Queue.withLock(Queue.java:1257)
          at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.model.Queue._withLock(Queue.java:1437)
          at hudson.model.Queue.withLock(Queue.java:1300)
          at jenkins.model.Nodes.updateNode(Nodes.java:193)
          at jenkins.model.Jenkins.updateNode(Jenkins.java:2080)
          at hudson.model.Node.save(Node.java:140)
          at hudson.util.PersistedList.onModified(PersistedList.java:173)
          at hudson.util.PersistedList.replaceBy(PersistedList.java:85)
          at hudson.model.Slave.<init>(Slave.java:198)
          at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36)
          at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914)
          at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893)
          at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404)
          at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534)
          at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551)
          at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          ], timestamp=2018-10-29T05:32:10.085Z}
          {noformat}
           

          (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288)&#xA0;
          New: Our Jenkins master (0 executors) will occasionally crash. Looking at the health-check logs it seems like there is a thread deadlock with the ec2 plugin?
           
          {noformat}
          Starting health checks at Mon Oct 29 05:32:10 UTC 2018
          Health check results at Mon Oct 29 05:32:10 UTC 2018:
           * disk-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.037Z}
           * plugins: Result{isHealthy=true, message=No failed plugins, timestamp=2018-10-29T05:32:10.038Z}
           * temporary-space: Result{isHealthy=true, timestamp=2018-10-29T05:32:10.038Z}
           * thread-deadlock: Result{isHealthy=false, message=[Computer.threadPoolForRemoting [#9788] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@20e93cc3 (owned by jenkins.util.Timer [#7]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:190)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$1.run(NodeProvisioner.java:176)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#9] locked on hudson.plugins.ec2.AmazonEC2Cloud@3fdbeb4b (owned by jenkins.util.Timer [#7]):
          at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:638)
          at hudson.plugins.ec2.EC2SpotSlave.getSpotRequest(EC2SpotSlave.java:114)
          at hudson.plugins.ec2.EC2SpotSlave.getInstanceId(EC2SpotSlave.java:155)
          at hudson.plugins.ec2.EC2Computer._describeInstanceOnce(EC2Computer.java:173)
          at hudson.plugins.ec2.EC2Computer._describeInstance(EC2Computer.java:157)
          at hudson.plugins.ec2.EC2Computer.describeInstance(EC2Computer.java:115)
          at hudson.plugins.ec2.EC2Computer.getUptime(EC2Computer.java:141)
          at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:104)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85)
          at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43)
          at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
          at hudson.model.Queue._withLock(Queue.java:1380)
          at hudson.model.Queue.withLock(Queue.java:1257)
          at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          , jenkins.util.Timer [#7] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@4f27ad4d (owned by jenkins.util.Timer [#9]):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at hudson.model.Queue._withLock(Queue.java:1437)
          at hudson.model.Queue.withLock(Queue.java:1300)
          at jenkins.model.Nodes.updateNode(Nodes.java:193)
          at jenkins.model.Jenkins.updateNode(Jenkins.java:2080)
          at hudson.model.Node.save(Node.java:140)
          at hudson.util.PersistedList.onModified(PersistedList.java:173)
          at hudson.util.PersistedList.replaceBy(PersistedList.java:85)
          at hudson.model.Slave.<init>(Slave.java:198)
          at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:134)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:43)
          at hudson.plugins.ec2.EC2SpotSlave.<init>(EC2SpotSlave.java:36)
          at hudson.plugins.ec2.SlaveTemplate.newSpotSlave(SlaveTemplate.java:914)
          at hudson.plugins.ec2.SlaveTemplate.provisionSpot(SlaveTemplate.java:893)
          at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:404)
          at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:534)
          at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:551)
          at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
          at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
          at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
          at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807)
          at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
          ], timestamp=2018-10-29T05:32:10.085Z}
          {noformat}
           

          (sorry the formatting looks bad, so I uploaded it here as well: https://pastebin.co.za/5677303661068288)&#xA0;

          PR 321 does not throw this error anymore and I am going to continue testing this.

          Oliver Pereira added a comment - PR 321 does not throw this error anymore and I am going to continue testing this.
          Oliver Pereira made changes -
          Link New: This issue relates to JENKINS-54187 [ JENKINS-54187 ]

          Callum Pember added a comment - - edited

          This happens to me too using the latest version of the plugin. I use the EC2 plugin pretty heavily with the 'Idle termination time' setting. I'm getting crashes every few days. I just installed the support core plugin and got a dump for when it happened today, if that would be helpful.

          Callum Pember added a comment - - edited This happens to me too using the latest version of the plugin. I use the EC2 plugin pretty heavily with the 'Idle termination time' setting. I'm getting crashes every few days. I just installed the support core plugin and got a dump for when it happened today, if that would be helpful.
          dashvvv made changes -
          Link New: This issue relates to JENKINS-55189 [ JENKINS-55189 ]

            thoulen FABRIZIO MANFREDI
            lifeofguenter Günter Grodotzki
            Votes:
            4 Vote for this issue
            Watchers:
            13 Start watching this issue

              Created:
              Updated:
              Resolved: