-
Bug
-
Resolution: Duplicate
-
Blocker
-
jclouds working with an openstack
We observe a deadlock in JClouds plugin that happens in a following scenarion:
1. There is a thread (thread A) that performs deployment of a new slave
2. Another thread (thread B) is working at the same time and decides take down one of the existing nodes
3. Thread A performs Hudson.getInstance().addNode() here: https://github.com/jenkinsci/jclouds-plugin/blob/master/jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsCloud.java#L205. At th
4. Thread B performs setTemporaryOffline on a computer of the slave it wants to take down here: https://github.com/jenkinsci/jclouds-plugin/blob/master/jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsRetentionStrategy.java#L30 . At that moment it already holds the lock for the JCloudsRetentionStarategy instance because it is synchronized
5. addNode is synchronized on Hudson instance and locks it
6. addNode subsequently calls hudson.model.AbstractCIBase.updateComputerList, which calls retentionStrategy.check on a node being deleted in thread B and locks on it
7. setTemporarilyOffline calls Jenkins.save that locks on the Jenkins (i.e. Hudson) instance, which is already locked by thread A in step 5.
The stack trace of thead A is therefore as follows:
Name: Computer.threadPoolForRemoting [#135] State: BLOCKED on jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy@4ccee114 owned by: jenkins.util.Timer [#8] Total blocked: 8 Total waited: 1 Stack trace: jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy.check(JCloudsRetentionStrategy.java:22) jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy.check(JCloudsRetentionStrategy.java:15) hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:663) hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:120) hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:180) - locked java.lang.Object@5a4034b5 jenkins.model.Jenkins.updateComputerList(Jenkins.java:1214) jenkins.model.Jenkins.setNodes(Jenkins.java:1711) jenkins.model.Jenkins.addNode(Jenkins.java:1693) - locked hudson.model.Hudson@32da12ad jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:205) jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:201) java.util.concurrent.FutureTask.run(FutureTask.java:262) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:744)
The stack trace of thread B is as follows:
Name: jenkins.util.Timer [#8] State: BLOCKED on hudson.model.Hudson@32da12ad owned by: Computer.threadPoolForRemoting [#135] Total blocked: 2 Total waited: 274 Stack trace: jenkins.model.Jenkins.save(Jenkins.java:2672) hudson.model.Node.setTemporaryOfflineCause(Node.java:221) hudson.model.Computer.setTemporarilyOffline(Computer.java:590) jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy.check(JCloudsRetentionStrategy.java:30) - locked jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy@4ccee114 jenkins.plugins.jclouds.compute.JCloudsRetentionStrategy.check(JCloudsRetentionStrategy.java:15) hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:66) hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54) java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:744)
This actually causes deadly problems during normal operation of JClouds under somewhat high load.
- duplicates
-
JENKINS-15820 JClouds plugin deadlock
-
- Closed
-