Deadlock between RunMap and Queue after restart; StepContext.isReady impl acquires lock

      Found one Java-level deadlock:
      =============================
      "Thread-5":
        waiting to lock monitor 0x00007f0984170b38 (object 0x0000000706fe3aa8, a hudson.model.RunMap),
        which is held by "Jenkins initialization thread"
      "Jenkins initialization thread":
        waiting to lock monitor 0x00007f0988015128 (object 0x00000007066b46c0, a hudson.model.Queue),
        which is held by "Thread-5"
      
      Java stack information for the threads listed above:
      ===================================================
      "Thread-5":
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:688)
      	- waiting to lock <0x0000000706fe3aa8> (a hudson.model.RunMap)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:671)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getById(AbstractLazyLoadRunMap.java:543)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:523)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:533)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getFlowExecution(CpsStepContext.java:386)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getProgramPromise(CpsStepContext.java:230)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.isReady(CpsStepContext.java:236)
      	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.run(ExecutorStepExecution.java:262)
      	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getDisplayName(ExecutorStepExecution.java:281)
      	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getFullDisplayName(ExecutorStepExecution.java:290)
      	at hudson.model.LoadBalancer$1.assignGreedily(LoadBalancer.java:107)
      	at hudson.model.LoadBalancer$1.map(LoadBalancer.java:97)
      	at hudson.model.LoadBalancer$2.map(LoadBalancer.java:148)
      	at hudson.model.Queue.maintain(Queue.java:1053)
      	- locked <0x00000007066b46c0> (a hudson.model.Queue)
      	at hudson.model.Queue$1.call(Queue.java:316)
      	at hudson.model.Queue$1.call(Queue.java:313)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:94)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:84)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:104)
      	at java.lang.Thread.run(Thread.java:745)
      "Jenkins initialization thread":
      	at hudson.model.Queue.schedule2(Queue.java:639)
      	- waiting to lock <0x00000007066b46c0> (a hudson.model.Queue)
      	at org.jenkinsci.plugins.workflow.support.pickles.ExecutorPickle.rehydrate(ExecutorPickle.java:67)
      	at org.jenkinsci.plugins.workflow.support.pickles.serialization.PickleResolver.rehydrate(PickleResolver.java:68)
      	at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverReader.restorePickles(RiverReader.java:128)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.loadProgramAsync(CpsFlowExecution.java:401)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:379)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:300)
      	at hudson.model.RunMap.retrieve(RunMap.java:219)
      	at hudson.model.RunMap.retrieve(RunMap.java:56)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:688)
      	- locked <0x0000000706fe3aa8> (a hudson.model.RunMap)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:671)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getById(AbstractLazyLoadRunMap.java:543)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:523)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:533)
      	at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:59)
      	at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:51)
      	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
      	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
      	at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl.onLoaded(FlowExecutionList.java:165)
      	at jenkins.model.Jenkins.<init>(Jenkins.java:845)
      	at hudson.model.Hudson.<init>(Hudson.java:82)
      	at hudson.model.Hudson.<init>(Hudson.java:78)
      	at hudson.WebAppMain$3.run(WebAppMain.java:222)
      

      Ironically, StepContext.isReady is what is supposed to be breaking deadlocks, yet here it is acquiring a lock.
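The dump above is a classic lock-order inversion: one thread holds the Queue monitor and wants the RunMap monitor, while the other holds RunMap and wants Queue. A minimal sketch of that shape follows. The class, the lock fields, and the method names are hypothetical stand-ins, not Jenkins code, and it uses ReentrantLock with a timed tryLock instead of synchronized monitors so that the demo reports the inversion rather than hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversion {
    // Hypothetical stand-ins for the hudson.model.RunMap and hudson.model.Queue monitors.
    static final ReentrantLock runMapLock = new ReentrantLock();
    static final ReentrantLock queueLock = new ReentrantLock();

    /** Returns how many of the two threads could not obtain their second lock. */
    static int countBlockedThreads() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Like "Thread-5": holds Queue, then wants RunMap.
        Thread queueMaintainer = new Thread(() ->
                holdThenTry(queueLock, runMapLock, bothHoldFirst, bothTried, blocked));
        // Like "Jenkins initialization thread": holds RunMap, then wants Queue.
        Thread initThread = new Thread(() ->
                holdThenTry(runMapLock, queueLock, bothHoldFirst, bothTried, blocked));

        queueMaintainer.start();
        initThread.start();
        queueMaintainer.join();
        initThread.join();
        return blocked.get();
    }

    static void holdThenTry(ReentrantLock first, ReentrantLock second,
            CountDownLatch bothHoldFirst, CountDownLatch bothTried, AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until both threads hold their first lock, so the inversion is guaranteed.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With synchronized this would wait forever; the timeout lets the demo finish.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep holding the first lock until the other thread has also given up.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("blocked threads: " + countBlockedThreads());
    }
}
```

Both threads fail to get their second lock, which is the situation the JVM reports as a Java-level deadlock when plain monitors are used.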

      Since getFlowExecution may block, I think getProgramPromise should be made to return a future which encompasses both getting the execution and obtaining its programPromise.
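As a rough illustration of that proposal, the sketch below chains a possibly blocking lookup and the program promise into one future, so that an isReady-style check only polls the future and never takes a lock itself. This is not the code from the actual fix: the class, the Execution type, and the constructor parameters are hypothetical, and it uses java.util.concurrent.CompletableFuture purely for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;

public class NonBlockingPromise {
    /** Hypothetical stand-in for a flow execution whose program loads asynchronously. */
    static class Execution {
        final CompletableFuture<String> programPromise = new CompletableFuture<>();
    }

    private final ExecutorService loader;          // where the blocking lookup runs
    private final Callable<Execution> slowLookup;  // stand-in for a blocking getFlowExecution
    private CompletableFuture<String> combined;

    NonBlockingPromise(ExecutorService loader, Callable<Execution> slowLookup) {
        this.loader = loader;
        this.slowLookup = slowLookup;
    }

    /** Returns immediately; the future covers both the lookup and the program promise. */
    synchronized CompletableFuture<String> getProgramPromise() {
        if (combined == null) {
            combined = CompletableFuture
                    .supplyAsync(() -> {
                        try {
                            return slowLookup.call(); // may block, but only on a loader thread
                        } catch (Exception e) {
                            throw new CompletionException(e);
                        }
                    }, loader)
                    .thenCompose(execution -> execution.programPromise);
        }
        return combined;
    }

    /** Never waits on the lookup; it only asks whether the combined future has finished. */
    boolean isReady() {
        return getProgramPromise().isDone();
    }
}
```

The caller that holds the Queue lock then sees "not ready yet" instead of waiting for whatever lock the lookup needs.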

          [JENKINS-25890] Deadlock between RunMap and Queue after restart; StepContext.isReady impl acquires lock

          Code changed in jenkins
          User: Jesse Glick
          Path:
          CHANGES.md
          aggregator/src/test/java/org/jenkinsci/plugins/workflow/WorkflowTest.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodySubContext.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          step-api/src/main/java/org/jenkinsci/plugins/workflow/steps/StepContext.java
          http://jenkins-ci.org/commit/workflow-plugin/790a5453ef094da16e906acc5b5cc512ac1bde60
          Log:
          [FIXED JENKINS-25890] isReady should not block, so getProgramPromise may not block on getFlowExecution.

          Code changed in jenkins
          User: Jesse Glick
          Path:
          CHANGES.md
          aggregator/src/test/java/org/jenkinsci/plugins/workflow/WorkflowTest.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodySubContext.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          step-api/src/main/java/org/jenkinsci/plugins/workflow/steps/StepContext.java
          http://jenkins-ci.org/commit/workflow-plugin/8740a01d48c36289efaca6d2af7714cc7718594e
          Log:
          Merge pull request #65 from jglick/isReady-JENKINS-25890

          JENKINS-25890 Deadlock

          Compare: https://github.com/jenkinsci/workflow-plugin/compare/8fe11e081785...8740a01d48c3

          Jesse Glick added a comment -

          The fix just caused another deadlock: consumption of Timer threads and thus starvation of pickle resolvers.

          Jesse Glick added a comment -

          Lack of logging from JENKINS-26130 made it hard to diagnose a failure in WorkflowTest.acquireWorkspace.

          Code changed in jenkins
          User: Jesse Glick
          Path:
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          http://jenkins-ci.org/commit/workflow-plugin/c16a522e69791622e379f42aa049a965e3be6dc8
          Log:
          JENKINS-25890 Refined fix to use its own thread pool, not jenkins.util.Timer.
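The starvation described earlier in this thread, and the reason a separate pool helps, can be sketched generically. In the example below a task blocks on the result of a second task; if both are submitted to the same bounded pool, the second task never gets a thread. The class and method names are hypothetical, the pools are plain java.util.concurrent executors rather than jenkins.util.Timer, and the 500 ms timeout exists only so the demo terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvation {
    /**
     * Runs a waiter on waiterPool that blocks until a resolver task on resolverPool finishes.
     * Returns true if the waiter completed within the timeout.
     */
    static boolean waiterCompletes(ExecutorService waiterPool, ExecutorService resolverPool)
            throws Exception {
        Future<String> waiter = waiterPool.submit(() -> {
            Future<String> resolved = resolverPool.submit(() -> "rehydrated");
            return resolved.get(); // occupies a waiterPool thread while it waits
        });
        try {
            waiter.get(500, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            waiter.cancel(true); // free the stuck thread so the pool can shut down
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // One shared single-thread pool: the waiter holds the only thread, the resolver starves.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        System.out.println("shared pool completes: " + waiterCompletes(shared, shared));
        shared.shutdownNow();

        // A dedicated pool for the waiter leaves the other pool free for resolvers.
        ExecutorService timerLike = Executors.newFixedThreadPool(1);
        ExecutorService dedicated = Executors.newFixedThreadPool(1);
        System.out.println("dedicated pool completes: " + waiterCompletes(dedicated, timerLike));
        timerLike.shutdownNow();
        dedicated.shutdownNow();
    }
}
```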

          Code changed in jenkins
          User: Jesse Glick
          Path:
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          http://jenkins-ci.org/commit/workflow-plugin/729984d402ebae38c2dc44ea3d72eb5f01aa6dfe
          Log:
          Merge pull request #68 from jglick/starvation-JENKINS-25890

          JENKINS-25890 Use own thread pool for CpsStepContext.getProgramPromise

          Compare: https://github.com/jenkinsci/workflow-plugin/compare/2e48ba9981df...729984d402eb

          Code changed in jenkins
          User: Jesse Glick
          Path:
          aggregator/src/test/java/org/jenkinsci/plugins/workflow/WorkflowTest.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/94b4b01fee287acb96ae2ced8da451045260420d
          Log:
          JENKINS-25890 causing problems for this test.
          Originally-Committed-As: f94559d5f48c794c836ee2d80e93a13fbb6c6ed6

          Code changed in jenkins
          User: Jesse Glick
          Path:
          aggregator/src/test/java/org/jenkinsci/plugins/workflow/WorkflowTest.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodySubContext.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/32e4663331d273928301d699698ac59df3e46d77
          Log:
          [FIXED JENKINS-25890] isReady should not block, so getProgramPromise may not block on getFlowExecution.
          Originally-Committed-As: 790a5453ef094da16e906acc5b5cc512ac1bde60

          Code changed in jenkins
          User: Jesse Glick
          Path:
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/f073529a8bb9befbde8f23d9c8cbe650a74f108b
          Log:
          JENKINS-25890 Refined fix to use its own thread pool, not jenkins.util.Timer.
          Originally-Committed-As: c16a522e69791622e379f42aa049a965e3be6dc8

          Code changed in jenkins
          User: Jesse Glick
          Path:
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java
          cps/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsStepContext.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/5abe16bc533fb1786aae6bcd2fa95e50549ca712
          Log:
          Merge pull request #68 from jglick/starvation-JENKINS-25890

          JENKINS-25890 Use own thread pool for CpsStepContext.getProgramPromise
          Originally-Committed-As: 729984d402ebae38c2dc44ea3d72eb5f01aa6dfe

            jglick Jesse Glick
            Votes: 0
            Watchers: 2