JENKINS-37154

Attempting to abort an input step hangs waiting for metadata

Details

    Description

      Observed in a thread dump:

      "Running CpsFlowExecution[Owner[.../...:... #...]]" id=... state=WAITING cpu=70%
          - waiting on <0x...> (a com.google.common.util.concurrent.AbstractFuture$Sync)
          - locked <0x...> (a com.google.common.util.concurrent.AbstractFuture$Sync)
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
          at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275)
          at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
          at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.loadExecutions(InputAction.java:69)
          at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.remove(InputAction.java:141)
            - locked org.jenkinsci.plugins.workflow.support.steps.input.InputAction@4f8d2160
          at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.postSettlement(InputStepExecution.java:222)
          at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.doAbort(InputStepExecution.java:191)
          at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.stop(InputStepExecution.java:80)
          at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$1.onSuccess(CpsBodyExecution.java:210)
          at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$1.onSuccess(CpsBodyExecution.java:199)
          at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:568)
          at ...
      

      InputAction.loadExecutions currently needs to use a weak API which forces it to block. See discussion here.
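A minimal sketch of the pattern visible in the trace above: a synchronized method that waits on a future while holding the object's monitor. The class, field, and method names are hypothetical stand-ins (not the plugin's code), CompletableFuture is substituted for Guava's future, and a timeout is added purely so the sketch terminates; the trace shows an untimed get.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for InputAction; names and types are illustrative only.
class BlockingLoadSketch {
    private final CompletableFuture<List<String>> promise; // stand-in for the execution promise
    private List<String> executions;                        // loaded lazily, guarded by "this"

    BlockingLoadSketch(CompletableFuture<List<String>> promise) {
        this.promise = promise;
    }

    // Holds this object's monitor for the whole wait, like the synchronized remove in the trace.
    synchronized boolean remove(String id, long timeoutMillis) {
        try {
            if (executions == null) {
                executions = new ArrayList<>(promise.get(timeoutMillis, TimeUnit.MILLISECONDS));
            }
            return executions.remove(id);
        } catch (TimeoutException e) {
            return false; // an untimed get() would still be parked here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // The promise is never completed, so the wait can only end by timing out.
    static boolean removeWhenPromiseNeverCompletes() {
        return new BlockingLoadSketch(new CompletableFuture<>()).remove("input-1", 200);
    }

    // The promise is already complete, so the lazy load returns immediately.
    static boolean removeWhenPromiseIsComplete() {
        List<String> ids = Arrays.asList("input-1");
        return new BlockingLoadSketch(CompletableFuture.completedFuture(ids)).remove("input-1", 200);
    }

    public static void main(String[] args) {
        System.out.println("pending promise:   " + removeWhenPromiseNeverCompletes());
        System.out.println("completed promise: " + removeWhenPromiseIsComplete());
    }
}
```

With a pending promise the call only returns because of the added timeout; without it the thread would stay in WAITING while still owning the monitor.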

          Activity

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/workflow/support/steps/input/InputAction.java
            http://jenkins-ci.org/commit/pipeline-input-step-plugin/6efdc1fd5c8abd4daa840f4bc938d901e80cabdd
            Log:
            Noting JENKINS-37154.

            jglick Jesse Glick added a comment -

            Worse, this seems to cause a pile-on hang in many threads serving the stage view for the job, even if only one build is affected:

            "Handling GET /job/…/wfapi/runs from … : RequestHandlerThread[#…]" id=… state=BLOCKED cpu=87%
                - waiting to lock <0x…> (a org.jenkinsci.plugins.workflow.support.steps.input.InputAction)
                  owned by "Running CpsFlowExecution[Owner[…/…:… #…]]" id=…
                at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.getExecutions(InputAction.java:133)
                at com.cloudbees.workflow.rest.external.RunExt.isPendingInput(RunExt.java:386)
                at com.cloudbees.workflow.rest.external.RunExt.initStatus(RunExt.java:403)
                at com.cloudbees.workflow.rest.external.RunExt.createOld(RunExt.java:319)
                at com.cloudbees.workflow.rest.external.RunExt.create(RunExt.java:303)
                at com.cloudbees.workflow.rest.external.JobExt.create(JobExt.java:126)
                at com.cloudbees.workflow.rest.endpoints.JobAPI.doRuns(JobAPI.java:68)
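The pile-on can be illustrated with a small sketch, assuming only what the trace shows: one thread owns a monitor and does not release it, and a second thread that needs the same monitor ends up BLOCKED. The class and thread names are hypothetical, and a latch stands in for the wait that never finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; the monitor stands in for the InputAction instance.
class PileOnSketch {

    // Returns the state observed for a reader thread while another thread owns the monitor.
    static Thread.State observeReaderState() {
        final Object monitor = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Stand-in for the "Running CpsFlowExecution" thread: takes the monitor and waits.
        Thread owner = new Thread(() -> {
            synchronized (monitor) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "owner");

        // Stand-in for a request handler thread calling a synchronized getter.
        Thread reader = new Thread(() -> {
            synchronized (monitor) {
                // would read the executions here
            }
        }, "reader");

        try {
            owner.start();
            held.await();
            reader.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            Thread.State state = reader.getState();
            // Poll until the reader has reached the contended monitor.
            while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = reader.getState();
            }
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println("reader state: " + observeReaderState());
    }
}
```

Every additional request for the same job would add another thread in the same state, which matches the many blocked request handler threads described above.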
            
            jglick Jesse Glick added a comment -

            It seems that under certain conditions, this hang can occur simply by trying to abort a WorkflowRun paused in input after a restart. It seems to happen only if loadExecutions had not been called beforehand (for example, because the UI for the build was never displayed) and the input step was inside some block-scoped step. Even then it does not happen consistently, so evidently a race condition is at play.

            jglick Jesse Glick added a comment -

            Occasional deadlock in a functional test I added:

            "Running CpsFlowExecution[Owner[p/1:p #1]]" #47 daemon prio=5 os_prio=0 tid=0x00007f6c6c024800 nid=0x2da6 waiting on condition [0x00007f6c62778000]
               java.lang.Thread.State: WAITING (parking)
            	at sun.misc.Unsafe.park(Native Method)
            	- parking to wait for  <0x0000000775538078> (a com.google.common.util.concurrent.AbstractFuture$Sync)
            	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
            	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
            	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
            	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
            	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275)
            	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
            	at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.loadExecutions(InputAction.java:66)
            	- locked <0x000000076dc6f3d8> (a org.jenkinsci.plugins.workflow.support.steps.input.InputAction)
            	at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.remove(InputAction.java:138)
            	- locked <0x000000076dc6f3d8> (a org.jenkinsci.plugins.workflow.support.steps.input.InputAction)
            	at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.postSettlement(InputStepExecution.java:220)
            	at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.doAbort(InputStepExecution.java:188)
            	at org.jenkinsci.plugins.workflow.support.steps.input.InputStepExecution.stop(InputStepExecution.java:80)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:795)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:789)
            	at org.jenkinsci.plugins.workflow.support.concurrent.Futures$1.run(Futures.java:150)
            	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
            	at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
            	at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134)
            	at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:170)
            	at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:53)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:662)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:649)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:586)
            	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:32)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
            	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            	at java.lang.Thread.run(Thread.java:745)
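One possible reading of this trace, which is an assumption and not something the issue confirms, is that the callback runs on a single-lane executor and waits for a future that can only be completed by work queued on that same lane. A minimal sketch of that general pattern, using hypothetical names and JDK executors in place of the Jenkins and Guava classes, with a timeout added so it terminates:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of waiting for work queued behind the current task.
class SingleLaneDeadlockSketch {

    // Returns true if the task timed out waiting for work queued behind it on the same thread.
    static boolean waitsForWorkQueuedBehindItself() {
        ExecutorService lane = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> promise = new CompletableFuture<>();
            Future<Boolean> outcome = lane.submit(() -> {
                // Queued behind the current task, so it cannot run until this task returns.
                lane.execute(() -> promise.complete("loaded"));
                try {
                    promise.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true; // an untimed get() would wait here indefinitely
                }
            });
            return outcome.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            lane.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + waitsForWorkQueuedBehindItself());
    }
}
```

Whether the plugin hits exactly this ordering is not established here; the sketch only shows why a blocking wait on a single-threaded lane is fragile.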
            
            jglick Jesse Glick added a comment -

            My first attempt was to define

            class FlowExecutionList {
              // …
              public Iterable<FlowExecutionOwner> getOwners() {/* … */}
            }
            class FlowExecutionOwner {
              // …
              public @Nonnull ListenableFuture<FlowExecution> getPromise() {/* … */}
            }
            

            and to call these things from InputAction.onLoad, using Futures.addCallback to nest asynchronous stuff. This failed with a StackOverflowError in spite of WorkflowRun.LOADING_RUNS:

            at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:470)
            at hudson.model.RunMap.retrieve(RunMap.java:224)
            at hudson.model.RunMap.retrieve(RunMap.java:56)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:479)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:461)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:367)
            at hudson.model.RunMap.getById(RunMap.java:204)
            at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:723)
            at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.getExecutable(WorkflowRun.java:773)
            at org.jenkinsci.plugins.workflow.support.steps.input.InputAction.onLoad(InputAction.java:57)
            at hudson.model.Run.onLoad(Run.java:346)
            at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:470)
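The callback-based idea behind this first attempt can be sketched with JDK types. This is only an illustration under stated assumptions: CompletableFuture.thenAccept stands in for Guava's Futures.addCallback on a ListenableFuture, the class and method names are hypothetical, and the sketch does not reproduce the recursive run loading that caused the StackOverflowError.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for an action that fills its list from a promise instead of blocking.
class CallbackLoadSketch {
    private final List<String> executions = new CopyOnWriteArrayList<>();

    // Registers a callback and returns immediately; nothing waits while the promise is pending.
    void onLoad(CompletableFuture<List<String>> promise) {
        promise.thenAccept(executions::addAll);
    }

    List<String> getExecutions() {
        return executions;
    }

    // Returns the list size before and after the promise completes.
    static int[] demo() {
        CallbackLoadSketch action = new CallbackLoadSketch();
        CompletableFuture<List<String>> promise = new CompletableFuture<>();
        action.onLoad(promise);
        int before = action.getExecutions().size();
        promise.complete(Arrays.asList("input-1", "input-2"));
        int after = action.getExecutions().size();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println("before=" + sizes[0] + ", after=" + sizes[1]);
    }
}
```

The trade-off is that callers can observe an empty list until the promise completes, so readers have to tolerate a not-yet-loaded state rather than wait for it.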
            

            My second attempt was to call loadExecutions in a background thread from onLoaded, in the hope that it would complete before we try to use executions. This sporadically failed, as it seems to have gotten a CpsFlowExecution on which onLoad had not yet been called.


            People

              jglick Jesse Glick