
Jenkins hangs due to "Running CpsFlowExecution unresponsive"

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker

      Three times in the last two weeks, we've had our Jenkins server stop responding to requests. When I check syslog, I see errors like this:

      Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
      Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec
      Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
      Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/368:project #368] unresponsive for 5 sec
      Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
      Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/318:project #318] unresponsive for 5 sec

      These seem to persist indefinitely and there don't seem to be any other relevant messages in the log. The Web UI just hangs until nginx times out.

      The Java process will then refuse to stop when I try to restart the service and I have to kill it with kill -9.
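To gauge how many executions are affected and whether the reported time keeps growing, the "unresponsive" lines can be tallied from a saved copy of the log. The following is a minimal Python sketch, not part of the original report: the function name `summarize_unresponsive` and the regular expression are assumptions based only on the message format quoted above.

```python
import re

# Matches the INFO lines quoted in this report, for example:
# "INFO: Running CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec"
# The pattern is an assumption derived from those samples only.
UNRESPONSIVE = re.compile(
    r"INFO: (?P<subject>.+?) unresponsive for (?P<duration>.+?)\s*$"
)


def summarize_unresponsive(lines):
    """Return {subject: (report_count, last_reported_duration)}."""
    summary = {}
    for line in lines:
        match = UNRESPONSIVE.search(line)
        if match is None:
            continue  # timestamp lines and unrelated messages
        subject = match.group("subject")
        count, _ = summary.get(subject, (0, None))
        # Keep the duration as text; no unit conversion is attempted.
        summary[subject] = (count + 1, match.group("duration"))
    return summary
```

Passing the lines of a saved log (for example, `summarize_unresponsive(open(path))` with a hypothetical `path`) gives one entry per execution or step, with the number of reports and the most recent duration string.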

       

          [JENKINS-52362] Jenkins hangs due to "Running CpsFlowExecution unresponsive"

          Philip Douglas created issue -

          Tyler May added a comment -

          We're having the same issue. Jenkins will run fine for a while, then lock up and we see the same errors in the logs. We're using Jenkins 2.127, workflow-cps 2.51:

          INFO: Running CpsFlowExecutionOwner[git/org/master/146:git/org/master #146] unresponsive for 5 sec
          Jul 05, 2018 3:28:36 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0


          Ivan Aracki added a comment - - edited

          Happening randomly maybe 30% of the time. I'm using Jenkins to build and run spring-boot docker containers. Jenkins is also run in a container.

          The message I am getting:

          Jul 09, 2018 9:13:40 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
          INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#336]: checking /var/jenkins_home/workspace/tonicdm on unresponsive for 5.1 sec
          


          venkat reddy added a comment - - edited

          We are also seeing the same issue.
          Jenkins ver. 2.121.1

          Jenkins log

          Jul 22, 2018 6:53:27 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
          INFO: Running CpsFlowExecutionOwner[-Project/872:Project #872 unresponsive for 22 hr

           

           


          Josiah Niedrauer added a comment -

          Also seeing this issue. It starts looping with the unresponsive time going up, but otherwise no change. It never seems to recover from this state, even after 12+ hours.

          Running jenkins from the Docker tag: jenkins/jenkins:lts
          Currently at version: 2.121.3

          Log output:

          Sep 01, 2018 11:42:41 PM com.squareup.okhttp.internal.Platform$JdkWithJettyBootPlatform getSelectedProtocol
          INFO: ALPN callback dropped: SPDY and HTTP/2 are disabled. Is alpn-boot on the boot class path?
          Sep 01, 2018 11:43:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
          INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#570]: checking REDACTED on Docker (i-06939e1a358dc4ce5) unresponsive for 5 sec
          Sep 01, 2018 11:43:27 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
          INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#570]: checking REDACTED on Docker (i-06939e1a358dc4ce5) unresponsive for 13 sec
          Sep 01, 2018 11:44:00 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0

          Josiah Niedrauer added a comment -

          I have not been able to reproduce this since adding swap to my jenkins master docker host. I think this condition may somehow be triggered by low memory.

          Sam Van Oort made changes -
          Component/s New: durable-task-plugin [ 18622 ]
          Component/s New: workflow-cps-plugin [ 21713 ]
          Component/s New: workflow-durable-task-step-plugin [ 21715 ]
          Component/s Original: core [ 15593 ]
          Sam Van Oort made changes -
          Labels New: pipeline threads

          Sam Van Oort added a comment -

          jniedrauer pdouglas Please can you grab and attach a thread dump from when you see this issue?
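One way to capture such a dump while the web UI is hung is the JDK's `jstack` tool, run from the command line against the Jenkins process. The following is a rough Python sketch under stated assumptions: `jstack` is on the `PATH`, the script runs as the same user as the Jenkins JVM, and the helper names `build_jstack_command` and `capture_thread_dump` are hypothetical.

```python
import subprocess


def build_jstack_command(pid, with_locks=True):
    """Build the argument list for the JDK's jstack tool."""
    command = ["jstack"]
    if with_locks:
        command.append("-l")  # long listing: include lock information
    command.append(str(pid))
    return command


def capture_thread_dump(pid, out_path, timeout_sec=60):
    """Run jstack against the given JVM and save its output to out_path.

    Raises subprocess.CalledProcessError if jstack exits with an error,
    or subprocess.TimeoutExpired if it does not finish in time.
    """
    result = subprocess.run(
        build_jstack_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_sec,
        check=True,
    )
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return out_path
```

The resulting file can then be attached to the issue. Whether `jstack` can attach to a JVM that is already in the hung state described here is not something this thread confirms.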

          Nico Schmoigl made changes -
          Assignee New: Nico Schmoigl [ eagle_rainbow ]

            Assignee: Unassigned
            Reporter: Philip Douglas (pdouglas)
            Votes: 40
            Watchers: 60