Jenkins version 2.105 (latest)

      All plugins updated

      I have hundreds of pipeline jobs that show as queued against the master but are not assigned to an executor.  When I check each job, it has already completed (success or failure).  They're just stuck.  Restarting the master doesn't help.

      Job logs all show evidence of trying to resume the job after master restart:

      [Pipeline] End of Pipeline
      Resuming build at Tue Feb 06 01:27:30 UTC 2018 after Jenkins restart
      [Pipeline] End of Pipeline

      There are a ton of entries in org.jenkinsci.plugins.workflow.flow.FlowExecutionList.xml. Are these the stuck jobs?  I tried clearing them and restarting the master, but they come back.

      Eventually it seems like the master gets overloaded with these stuck jobs, and stops processing or dispatching jobs to slaves.
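      One way to check whether the FlowExecutionList.xml entries mentioned above correspond to the stuck builds is to list them and compare against the queue. The following is a minimal sketch, not a confirmed description of the file format: it assumes the file is a list of flow-owner elements, each with a job and an id child, and the helper name list_flow_owners is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def list_flow_owners(xml_text):
    """Return (job, build id) pairs found in FlowExecutionList.xml content.

    ASSUMPTION: each entry is a <flow-owner> element with <job> and <id>
    children. Verify this against the file on your own master.
    """
    # Jenkins may write an XML 1.1 declaration, which Python's expat-based
    # parser does not accept, so drop any leading XML declaration first.
    body = re.sub(r"^\s*<\?xml[^>]*\?>", "", xml_text)
    root = ET.fromstring(body)
    owners = []
    for owner in root.iter("flow-owner"):
        owners.append((owner.findtext("job"), owner.findtext("id")))
    return owners
```

      Reading the file from JENKINS_HOME and passing its text to this function gives a list that can be compared by hand with the builds shown as queued against the master.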

          [JENKINS-49389] Completed pipeline jobs queued against master

          John Arnold added a comment - - edited

          Attached a copy-paste of the /threadDump page, which shows both the list of jobs queued against the master and all the associated threads. Note that there are a ton of these threads:

           

          org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#514]
          "org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#514]" Id=7005 Group=main TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@c7a8190
           at sun.misc.Unsafe.park(Native Method)
           -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@c7a8190
           at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
           at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
           at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:1129)
           at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:809)
           at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
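          To put a number on "a ton of these threads", the pasted /threadDump text can be tallied by thread name and state. This is a minimal sketch that assumes every thread header looks like the one above (quoted name, then Id=, Group=, and the state); the function name count_threads is illustrative only.

```python
import re
from collections import Counter

# Header line of one thread in the pasted dump, for example:
# "some.pool.Name [#514]" Id=7005 Group=main TIMED_WAITING on ...
HEADER = re.compile(
    r'^\s*"(?P<name>.+?)"\s+Id=\d+\s+Group=\S+\s+(?P<state>[A-Z_]+)'
)


def count_threads(dump_text):
    """Count threads by (name without pool index, thread state)."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = HEADER.match(line)
        if match:
            # Strip a trailing " [#514]" so threads from one pool group together.
            base = re.sub(r"\s*\[#\d+\]$", "", match.group("name"))
            counts[(base, match.group("state"))] += 1
    return counts
```

          Calling count_threads on the saved dump text and printing counts.most_common(20) shows which pools dominate, which makes it easier to compare dumps taken before and after a restart.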
          

           

           


          John Arnold added a comment -

          jglick svanoort Can you take a look? I can provide any data.


          John Arnold added a comment -

          I picked a stuck build, did 'Delete Build' in the GUI, and it still shows up as a stuck/queued build under the master.


          John Arnold added a comment -

          It seems like there should be a check – Jenkins should never attempt to resume a completed job.  Also seems like resume should timeout after some low threshold, 300sec default or something.


          Oleg Nenashev added a comment -

          Added to the Pipeline scrub queue, CC abayer svanoort


          Sam Van Oort added a comment -

          johnar I just released a big update to workflow-cps (v 2.47) and workflow-job (v 2.18) that fixes a large variety of related issues. Could you please install these updates and let us know if they resolve the issues here? Thanks!


          Vivek Pandey added a comment -

          As mentioned by svanoort, this issue has been fixed.

          John Arnold I just released a big update to workflow-cps (v 2.47) and workflow-job (v 2.18) that fixes a large variety of related issues. Could you please install these updates and let us know if they resolve the issues here? Thanks!


          Rafal Kowalski added a comment -

          Same bug on Jenkins 2.164.1

          workflow-cps 2.67

          workflow-job 2.32

          All pipeline jobs are being started on the master, which is causing load on the server.

            Assignee: Unassigned
            Reporter: John Arnold (johnar)