Hi,

      We found a potential bug that can only be reproduced in pipeline jobs. When a job is running and Jenkins restarts, the job is left hanging indefinitely:

      Resuming build at Tue Jan 03 10:37:18 UTC 2017 after Jenkins restart
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      Waiting to resume part of TestRun2 #2: Waiting for next available executor
      ...
      

      I noticed that other job types, e.g. freestyle, do not exhibit this behaviour.

      Here is a simple test pipeline script:

      node('XXXXX') {
      
        stage 'Stage 1'
          println 'Deploying to Stage 1...'
      
        stage 'Stage 2'
          println 'Running Tests in Stage 2'
          sleep 120
          println 'Tests passed!'
      
        stage 'Stage 3'
          println 'Deploying to Stage 3...'
      
      }
      

      To reproduce the behaviour, restart Jenkins as soon as the build enters Stage 2.
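
The test script above uses the older `stage 'name'` form. On installations where that form is deprecated, the same reproduction could probably be written with block-scoped stages. This is only a sketch: it assumes a Pipeline Stage Step plugin version that supports `stage('name') { ... }`, and `'XXXXX'` remains the agent label placeholder from the original script.

```groovy
// Sketch only: the same three-stage reproduction using block-scoped stages.
// 'XXXXX' is the agent label placeholder taken from the original script.
node('XXXXX') {
    stage('Stage 1') {
        echo 'Deploying to Stage 1...'
    }
    stage('Stage 2') {
        echo 'Running Tests in Stage 2'
        // Restart Jenkins while this step is sleeping (120 seconds).
        sleep 120
        echo 'Tests passed!'
    }
    stage('Stage 3') {
        echo 'Deploying to Stage 3...'
    }
}
```

Whether the hang reproduces identically with this form has not been verified here; it is meant only to make the steps easier to try on newer setups.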

      I am currently using version 2.3, but I believe this issue could also be reproduced in previous versions.

      Could you please help explain why this behaviour only occurs in pipeline jobs?

      Kind Regards,
      Tuan

          [JENKINS-40771] Race condition in FlowExecutionList

          Tuan Nguyen created issue -
          Tuan Nguyen made changes -
          Summary Original: Pipeline causes job to hang infinitely on restart New: Pipeline causes job to hang infinitely on Jenkins restart

          Mike Cating added a comment -

          Seeing very similar behavior on JENKINS_VERSION = 2.32.1, except the message is slightly different:

          Resuming build at Sat Jan 28 18:39:23 UTC 2017 after Jenkins restart
          Waiting to resume Unknown Pipeline node step: <AWS instance id> is offline

          Hosh added a comment -

          I'm having a similar issue. In my case, I'm backing up the jobs directory and restoring it before starting Jenkins:

          [Pipeline] {
          [Pipeline] sh
          [jenkins-backup] Running shell script
          + mktemp jenkins-jobs-XXXXXXX.tar.gz
          [Pipeline] stage
          [Pipeline] { (Backup build history)
          [Pipeline] sh
          Resuming build at Tue Jan 31 16:40:33 GMT 2017 after Jenkins restart
          Waiting to resume Unknown Pipeline node step: ???
          [jenkins-backup] Running shell script
          Ready to run at Tue Jan 31 16:40:36 GMT 2017
          

          Jesse Glick made changes -
          Component/s New: workflow-basic-steps-plugin [ 21712 ]
          Component/s Original: workflow-aggregator-plugin [ 21710 ]

          Jesse Glick added a comment -

          Each case is potentially a distinct bug, and details matter a lot in terms of producing complete steps to reproduce from scratch.

          Jesse Glick made changes -
          Resolution New: Incomplete [ 4 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          Matthew Hall added a comment -

          This is another case; I've copied the following from JENKINS-33761:

          Hello, I have also recently come across the bug of jobs not restarting. I can provide a test case to help with the investigation; three jobs are required:

          Job 1 will trigger job_40_sec and job_50_sec in parallel.

          If Jenkins restarts or is killed while job_40_sec and job_50_sec are both running, then when Jenkins comes back online only one of the jobs is restarted, whilst the other hangs indefinitely.

          Please let me know if you need any more information, or if this is the wrong place for this information.

          Pipeline scripts:

          Job 1

          Map parallel_jobs = ['branch_1': {build job: 'job_50_sec'},
                               'branch_2': {build job: 'job_40_sec'}]
          parallel parallel_jobs

          job_40_sec

          node { sleep(40) }

          job_50_sec

          node { sleep(50) }

          Tuan Nguyen made changes -
          Resolution Original: Incomplete [ 4 ]
          Status Original: Resolved [ 5 ] New: Reopened [ 4 ]

            jglick Jesse Glick
            derng Tuan Nguyen
            Votes: 3
            Watchers: 8
