  1. Jenkins
  2. JENKINS-39552

After restart, interrupted pipeline deadlocks waiting for executor

    Details

      Description

      I had a pipeline build running when I restarted Jenkins. After Jenkins came back up, the log for one of the parallel steps in the build showed:

      Resuming build at Mon Nov 07 13:11:05 CET 2016 after Jenkins restart
      Waiting to resume part of Atlassian Bitbucket » honey » master #4: ???
      Waiting to resume part of Atlassian Bitbucket » honey » master #4: Waiting for next available executor on bcubuntu32

      The last message then repeated every few minutes. The slave bcubuntu32 has only one executor, and it seems that this executor was "used up" by the very task that was waiting for an available executor.
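
      To see which task holds the executor while the message repeats, something like the following could be run in the Jenkins Script Console. It is a minimal diagnostic sketch, not part of the original report: it lists each queued item with the scheduler's stated reason for blocking, then each executor and its current work. It assumes only the core classes `jenkins.model.Jenkins`, `hudson.model.Queue`, and `hudson.model.Executor`.

      ```groovy
      // Script Console sketch (assumption: run by an administrator on the affected instance).
      import jenkins.model.Jenkins

      def jenkins = Jenkins.instance

      // Queued items and the reason the scheduler gives for not running them,
      // e.g. "Waiting for next available executor on <node>".
      jenkins.queue.items.each { item ->
          println "QUEUED ${item.task.fullDisplayName}: ${item.why}"
      }

      // Every executor on every computer, with its current work if it has any.
      jenkins.computers.each { computer ->
          computer.executors.each { executor ->
              def work = executor.currentExecutable ?: 'idle'
              println "${computer.displayName} executor #${executor.number}: ${work}"
          }
      }
      ```

      If the only executor on bcubuntu32 is shown as busy with a part of the same build that is also queued for that node, that would match the behavior described here.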

      After I went into the configuration and changed number of executors to 2, the build continued as normal.
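
      The same workaround can probably be applied without the configuration page. The Script Console sketch below raises the executor count of the agent to 2. It assumes the agent is a `hudson.model.Slave` that exposes `setNumExecutors`, as in recent Jenkins versions, and that `Jenkins.updateNode` is enough to persist the change. The agent name is taken from the log above.

      ```groovy
      // Script Console sketch of the workaround (assumption: agent is a hudson.model.Slave).
      import hudson.model.Slave
      import jenkins.model.Jenkins

      def agentName = 'bcubuntu32'   // agent named in the log output
      def node = Jenkins.instance.getNode(agentName)

      if (node instanceof Slave) {
          node.setNumExecutors(2)             // one executor more than the original single one
          Jenkins.instance.updateNode(node)   // save the changed node configuration
          println "${agentName} now has ${node.numExecutors} executors"
      } else {
          println "No agent named ${agentName} found"
      }
      ```

      This only frees a slot for the blocked task; it does not address why the resumed build claimed the executor in the first place.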

      A possibly related issue: before the restart, I put Jenkins in quiet mode, but the same build agent hung at the end of the pipeline part that was running and never finished the build. In the end I restarted without waiting for that part to finish.

      How to reproduce

      • In a fresh Jenkins instance, set the number of executors on master to 1
      • Create job-1 and job-2 as follows

        job-1:

        node {
            parallel "parallel-1": {
                sh "true"
            }, "parallel-2": {
                sh "true"
            }
        }
        build 'job-2'

        job-2:

        node {
            sh "sleep 300"
        }
        

      Start a build of job-1, wait for the node block in job-2 to start, then restart Jenkins.

      When Jenkins comes back online, you'll see a deadlock.

      It seems job-1 is trying to come back on the node it used before the restart, even though its current state doesn't require any node.
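
      For the first reproduction step, the executor count of the built-in node (master) can also be set from the Script Console. This is a small sketch assuming a disposable test instance; `Jenkins.setNumExecutors` is a core method, and the value 1 comes from the steps above.

      ```groovy
      // Script Console sketch: restrict the built-in node to one executor for the reproduction.
      import jenkins.model.Jenkins

      Jenkins.instance.setNumExecutors(1)
      println "Built-in node executors: ${Jenkins.instance.numExecutors}"
      ```

      With a single executor, job-2's node block and any executor slot that job-1 reclaims on resume have to compete for the same slot, which is the condition the deadlock depends on.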

            Activity

            estyrke Emil Styrke created issue -
            estyrke Emil Styrke made changes -
            Field Original Value New Value
            Epic Link JENKINS-35399 [ 171192 ]
            elatt Erik Lattimore made changes -
            Link This issue relates to JENKINS-43587 [ JENKINS-43587 ]
            vlatombe Vincent Latombe made changes -
            Attachment pipeline_restart_deadlock.png [ 37800 ]
            jamesdumay James Dumay made changes -
            Labels: pipeline → cloudbees-internal-pipeline pipeline
            vlatombe Vincent Latombe made changes -
            Description: added the "How to reproduce" section (job-1 and job-2 definitions, restart steps, and the pipeline_restart_deadlock.png thumbnail); the rest of the text is unchanged
            cloudbees CloudBees Inc. made changes -
            Remote Link This issue links to "CloudBees Internal CD-29 (Web Link)" [ 19126 ]
            abayer Andrew Bayer made changes -
            Component/s workflow-durable-task-step-plugin [ 21715 ]
            Component/s pipeline [ 21692 ]
            vivek Vivek Pandey made changes -
            Labels: cloudbees-internal-pipeline pipeline → cloudbees-internal-pipeline pipeline triaged-2018-11
            dnusbaum Devin Nusbaum made changes -
            Link This issue duplicates JENKINS-53709 [ JENKINS-53709 ]
            dnusbaum Devin Nusbaum made changes -
            Resolution Duplicate [ 3 ]
            Status: Open [ 1 ] → Closed [ 6 ]
            dnusbaum Devin Nusbaum made changes -
            Link This issue relates to JENKINS-41791 [ JENKINS-41791 ]

              People

              Assignee:
              Unassigned
              Reporter:
              estyrke Emil Styrke
              Votes:
              11
              Watchers:
              22

                Dates

                Created:
                Updated:
                Resolved: