  1. Jenkins
  2. JENKINS-75766

Zombie builds can abort other jobs running on the same node

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • None
    • workflow-durable-task-step:1378.v6a_3e903058a_3+

      A zombie build living on a node can terminate other jobs running on that node. This appears to have started with https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/405.

      The zombie build shows the repeated attempts to abort it in its console log:

      ERROR: node block appears to be neither running nor scheduled; will cancel if this condition persists
      ERROR: node block still appears to be neither running nor scheduled; cancelling
      ERROR: node block appears to be neither running nor scheduled; will cancel if this condition persists
      ERROR: node block still appears to be neither running nor scheduled; cancelling
      [...]
      

      A zombie build is a build execution that is stuck inside a node block and that the automatic attempts to interrupt it have failed to stop. Reproducing that state is the biggest hurdle. Once a zombie exists, however, other builds that use the node the zombie was stuck on may be interrupted when ExecutorStepExecution$AnomalousStatus kicks in. In that case, the console output of those builds shows:

      ERROR: also cancelling shell steps running on <agentName>
      [...]
      Agent was removed
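
To tell the two kinds of builds apart when triaging saved console logs, the signatures quoted above can be searched for directly. The following is only a minimal sketch, not part of Jenkins or the plugin: the helper name `classify_console_log`, the threshold of two repeated cancellation messages, and the assumption that agent names contain no whitespace are all illustrative choices. Only the message texts are taken from the logs in this report.

```python
import re

# Message texts copied from the console output quoted in this issue.
ZOMBIE_MARKER = "node block still appears to be neither running nor scheduled; cancelling"
VICTIM_PATTERN = re.compile(r"also cancelling shell steps running on (\S+)")

# Assumption: a build that logs the cancellation message more than once
# was not actually stopped by the first attempt, i.e. it may be a zombie.
ZOMBIE_THRESHOLD = 2


def classify_console_log(text: str) -> dict:
    """Hypothetical helper: flag zombie and victim signatures in a console log."""
    cancel_attempts = text.count(ZOMBIE_MARKER)
    # Assumption: agent names contain no whitespace; names with spaces
    # would be truncated at the first space by this pattern.
    agents = sorted(set(VICTIM_PATTERN.findall(text)))
    return {
        "cancel_attempts": cancel_attempts,
        "looks_like_zombie": cancel_attempts >= ZOMBIE_THRESHOLD,
        "cancelled_on_agents": agents,
    }
```

A build whose log yields `looks_like_zombie` is a candidate for the stuck execution, while a non-empty `cancelled_on_agents` suggests the build had its shell steps cancelled on the listed agents.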
      

            jglick Jesse Glick
            allan_burdajewicz Allan BURDAJEWICZ
            Votes: 2
            Watchers: 4
