Jenkins / JENKINS-53709

Parallel blocks in node blocks cause executors to be persisted outside of the node block

Details

    • Released As: Pipeline Groovy 2.56

    Description

      When a parallel step is nested in a node step, the executor associated with the node appears to outlive both the parallel and node steps. As a result, the executor is rehydrated when the pipeline resumes after a Jenkins restart, even if execution has already left the node block.

      Reproduction test case:

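      @Test public void shouldNotLeakExecutorsViaContextVars() {
          // First session: run a parallel step inside a node block, then pause outside the node block.
          story.then(r -> {
              DumbSlave s = r.createOnlineSlave();
              WorkflowJob p = r.jenkins.createProject(WorkflowJob.class, "demo");
              p.setDefinition(new CpsFlowDefinition("node('" + s.getNodeName() + "') {\n" +
                      "  parallel one: {\n" +
                      "    echo '" + s.getNodeName() + "'\n" +
                      "  }\n" +
                      "}\n" +
                      "semaphore 'wait'\n", false));
              WorkflowRun b = p.scheduleBuild2(0).waitForStart();
              // Once the semaphore step starts, the node block has already completed.
              SemaphoreStep.waitForStart("wait/1", b);
              // Remove the agent so that a leaked executor cannot be restored in the next session.
              r.jenkins.removeNode(s);
          });
          // Second session (after the restart): the build should resume without needing the removed agent.
          story.then(r -> {
              WorkflowRun b = r.jenkins.getItemByFullName("demo", WorkflowJob.class).getBuildByNumber(1);
              SemaphoreStep.waitForStart("wait/1", b);
              SemaphoreStep.success("wait/1", null);
              // While the build finishes, its log must never show it waiting for the agent's label.
              while (b.isBuilding()) {
                  r.assertLogNotContains("Jenkins doesn’t have label", b);
                  Thread.sleep(100);
              }
              r.assertBuildStatusSuccess(b);
          });
      }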

      This test currently fails because the pipeline waits for the 'Test' agent to become available after restarting even though we are not in a node block.
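
      The kind of leak described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses plain Java serialization, and every type in it (FakeExecutor, FakeStepExecution, LeakyHandler, LeanHandler) is hypothetical and is not a Jenkins or Pipeline class. It shows that a saved callback which keeps a reference to a block's execution also brings that block's executor back when the state is restored, while a callback that keeps only the data it needs does not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model only: none of these types are Jenkins or Pipeline classes. */
final class ExecutorLeakSketch {

    /** Counts how many FakeExecutor instances were restored from saved state. */
    static final AtomicInteger REHYDRATED = new AtomicInteger();

    /** Hypothetical stand-in for an executor slot tied to a named agent. */
    static final class FakeExecutor implements Serializable {
        private static final long serialVersionUID = 1L;
        final String agentName;

        FakeExecutor(String agentName) {
            this.agentName = agentName;
        }

        /** Called by Java serialization after this object has been read back. */
        private Object readResolve() {
            REHYDRATED.incrementAndGet();
            return this;
        }
    }

    /** Hypothetical stand-in for a block-scoped step execution that holds the executor. */
    static final class FakeStepExecution implements Serializable {
        private static final long serialVersionUID = 1L;
        final FakeExecutor executor;

        FakeStepExecution(FakeExecutor executor) {
            this.executor = executor;
        }
    }

    /** Callback that keeps the whole execution reachable, executor included. */
    static final class LeakyHandler implements Serializable {
        private static final long serialVersionUID = 1L;
        final FakeStepExecution stepExecution;

        LeakyHandler(FakeStepExecution stepExecution) {
            this.stepExecution = stepExecution;
        }
    }

    /** Callback that keeps only the data it still needs once the block is done. */
    static final class LeanHandler implements Serializable {
        private static final long serialVersionUID = 1L;
        final String branchName;

        LeanHandler(String branchName) {
            this.branchName = branchName;
        }
    }

    /** Saves and restores the state, returning how many executors came back with it. */
    static int rehydratedExecutors(Serializable state) {
        REHYDRATED.set(0);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
        return REHYDRATED.get();
    }

    public static void main(String[] args) {
        FakeExecutor executor = new FakeExecutor("agent-1");
        LeakyHandler leaky = new LeakyHandler(new FakeStepExecution(executor));
        LeanHandler lean = new LeanHandler("one");
        System.out.println("leaky handler restores executors: " + rehydratedExecutors(leaky)); // 1
        System.out.println("lean handler restores executors: " + rehydratedExecutors(lean)); // 0
    }
}
```

      In this model the restored executor count depends only on what the saved callback can reach, not on whether the block that acquired the executor has finished. Whether the real issue follows the same pattern is not established by this sketch.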

          Activity

            dnusbaum Devin Nusbaum created issue -
            dnusbaum Devin Nusbaum made changes -
            Field Original Value New Value
            Status Open [ 1 ] In Progress [ 3 ]
            dnusbaum Devin Nusbaum made changes -
            Description When a {{parallel}} step is nested in a {{node}} step, the executor associated with the node appears to outlive both the {{parallel}} and {{node}} steps. This leads to the executor being rehydrated when a pipeline is restarted, even if the pipeline is outside of the node block.

            Reproduction test case:

            {code}
                @Test public void shouldNotLeakExecutorsViaContextVars() {
                    story.then(r -> {
                        DumbSlave s = r.createOnlineSlave();
                        WorkflowJob p = r.jenkins.createProject(WorkflowJob.class, "demo");
                        p.setDefinition(new CpsFlowDefinition("node('" + s.getNodeName() + "') {\n" +
                                " parallel one: {\n" +
                                " echo '" + s.getNodeName() + "'\n" +
                                " }\n" +
                                "}\n" +
                                "semaphore 'wait'\n", false));
                        WorkflowRun b = p.scheduleBuild2(0).waitForStart();
                        SemaphoreStep.waitForStart("wait/1", b);
                        r.disconnectSlave(s);
                    });
                    story.then(r -> {
                        WorkflowRun b = r.jenkins.getItemByFullName("demo", WorkflowJob.class).getBuildByNumber(1);
                        SemaphoreStep.waitForStart("wait/1", b);
                        SemaphoreStep.success("wait/1", null);
                        while (b.isBuilding()) {
                            r.assertLogNotContains(" is offline", b);
                            Thread.sleep(100);
                        }
                        r.assertBuildStatusSuccess(b);
                    });
                }
            {code}

            This test currently fails because the pipeline waits for the 'Test' agent to become available after restarting even though we are not in a node block.

            From a quick investigation, I think this may have been introduced by JENKINS-26034 ([commit|https://github.com/jenkinsci/workflow-cps-plugin/commit/c8c668f2b60a19c33add92e2b14345f23f58aabc]), because if I remove [ResultHandler.stepExecution|https://github.com/jenkinsci/workflow-cps-plugin/blob/54d2f4fe8069fde53789bfe21229ce8e545300bb/src/main/java/org/jenkinsci/plugins/workflow/cps/steps/ParallelStep.java#L70], the test case passes successfully. I'm not sure if we shouldn't be persisting the execution there, or if we need to clear it out after the step completes, or if the persistence is fine and the root problem is somewhere else.
            When a {{parallel}} step is nested in a {{node}} step, the executor associated with the node appears to outlive both the {{parallel}} and {{node}} steps. This leads to the executor being rehydrated when a pipeline is restarted, even if the pipeline is outside of the node block.

            Reproduction test case:

            {code}
            @Test public void shouldNotLeakExecutorsViaContextVars() {
                story.then(r -> {
                    DumbSlave s = r.createOnlineSlave();
                    WorkflowJob p = r.jenkins.createProject(WorkflowJob.class, "demo");
                    p.setDefinition(new CpsFlowDefinition("node('" + s.getNodeName() + "') {\n" +
                            " parallel one: {\n" +
                            " echo '" + s.getNodeName() + "'\n" +
                            " }\n" +
                            "}\n" +
                            "semaphore 'wait'\n", false));
                    WorkflowRun b = p.scheduleBuild2(0).waitForStart();
                    SemaphoreStep.waitForStart("wait/1", b);
                    r.jenkins.removeNode(s);
                });
                story.then(r -> {
                    WorkflowRun b = r.jenkins.getItemByFullName("demo", WorkflowJob.class).getBuildByNumber(1);
                    SemaphoreStep.waitForStart("wait/1", b);
                    SemaphoreStep.success("wait/1", null);
                    while (b.isBuilding()) {
                        r.assertLogNotContains("Jenkins doesn’t have label", b);
                        Thread.sleep(100);
                    }
                    r.assertBuildStatusSuccess(b);
                });
            }
            {code}

            This test currently fails because the pipeline waits for the 'Test' agent to become available after restarting even though we are not in a node block.

            From a quick investigation, I think this may have been introduced by JENKINS-26034 ([commit|https://github.com/jenkinsci/workflow-cps-plugin/commit/c8c668f2b60a19c33add92e2b14345f23f58aabc]), because if I remove [ResultHandler.stepExecution|https://github.com/jenkinsci/workflow-cps-plugin/blob/54d2f4fe8069fde53789bfe21229ce8e545300bb/src/main/java/org/jenkinsci/plugins/workflow/cps/steps/ParallelStep.java#L70], the test case passes successfully. I'm not sure if we shouldn't be persisting the execution there, or if we need to clear it out after the step completes, or if the persistence is fine and the root problem is somewhere else.
            dnusbaum Devin Nusbaum made changes -
            Status In Progress [ 3 ] In Review [ 10005 ]
            dnusbaum Devin Nusbaum made changes -
            Remote Link This issue links to "jenkinsci/workflow-cps-plugin#245 (Web Link)" [ 21829 ]
            dnusbaum Devin Nusbaum added a comment -

            Fix released in Pipeline Groovy version 2.56.

            dnusbaum Devin Nusbaum made changes -
            Released As Pipeline Groovy 2.56
            Resolution Fixed [ 1 ]
            Status In Review [ 10005 ] Resolved [ 5 ]
            dnusbaum Devin Nusbaum made changes -
            Link This issue is duplicated by JENKINS-51539 [ JENKINS-51539 ]
            jglick Jesse Glick made changes -
            Link This issue relates to JENKINS-41791 [ JENKINS-41791 ]
            dnusbaum Devin Nusbaum made changes -
            Link This issue relates to JENKINS-63164 [ JENKINS-63164 ]
            dnusbaum Devin Nusbaum made changes -
            Link This issue is duplicated by JENKINS-39552 [ JENKINS-39552 ]

            People

              Assignee: dnusbaum Devin Nusbaum
              Reporter: dnusbaum Devin Nusbaum