JENKINS-63164

Closures in block-scoped steps can reference dead CpsBodyExecutions

    • workflow-cps 2.82

      Very similar to JENKINS-53709 and JENKINS-41791. In some cases a reference to the CpsBodyExecution of a block-scoped step can outlive the actual execution of that step, causing a CpsThread that is no longer part of the CpsThreadGroup to stick around in the program.

      The easiest way to run into this issue is to create a closure in a node step, allow the closure to outlive the step (for example by assigning it to a variable), and then restart Jenkins after the node step has completed but while the closure is still reachable in the Pipeline. The PlaceholderTask for the node step will be rehydrated when the Pipeline resumes, even though the node step has already completed. If you are using dynamically provisioned agents, the Pipeline will either hang forever waiting for that agent or fail if Jenkins can tell that the node will never be available.

      The specific issue is that CpsBodyExecution.onSuccess and CpsBodyExecution.onFailure are injected as continuations into the program. Those objects are instances of non-static inner classes in CpsBodyExecution because they need to access the fields and methods of the CpsBodyExecution, but this means that the program's serialized state includes the CpsBodyExecution and all of its state, most notably its CpsThread field.
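
      The capture described above can be illustrated outside Jenkins. The sketch below is not code from workflow-cps: the class names (InnerClassCapture, Execution, InnerCallback, StaticCallback) and the 64 KiB array are hypothetical stand-ins, and plain java.io serialization is assumed as a substitute for however the Pipeline program is actually persisted. It only shows that serializing an instance of a non-static inner class also serializes the enclosing instance and everything that instance holds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class InnerClassCapture {
    /** Hypothetical stand-in for a body execution that holds heavyweight state. */
    static class Execution implements Serializable {
        private static final long serialVersionUID = 1L;
        // Stand-in for state such as a thread reference; 64 KiB makes it easy to spot.
        final byte[] heavyState = new byte[64 * 1024];

        /** Non-static inner class: keeps an implicit reference to the enclosing Execution. */
        class InnerCallback implements Serializable {
            private static final long serialVersionUID = 1L;
            int stateSize() { return heavyState.length; } // reads a field of the outer instance
        }

        /** Static nested class: holds only the values passed to it explicitly. */
        static class StaticCallback implements Serializable {
            private static final long serialVersionUID = 1L;
            final int stateSize;
            StaticCallback(int stateSize) { this.stateSize = stateSize; }
        }
    }

    /** Returns the number of bytes java.io serialization writes for the given object graph. */
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Execution execution = new Execution();
        Execution.InnerCallback inner = execution.new InnerCallback();
        Execution.StaticCallback nested =
                new Execution.StaticCallback(execution.heavyState.length);
        // The inner callback drags the whole Execution (including heavyState) along.
        System.out.println("inner callback:  " + serializedSize(inner) + " bytes");
        // The static callback serializes only its own int field.
        System.out.println("static callback: " + serializedSize(nested) + " bytes");
    }
}
```

      In this sketch the inner callback serializes to more than 64 KiB because the enclosing Execution comes with it, while the static callback stays small. The same mechanism is what lets a callback keep a completed CpsBodyExecution, and its CpsThread field, in the saved program.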

      Originally noticed because RestartingLoadStepTest.updatedBindingsOnRestart was flaky; see jenkinsci/workflow-cps-plugin#366.

      Minimal reproducer: 

      def closure = null
      node {
        closure = { -> "This closure captures CpsBodyExecution" }
      }
      // node step is complete, but its execution is reachable via the closure's captured variables.
      // Agent is rehydrated if you restart Jenkins here.
      echo(closure())
      

          Vincent Latombe added a comment -

          dnusbaum If I understand this properly, this is the same as JENKINS-39552?

          Devin Nusbaum added a comment -

          vlatombe This could be at least one cause of JENKINS-39552, but the examples in the description of that issue look more like the bugs fixed by JENKINS-53709 and JENKINS-41791.

          Devin Nusbaum added a comment -

          A fix for this issue was released in Pipeline: Groovy plugin version 2.82.

          Jesse Glick added a comment -

          A similar problem (ref.: SECO-1832) was observed in a pipeline that saved a closure created inside a branch to a data structure that persisted past the end of the parallel step. Not sure if there is an easy reproduction case. dnusbaum suggests:

          "Perhaps we could come up with a change that would null out various fields in CpsBodyExecution once it completes to avoid these kinds of issues with accidental captures."
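
          As an illustration of that suggestion only (this is not the plugin's code), the sketch below shows the general "clear fields on completion" idea. The names ClearOnCompletion, Execution, Callback, and complete() are hypothetical, and java.io serialization is assumed as a stand-in for the Pipeline's own persistence. A callback that still references its enclosing object stops dragging heavyweight state into the serialized form once that state has been nulled out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ClearOnCompletion {
    /** Hypothetical stand-in for a body execution; not the plugin's class. */
    static class Execution implements Serializable {
        private static final long serialVersionUID = 1L;
        // Stand-in for heavyweight state such as a thread reference.
        private byte[] heavyState = new byte[64 * 1024];

        /** Inner callback that still points at its enclosing Execution. */
        class Callback implements Serializable {
            private static final long serialVersionUID = 1L;
            void onSuccess() { complete(); } // delegates to the outer instance
        }

        /** Drops the heavyweight state once the body has finished. */
        void complete() { heavyState = null; }
    }

    /** Returns the number of bytes java.io serialization writes for the given object graph. */
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Execution execution = new Execution();
        Execution.Callback callback = execution.new Callback();
        System.out.println("before completion: " + serializedSize(callback) + " bytes");
        callback.onSuccess(); // completion clears the heavy field
        System.out.println("after completion:  " + serializedSize(callback) + " bytes");
    }
}
```

          The callback is still reachable after completion in this sketch, but what it reaches is an empty shell. Whether the same approach is safe for every field of CpsBodyExecution is a separate question that the sketch does not address.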
