[JENKINS-41854] Contextualize a fresh FilePath after an agent reconnection

    • Released as: workflow-durable-task-step 2.31, workflow-basic-steps 2.18

      PlaceholderExecutable does not listen for a closed connection as such; it simply lets any nested step that is actually using the agent fail if the connection dies in the middle. Usually this is fine, but there is a corner case where it is probably wrong (unconfirmed): if an agent is disconnected and reconnected during the first sh in

      node {
        sh 'sleep 999'
        sh 'sleep 999'
      }
      

      then the second sh could fail, since it would still be using the old Channel. The first sh can succeed regardless, because DurableTaskStep.Execution recomputes the FilePath after the ChannelClosedException. (But if Jenkins were restarted during the first sh, after the reconnection, then the second sh should work, since the FilePath would be reconstructed from a FilePathPickle.)

      The fix may be tricky since FilePath.channel is effectively final, so currently whatever workspace is passed from PlaceholderExecutable will be used for the duration of the block. Perhaps BodyInvoker.withContexts should support offering a Provider of contextual objects: in this case, something that caches a FilePath so long as it is valid (!Channel.outClosed?) and otherwise falls back to FilePathUtils.find, as FilePathPickle does.
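
      The Provider idea above can be illustrated with a minimal, self-contained sketch. It does not use any Jenkins classes and is not the fix that was eventually released: FakeChannel, FakePath, and CachingPathProvider are hypothetical stand-ins for Channel, FilePath, and the proposed Provider, and the lookup supplier only plays the role that FilePathUtils.find might play.

```java
import java.util.function.Supplier;

// Hypothetical model only: none of these types exist in Jenkins.
final class FreshHandleSketch {

    /** Stand-in for a remoting channel; it only tracks whether it was closed. */
    static final class FakeChannel {
        private volatile boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    /** Stand-in for a FilePath: bound to one channel for its whole life. */
    static final class FakePath {
        final FakeChannel channel; // final, mirroring the constraint in the issue
        final String remote;
        FakePath(FakeChannel channel, String remote) {
            this.channel = channel;
            this.remote = remote;
        }
    }

    /** Returns the cached path while its channel is open, else looks up a fresh one. */
    static final class CachingPathProvider implements Supplier<FakePath> {
        private final Supplier<FakePath> lookup; // assumed to return null when offline
        private FakePath cached;

        CachingPathProvider(FakePath initial, Supplier<FakePath> lookup) {
            this.cached = initial;
            this.lookup = lookup;
        }

        @Override
        public synchronized FakePath get() {
            if (cached == null || cached.channel.isClosed()) {
                FakePath fresh = lookup.get();
                if (fresh == null) {
                    throw new IllegalStateException("agent is offline");
                }
                cached = fresh; // rebind to the new channel
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        String workspace = "/home/agent/workspace/job"; // illustrative path
        FakeChannel[] current = { new FakeChannel() };
        Supplier<FakePath> lookup = () ->
                current[0].isClosed() ? null : new FakePath(current[0], workspace);
        CachingPathProvider provider =
                new CachingPathProvider(new FakePath(current[0], workspace), lookup);

        FakePath first = provider.get();   // used by the first sh
        current[0].close();                // agent disconnects
        current[0] = new FakeChannel();    // agent reconnects on a new channel
        FakePath second = provider.get();  // used by the second sh

        System.out.println(first == second);           // false: a fresh handle
        System.out.println(second.channel.isClosed()); // false: bound to the live channel
    }
}
```

      In this model each step asks the provider for a path when it runs instead of holding on to the object captured at the start of the block, so a reconnection between the two sh steps is picked up. It does not model the workspace locking that a real fix would also have to consider.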

          Jesse Glick added a comment -

          oleg_nenashev reminds me that fixing this would probably involve also acquiring a fresh WorkspaceList lock, after a new Computer becomes available and hence a new WorkspaceList.

          Basil Crow added a comment -

          Is there any workaround for this issue? Is there any way for me to run my scripted pipeline on a node without using the problematic PlaceholderExecutable?

          Jesse Glick added a comment -

          If the issue indeed exists in the described form, I would not expect there to be any general workaround. The build would just fail. Possibly you could retry the second sh. You could consolidate the scripts, if there were no crucial intervening steps of other kinds. You could run each sh in its own node, using stash and unstash as needed.

          Basil Crow added a comment -

          > If the issue indeed exists in the described form

          The issue does indeed exist in the described form, as my production experience shows (described in JENKINS-50504, which I have now marked as a duplicate of this bug). I can also confirm that restarting Jenkins fixes the problem as described above by reconstructing the FilePath from a FilePathPickle. However, between the time that I hit this bug and the time I restart Jenkins, in-use workspaces are handed out to new runs, causing both runs to fail. This is a huge inconvenience for my users.

          I have written a reproducible test case.

          Devin Nusbaum added a comment - edited

          Fixes for this issue have just been released in Pipeline Nodes and Processes Plugin version 2.31 and Pipeline Basic Steps Plugin version 2.18. You must update Pipeline Groovy Plugin to version 2.70 at the same time you update the other plugins.

          Jesse Glick added a comment -

          (Or you may update Pipeline: Groovy first, and then the other plugins later.)

          Basil Crow added a comment -

          Well done! Thank you for fixing this long-standing robustness issue. This is much appreciated!

            jglick Jesse Glick