Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-28182

Revisit use of $JENKINS_SERVER_COOKIE and Launcher.kill

      PlaceholderExecutable needs an environment variable to add to the context to ensure that Launcher.kill, called when the block exits, will clean up any stray external processes run on this node.

      Originally the name of this variable was chosen to be JENKINS_SERVER_COOKIE, which is set by Job (to a confidential hex key) and normally used by AbstractBuild for this purpose. Unlike in that case, for Workflow the value is overridden (to a random UUID) for each node block, since there may be several. I think the intent was to use the same variable name to suppress any unwanted kills coming from Jenkins core code, but there seem to be none. Anyway after the many refactorings of environment variable handling in Workflow, it turns out this is dead code!

      • FileMonitoringTask sets its own value for JENKINS_SERVER_COOKIE, to durable-<workspaceHash>, which overrides any other value, and is used by stop. So within a shell step, $JENKINS_SERVER_COOKIE is the one from durable-task, not PlaceholderExecutable.
      • env.JENKINS_SERVER_COOKIE picks up the value from Job, due to the precedence order in getEffectiveEnvironment.

      So it seems that while the intention is still good (we would like a final check that any processes started inside node are killed when the block exits), the implementation is not right. Probably PlaceholderExecutable just needs to pick an unrelated environment variable for this purpose. And this logic needs to be tested; for example:

      def file = new File(...);
      node {
        sh "(sleep 5; touch $file) &"
      }
      sleep 10
      assert !file.exists()
      

          [JENKINS-28182] Revisit use of $JENKINS_SERVER_COOKIE and Launcher.kill

          jglick The kill here is in ExecutorStepExecution, which is scoped to the node level, not the step level, so the process killing happens at the end of the node block.  There is no kill for DurableStep currently.  My take on this is that this would be the desired behavior, but I'm open to changing it.  This would certainly solve the problem, but might have unintended consequences for some user's pipeline workflows.

          Matthew Mitchell added a comment - jglick  The kill here is in ExecutorStepExecution, which is scoped to the node level, not the step level, so the process killing happens at the end of the node block.  There is no kill for DurableStep currently.  My take on this is that this would be the desired behavior, but I'm open to changing it.  This would certainly solve the problem, but might have unintended consequences for some user's pipeline workflows.

          Jesse Glick added a comment -

          Hmm, probably FileMonitoringController.cleanup should be doing this.

          Jesse Glick added a comment - Hmm, probably FileMonitoringController.cleanup should be doing this.

          Reagan Elm added a comment -

          mmitche is there any way to opt-out of Launcher.kill for a specific node? Some of our jobs intentionally start log running scripts we need to survive after the node has completed.

          Reagan Elm added a comment - mmitche is there any way to opt-out of Launcher.kill for a specific node? Some of our jobs intentionally start log running scripts we need to survive after the node has completed.

          Jesse Glick added a comment -

          Clear the environment I suppose.

          Jesse Glick added a comment - Clear the environment I suppose.

          The documented way was always to alter the environment variable that is being checked, so for instance, in Freestyle jobs you'd do:

          JENKINS_SERVER_COOKIE=do_not_kill

          in the pipeline jobs you'd do:

          JENKINS_NODE_COOKIE=do_not_kill

          Matthew Mitchell added a comment - The documented way was always to alter the environment variable that is being checked, so for instance, in Freestyle jobs you'd do: JENKINS_SERVER_COOKIE=do_not_kill in the pipeline jobs you'd do: JENKINS_NODE_COOKIE=do_not_kill

          Hi all,

          sorry for reopening, I just received new jenkins update...

          Is it possible to don't reinvent variable names, but use the stable well-known approach from usual jobs?
          I am about BUILD_ID=dontKillMe variable.
          documented here https://wiki.jenkins.io/display/JENKINS/ProcessTreeKiller 

          everyone who uses jenkins a lot – know about this variable and it is extremely simple to receive info about this variable and process killing via searching.
          it is very strange and unusual to have different variables for old-school jobs and for pipelines.

           

           

          Mykola Marzhan added a comment - Hi all, sorry for reopening, I just received new jenkins update... Is it possible to don't reinvent variable names, but use the stable well-known approach from usual jobs? I am about BUILD_ID=dontKillMe variable. documented here  https://wiki.jenkins.io/display/JENKINS/ProcessTreeKiller   everyone who uses jenkins a lot – know about this variable and it is extremely simple to receive info about this variable and process killing via searching. it is very strange and unusual to have different variables for old-school jobs and for pipelines.    

          I don't think BUILD_ID was being used for killing processes before.

          Anyways, the classic approach doesn't work with pipelines.  Two executors on the same machine could run parts of the same build in parallel.  This means that when the process killer attempts to kill BUILD_ID it will kill the other executor's processes.

          We could perhaps introduce an additional environment variable, and check for both.  Thoughts jglick?

          Matthew Mitchell added a comment - I don't think BUILD_ID was being used for killing processes before. Anyways, the classic approach doesn't work with pipelines.  Two executors on the same machine could run parts of the same build in parallel.  This means that when the process killer attempts to kill BUILD_ID it will kill the other executor's processes. We could perhaps introduce an additional environment variable, and check for both.  Thoughts jglick ?

          Jesse Glick added a comment -

          Does not need to be reopened.

          I see no reason to look for BUILD_ID. As pointed out, this does not generalize well to parallel blocks.

          Jesse Glick added a comment - Does not need to be reopened. I see no reason to look for BUILD_ID . As pointed out, this does not generalize well to parallel blocks.

          Hi jglick
          it is not needed to look only to BUILD_ID but it is possible to look also to BUILD_ID.
          it means kill process only in case if JENKINS_NODE_COOKIE and BUILD_ID are unchanged.

          I know that pipelines are a completely new thing, but jenkins itself is not a new thing and it very important to keep backward compatibility.
          Many people work with jenkins many years and these new variables are completely unexpected. 

          Mykola Marzhan added a comment - Hi  jglick it is not needed to look only to BUILD_ID but it is possible to look also to BUILD_ID . it means kill process only in case if JENKINS_NODE_COOKIE and BUILD_ID are unchanged. I know that pipelines are a completely new thing, but jenkins itself is not a new thing and it very important to keep backward compatibility. Many people work with jenkins many years and these new variables are completely unexpected. 

          Jesse Glick added a comment -

          Could be a follow-up RFE (linked issue).

          Jesse Glick added a comment - Could be a follow-up RFE (linked issue).

            mmitche Matthew Mitchell
            jglick Jesse Glick
            Votes:
            1 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: