• durable-task 1.31

      The durable task step currently uses nohup to launch a durable process. But if Jenkins is started from an interactive terminal and the user presses Ctrl+C, the forked process is killed anyway. So far we've brushed it off, saying this is not how Jenkins is typically run. That may be true, but it is also a perfectly reasonable way to run Jenkins, for example for a first-time evaluation.

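      These processes get killed with Ctrl+C because the terminal sends SIGINT to every process in the foreground process group (source: http://superuser.com/questions/708919/ctrlc-in-a-sub-process-is-killing-a-nohuped-process-earlier-in-the-script). Looking at nohup.c, nohup only ignores SIGHUP (https://gist.github.com/kohsuke/0eeb9bb43ca8d62643dd#file-nohup-c-L219). You can confirm this by running a command like nohup sleep 30 from the command line, hitting Ctrl+C, and observing that the sleep 30 process gets killed.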

      The root problem is that nohup is a poor way to isolate a child process. Specifically, it doesn't put the process into a new process group, so the process is vulnerable to any signal sent to the entire process group (the SIGINT from Ctrl+C being one). setsid is a better way of doing this: it puts the process into a new session (and hence a new process group), so no group-wide signal will reach the child process.
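      The difference can be observed from Python. The following is only a minimal sketch, not code from the plugin: the helper name spawn_and_compare is made up for illustration, and it relies on the standard subprocess option start_new_session, which makes the child call setsid() before exec. It reports whether a freshly started child still shares the parent's process group.

```python
import os
import subprocess
import sys


def spawn_and_compare(new_session):
    """Start a short-lived child; return True if it shares our process group.

    Hypothetical helper for illustration only.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(5)"],
        # True asks the child to call setsid() before exec, like setsid(1).
        start_new_session=new_session,
    )
    try:
        # A group-wide signal such as the SIGINT from Ctrl+C reaches the
        # child only if its process group is the same as ours.
        return os.getpgid(child.pid) == os.getpgrp()
    finally:
        child.kill()
        child.wait()
```

      Under these assumptions, spawn_and_compare(False) corresponds to the nohup case (same group, so Ctrl+C still reaches the child), and spawn_and_compare(True) corresponds to the setsid case.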

      See the Wikipedia process group page (http://en.wikipedia.org/wiki/Process_group) for the interaction of signals, process groups, and sessions.

          [JENKINS-25503] Use setsid instead of nohup

          Kohsuke Kawaguchi created issue -
          Kohsuke Kawaguchi made changes -
          Priority Original: Minor [ 4 ] New: Major [ 3 ]

          Kohsuke Kawaguchi added a comment -

          setsid(1) doesn't appear to be available on Solaris (source) or FreeBSD (source).

          My proposal would be to write an equivalent command in Python.
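          For readers who want a feel for what such an equivalent might involve, here is a rough sketch. It is not the script referred to in this issue; the function name run_in_new_session is hypothetical, and the exit status 127 for a failed exec is borrowed from shell convention as an assumption. The fork matters because setsid() fails when the caller is already a process group leader, which a freshly forked child never is.

```python
import os
import sys


def run_in_new_session(argv):
    """Fork, move the child into a new session, and exec argv there.

    Hypothetical helper for illustration; returns the child's PID.
    """
    pid = os.fork()
    if pid == 0:
        # Child: a freshly forked process is never a group leader,
        # so setsid() is allowed here.
        try:
            os.setsid()
            os.execvp(argv[0], argv)
        finally:
            # Reached only if setsid() or exec failed; never return
            # into the parent's code path from the child.
            os._exit(127)
    return pid
```

          A command-line version would pass sys.argv[1:] to this function. After the child has called setsid(), its session ID equals its own PID, so signals aimed at the launcher's process group no longer reach it.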
          Kohsuke Kawaguchi made changes -
          Description: text unchanged from the description above, except that {{setsid}} now links to https://gist.github.com/kohsuke/2ed6558d3c4d1f129837

          Kohsuke Kawaguchi added a comment -

          Here's the functioning Python version.

          Kohsuke Kawaguchi added a comment - - edited

          This would also be a good place to do what we do with the shell today: redirect stdout/stderr to a file, wait for the child process to complete, then record the exit code.

          This lets us intercept signals such as SIGINT and SIGTERM, so that we can forward the signal to the child process and also record the fact in the exit code file.

          Our current shell script wrapper doesn't do that, so if the shell is killed (as with the Ctrl+C that got me into this), we fail to capture what killed it, and instead we get a non-descriptive -1 (from BourneShellScript):

          int _pid = pid(workspace);
          if (_pid > 0 && !ProcessLiveness.isAlive(workspace.getChannel(), _pid)) {
              return -1; // arbitrary code to distinguish from 0 (success) and 1+ (observed failure)
          }
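          The wrapper behavior described in this comment could look roughly like the sketch below. This is an illustration under assumptions, not the plugin's code: the name run_and_record, its parameters, and the choice to record death by signal N as 128+N (the shell convention) are all made up here. It must be called from the main thread, since Python only allows signal handlers to be installed there.

```python
import signal
import subprocess


def run_and_record(argv, log_path, result_path):
    """Run argv with output sent to log_path; write its status to result_path.

    Hypothetical helper for illustration only.
    """
    with open(log_path, "wb") as log:
        child = subprocess.Popen(
            argv, stdout=log, stderr=subprocess.STDOUT, start_new_session=True
        )

        def forward(signum, _frame):
            # Pass the signal on to the child instead of dying silently.
            child.send_signal(signum)

        watched = (signal.SIGINT, signal.SIGTERM)
        previous = {s: signal.signal(s, forward) for s in watched}
        try:
            code = child.wait()
        finally:
            # Restore whatever handlers were installed before.
            for s, handler in previous.items():
                signal.signal(s, handler)

    # Popen reports death by signal N as -N; record it as 128+N
    # (an assumed convention) so the result file says what killed it.
    status = 128 - code if code < 0 else code
    with open(result_path, "w") as out:
        out.write(str(status))
    return status
```

          With something along these lines, a Ctrl+C would show up in the result file as 130 (128 + SIGINT) rather than as the arbitrary -1 above.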
          


          Jesse Glick added a comment -

          If this saves durable tasks running on master or via CommandLauncher from e.g. Ctrl-C then great. My main concern is availability across POSIXish platforms. Does Mac OS X have it? /usr/bin?

          Jesse Glick made changes -
          Labels New: workflow
          Kanstantsin Shautsou made changes -
          Link New: This issue is related to JENKINS-25848 [ JENKINS-25848 ]
          Yoann Dubreuil made changes -
          Link New: This issue is related to JENKINS-27617 [ JENKINS-27617 ]

            carroll Carroll Chiou
            kohsuke Kohsuke Kawaguchi