Jenkins / JENKINS-47791

Eliminate ProcessLiveness


ProcessLiveness, along with PID tracking, was introduced to ensure that a sh step would terminate if the controller process died, for example if the computer were rebooted; otherwise the step would just sit there indefinitely, waiting for output or an exit status that will never come.

In practice this code has proven to be a major source of reliability issues. Prior to Java 9 there is no standard API for checking whether a given process exists, so the code uses JNA. Or tries to: it has a hard time being sure whether getpgid is actually supported, so it tries to detect that on every new node and cache the answer. These calls also do not work inside withDockerContainer, since the container may remap process IDs (the $$ seen from the wrapper script is not necessarily meaningful from the agent JVM). So the code also has to detect decorated Launcher implementations and fall back to a different implementation based on command-line ps calls, which is not entirely portable and has also had trouble responding cleanly to laggy or hung remoting channels.
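
For contrast, the standard process check that only became available in Java 9 looks roughly like the sketch below. The class and method names (PidCheck, isProcessAlive) are made up for illustration and are not part of the plugin. Note that this would presumably still not help with the container case, since a remapped PID means nothing to the agent JVM.

```java
// Hypothetical helper, not plugin code: requires Java 9 or newer.
class PidCheck {

    /** Returns true if the JVM can see a live process with the given PID. */
    static boolean isProcessAlive(long pid) {
        // ProcessHandle.of returns an empty Optional when no such process is visible.
        return ProcessHandle.of(pid)
                .map(ProcessHandle::isAlive)
                .orElse(false);
    }

    public static void main(String[] args) {
        long self = ProcessHandle.current().pid();
        System.out.println("PID " + self + " alive: " + isProcessAlive(self));
    }
}
```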

      Better to throw out this approach and start over. It seems to work to just have the wrapper script itself indicate that it is still alive, for example by touching the log file even when there is no new output. Then the agent JVM need do nothing more exotic than a file timestamp check.
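
A minimal sketch of what the agent-side timestamp check could look like, assuming the wrapper script touches the log file at some regular interval. The names here (LogHeartbeat, isAlive) and the timeout value are hypothetical, not taken from the plugin; a real implementation would also need to pick a timeout comfortably longer than the touch interval.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not plugin code.
class LogHeartbeat {

    /**
     * Returns true if logFile was last modified no more than timeout before now.
     * The caller supplies "now" so the check is easy to test.
     */
    static boolean isAlive(Path logFile, Duration timeout, Instant now) throws IOException {
        Instant lastTouched = Files.getLastModifiedTime(logFile).toInstant();
        return Duration.between(lastTouched, now).compareTo(timeout) <= 0;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the step's log file; freshly created, so it counts as alive.
        Path log = Files.createTempFile("durable-task", ".log");
        try {
            // Assumed timeout of 30 seconds, chosen only for illustration.
            System.out.println("alive: " + isAlive(log, Duration.ofSeconds(30), Instant.now()));
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

The point of the design is that nothing here depends on PIDs, JNA, or ps: the only thing the agent needs is access to the file's modification time.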

            jglick Jesse Glick