JENKINS-47791

Eliminate ProcessLiveness

Details

    Description

      ProcessLiveness along with PID tracking was introduced as a way to ensure that a sh step would terminate if the controller process died, for example if the computer were rebooted; otherwise the step would just sit there indefinitely waiting for output or an exit status which will never come.

      In practice this code has proven to be a major source of reliability issues. Prior to Java 9 there is no standard API for checking for the existence of a given process, so the code uses JNA. Or rather it tries to, but it has a hard time being sure whether getpgid is actually supported, so it tries to detect that on every new node and caches the answer. These calls also will not work inside withDockerContainer, since the container may remap process IDs (the $$ seen from the wrapper script is not necessarily meaningful from the agent JVM), so the code additionally has to detect decorated Launcher implementations and fall back to a different implementation based on command-line ps calls, which is not entirely portable and has also had trouble responding cleanly to laggy or hung remoting channels.
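
For reference, the standard API that arrived in Java 9 is ProcessHandle. The following is only a minimal sketch of what a PID existence check looks like with it; the class name PidCheck and the method isRunning are hypothetical and are not part of the plugin. Note that it would still suffer from the remapped-PID problem described above, since the PID has to be meaningful to the JVM doing the lookup.

```java
/** Hypothetical illustration only; not code from durable-task-plugin. */
final class PidCheck {

    /**
     * Returns true if the JVM can see a live process with the given PID.
     * ProcessHandle.of yields an empty Optional when no such process is visible.
     */
    static boolean isRunning(long pid) {
        return ProcessHandle.of(pid)
                .map(ProcessHandle::isAlive)
                .orElse(false);
    }

    public static void main(String[] args) {
        // The current JVM's own PID is a convenient value to try.
        long self = ProcessHandle.current().pid();
        System.out.println("self running: " + isRunning(self));
    }
}
```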

      Better to throw out this approach and start over. It seems to work to just have the wrapper script itself indicate that it is still alive, for example by touching the log file even when there is no new output. Then the agent JVM need do nothing more exotic than a file timestamp check.
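
The agent-side half of that idea can be sketched roughly as follows. This is an assumption-laden illustration, not the code that was merged: the class name HeartbeatCheck, the method isAlive, and the five-minute threshold used in main are all hypothetical. It only shows that a liveness decision can be reduced to comparing the log file's modification time against a maximum allowed age.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration only; not code from durable-task-plugin. */
final class HeartbeatCheck {

    /**
     * Returns true if the log file was modified no longer than maxAge before now.
     * A missing file is treated as "not alive". A timestamp slightly in the
     * future (clock skew) yields a negative age and therefore counts as alive.
     */
    static boolean isAlive(Path log, Duration maxAge, Instant now) throws IOException {
        FileTime lastModified;
        try {
            lastModified = Files.getLastModifiedTime(log);
        } catch (NoSuchFileException e) {
            return false;
        }
        Duration age = Duration.between(lastModified.toInstant(), now);
        return age.compareTo(maxAge) <= 0;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("heartbeat-demo", ".log");
        Duration maxAge = Duration.ofMinutes(5); // assumed threshold, for illustration
        try {
            // A file that was just created counts as a recent heartbeat.
            System.out.println("fresh log alive: " + isAlive(log, maxAge, Instant.now()));

            // Simulate a wrapper script that stopped touching the log ten minutes ago.
            Files.setLastModifiedTime(log, FileTime.from(Instant.now().minus(Duration.ofMinutes(10))));
            System.out.println("stale log alive: " + isAlive(log, maxAge, Instant.now()));
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

Passing now in as a parameter keeps the check easy to test; the threshold would need to be comfortably larger than whatever interval the wrapper script uses to touch the log.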

          Activity

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java
            src/main/java/org/jenkinsci/plugins/durabletask/FileMonitoringTask.java
            src/main/java/org/jenkinsci/plugins/durabletask/ProcessLiveness.java
            src/test/java/org/jenkinsci/plugins/durabletask/BourneShellScriptTest.java
            src/test/java/org/jenkinsci/plugins/durabletask/CentOSFixture.java
            src/test/resources/org/jenkinsci/plugins/durabletask/CentOSFixture/Dockerfile
            http://jenkins-ci.org/commit/durable-task-plugin/a818ef883ff5200a64ca33f4d700d1ecf17b3211
            Log:
            Merge pull request #49 from jglick/heartbeat

            JENKINS-47791 Remove ProcessLiveness and PID tracking in favor of a simple timestamp check on the log file

            Compare: https://github.com/jenkinsci/durable-task-plugin/compare/edef6839f947...a818ef883ff5

            People

              jglick Jesse Glick