JENKINS-42166

ProcessLiveness.workingLaunchers heuristic is flaky


      After running the docker-workflow demo recently, I saw numerous warnings:

      ... org.jenkinsci.plugins.durabletask.ProcessLiveness isAlive
      WARNING: hudson.Launcher$LocalLauncher@... on hudson.remoting.LocalChannel@... does not seem able to determine whether processes are alive or not

      I suspect that I had simply launched more than 10k processes in my current OS session, so that PID 9999 was really in use and _isAlive(..., 9999, ...) correctly returned true.
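
To illustrate the suspected failure mode, here is a minimal sketch in Java. It is not the plugin's code: `FakePidProbe`, `looksWorking`, and `livenessOver` are hypothetical names, and the process table is simulated with a plain set. It only assumes that the heuristic trusts a launcher when the fake PID is reported as dead.

```java
import java.util.Set;
import java.util.function.LongPredicate;

// Hypothetical stand-in for the probe described above; not the plugin's actual code.
class FakePidProbe {
    // The fake PID mentioned in the report.
    static final long FAKE_PID = 9999;

    // Assumed heuristic: trust the liveness check only if it says the fake PID is dead.
    static boolean looksWorking(LongPredicate isAlive) {
        return !isAlive.test(FAKE_PID);
    }

    // A perfectly correct liveness check over a simulated process table.
    static LongPredicate livenessOver(Set<Long> runningPids) {
        return pid -> runningPids.contains(pid);
    }

    public static void main(String[] args) {
        // Fresh session: nothing runs as PID 9999, so the probe passes.
        System.out.println(looksWorking(livenessOver(Set.of(1L, 4242L))));
        // Busy session: a real process holds PID 9999, so the same correct
        // check is rejected and the warning would be logged.
        System.out.println(looksWorking(livenessOver(Set.of(1L, 9999L))));
    }
}
```

In the second case the liveness check is right and the heuristic is wrong, which would match the warnings above.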

      On the one hand, this fake PID seems much too low; on the other, I am not sure how high a valid pid_t might be. And turning off workingLaunchers entirely would mean either always trusting Liveness on nondecorated launchers, which could cause big problems if libc is not loadable properly, or never trusting it, which means a dead controller process (including after a reboot) will not be detected.

      Or we could always use the ps trick, even on supposedly local launchers. This means checking extra carefully that the command is POSIX-compliant.
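
For concreteness, a ps-based check might look roughly like the Java sketch below. This is a sketch under assumptions, not what the plugin does: `PsLiveness` and its methods are hypothetical names, and it runs `ps` through `ProcessBuilder` rather than a Jenkins `Launcher`. It relies only on the `-o` and `-p` options, which POSIX specifies for `ps`, and it compares the printed PID instead of trusting the exit status alone, since exit codes for "no such process" may differ between implementations.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper; not part of durable-task or Jenkins core.
class PsLiveness {
    // "pid=" prints only the PID column, with an empty header.
    static List<String> psCommand(long pid) {
        return List.of("ps", "-o", "pid=", "-p", Long.toString(pid));
    }

    // The process counts as alive only if ps echoed back exactly this PID.
    static boolean reportsAlive(long pid, String stdout) {
        return stdout.trim().equals(Long.toString(pid));
    }

    // Runs ps locally and interprets its output; stderr is discarded.
    static boolean isAlive(long pid) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(psCommand(pid));
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process ps = pb.start();
        String out = new String(ps.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        ps.waitFor();
        return reportsAlive(pid, out);
    }
}
```

For example, `PsLiveness.isAlive(ProcessHandle.current().pid())` would be expected to return true on a system where `ps` is on the PATH.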

      Assignee: Unassigned
      Reporter: Jesse Glick (jglick)