[JENKINS-42166] ProcessLiveness.workingLaunchers heuristic is flaky

      After running the docker-workflow demo recently, I saw numerous warnings:

      ... org.jenkinsci.plugins.durabletask.ProcessLiveness isAlive
      WARNING: hudson.Launcher$LocalLauncher@... on hudson.remoting.LocalChannel@... does not seem able to determine whether processes are alive or not
      

      I suspect that I had simply gotten to the point of having launched >10k processes in my current OS session, and so _isAlive(..., 9999, ...) correctly returned true.
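The suspected failure mode can be illustrated with a minimal sketch. This is not the plugin's actual code: the class `LauncherSelfCheck`, the method `looksWorking`, and the use of `LongPredicate` as a stand-in for the liveness probe are all hypothetical, and only the fake PID 9999 comes from the report above. The idea is that a launcher is considered working only if the probe reports the fake PID as dead, so the check misfires as soon as that PID really exists.

```java
import java.util.function.LongPredicate;

// Hypothetical illustration of a "fake PID" self-check; not the durable-task implementation.
class LauncherSelfCheck {
    // The fake PID mentioned in the issue description.
    static final long FAKE_PID = 9999;

    /**
     * Treats the probe as working only if it reports the supposedly
     * nonexistent PID as dead. If a real process happens to own that PID,
     * a perfectly correct probe is misclassified as broken.
     */
    static boolean looksWorking(LongPredicate isAlive, long fakePid) {
        return !isAlive.test(fakePid);
    }
}
```

For example, with `Set<Long> live = Set.of(1L, 9999L)`, the call `LauncherSelfCheck.looksWorking(pid -> live.contains(pid), LauncherSelfCheck.FAKE_PID)` returns false even though the probe answers correctly, which matches the warning being logged on a long-running OS session.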

      On the one hand, this fake PID seems much too low; on the other, I am not sure how high a valid pid_t might be. And turning off workingLaunchers entirely would mean either always trusting Liveness on nondecorated launchers, which could cause big problems if libc is not loadable properly, or never trusting it, which means a dead controller process (including after a reboot) would not be detected.

      Or we could always use the ps trick, even on supposedly local launchers. This means checking extra carefully that the command is POSIX-compliant.


          Jesse Glick added a comment -

          This tip suggests ps -p $PID, and there are nearby tips for Windows which could help with JENKINS-25053.
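A rough sketch of how the `ps -p $PID` check could be driven from Java follows. The class and method names (`PsLiveness`, `psCommand`, `isAlive`) are hypothetical and are not taken from the plugin. It assumes the local `ps` accepts `-p` and exits with a nonzero status when no process matches, which may not hold for every implementation (the BusyBox usage text quoted later in this thread lists only `-o`).

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper; a sketch of the `ps -p $PID` tip, not the plugin's code.
class PsLiveness {
    /** Builds the command from the tip: ps -p <pid>. */
    static List<String> psCommand(long pid) {
        return List.of("ps", "-p", Long.toString(pid));
    }

    /**
     * Runs the command and treats exit status 0 as "alive".
     * Assumption: ps exits nonzero when no process matches the given PID.
     */
    static boolean isAlive(long pid) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(psCommand(pid))
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // output is not needed
                .start();
        return p.waitFor() == 0;
    }
}
```

Relying only on the exit status keeps the check independent of the output format, but it also means a `ps` that ignores or misparses its arguments would make every PID look alive.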


          Pavel Georgiev added a comment -

          Recently I started seeing this warning every second. I'm on Linux running Jenkins 2.32.3. Any ideas what is wrong? I tried restarting Jenkins, but it made no difference. It's really annoying, as I cannot see the important messages.

          Mar 06, 2017 3:24:29 PM WARNING org.jenkinsci.plugins.durabletask.ProcessLiveness
          hudson.Launcher$RemoteLauncher@7b1f9078 on hudson.remoting.Channel@509f76f6:slave1 does not seem able to determine whether processes are alive or not
          Mar 06, 2017 3:24:30 PM WARNING org.jenkinsci.plugins.durabletask.ProcessLiveness
          hudson.Launcher$RemoteLauncher@301b1441 on hudson.remoting.Channel@509f76f6:slave1 does not seem able to determine whether processes are alive or not

          Carlos Sanchez added a comment - - edited

          Related problems occur when running in BusyBox or Alpine (e.g. the Docker image jenkinsci/jnlp-slave:alpine): ps -o pid=9999 always succeeds.

          The Docker image may not even have ps, as shown in JENKINS-43881.


          Carlos Sanchez added a comment -

          Example:

          docker run -ti --rm --entrypoint bash jenkinsci/jnlp-slave:alpine -c "ps -o pid=9999"; echo $?
          9999
                 1
                  0

          ps --help
          BusyBox v1.25.1 (2016-10-26 16:15:20 GMT) multi-call binary.
          Usage: ps [-o COL1,COL2=HEADER]
          Show list of processes
          -o COL1,COL2=HEADER    Select columns for display

          Jesse Glick added a comment -

          Regarding Docker containers, see JENKINS-40101—a known bug.


          Jesse Glick added a comment -

          Obsolete as of JENKINS-47791.


            Assignee: Unassigned
            Reporter: Jesse Glick
            Votes: 7
            Watchers: 9