
When a build is aborted, Hudson does not kill all the descendant processes recursively to avoid run-away processes on the Slave Build machines.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: remoting
    • None

      The build job is set up to run "ant.bat -f scm.xml". The Ant batch file sets environment variables and kicks off the Ant build. While the build is running, Abort is clicked and the job aborts normally on the Hudson master, as indicated in the log/console output:
      Build was aborted
      Finished: ABORTED

      However, on the slave machine the Ant/Java process doing the build still runs to completion. The descendant processes that were running on the slave were not killed, so the build was not truly aborted.
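
One common way to make an abort reach every descendant on POSIX systems is to start the build command in its own process group and signal the group as a whole. The following is a minimal Python sketch of that idea, not Hudson's actual implementation. `start_in_new_group` and `abort_group` are hypothetical names, and the approach does not apply to the Windows `ant.bat` case described above, which would need a platform-specific mechanism.

```python
import os
import signal
import subprocess


def start_in_new_group(cmd, **popen_kwargs):
    """Start cmd as the leader of a new session and process group (POSIX only)."""
    return subprocess.Popen(cmd, start_new_session=True, **popen_kwargs)


def abort_group(proc, grace_seconds=2.0):
    """Send SIGTERM to the whole group, then SIGKILL whatever is still left."""
    # start_new_session=True makes the child a group leader, so its
    # process group id equals its pid.
    pgid = proc.pid
    try:
        os.killpg(pgid, signal.SIGTERM)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            pass
        # Sweep up group members that ignored or outlived SIGTERM.
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # no process is left in the group
    proc.wait()  # reap the leader so it does not linger as a zombie


if __name__ == "__main__":
    build = start_in_new_group(["sleep", "30"])  # stand-in for a build command
    abort_group(build)
    print("build exited with", build.returncode)
```

Processes that leave the group (for example by starting their own session) or that run as a different user than the caller would still escape this kind of kill.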

          [JENKINS-6188] When a build is aborted, Hudson does not kill all the descendant processes recursively to avoid run-away processes on the Slave Build machines.

          peter_schuetze added a comment

          I have the same issue with the following configuration:

          Jenkins 1.461 on Windows Server 2008 Enterprise Edition, using the built-in Winstone
          SSH slave plugin 0.21
          Slave running on AIX with IBM Java 1.6
          Process started: wsadmin.sh (scripting environment from WebSphere Application Server 8.0)

          Tully Foote added a comment (edited)

          We've had the same issue using pbuilder for several years. We're running 1.513 now, but we've been seeing this issue for a long time.

          Jobs end up with orphaned process trees like this:

          root     32018  0.0  0.0  53788  1524 ?        S    08:50   0:00 sudo pbuilder execute --basetgz /var/cache/pbuilder/jenkins_tools.precise.am
          root     32019  0.0  0.1  12824  1812 ?        S    08:50   0:00  \_ /bin/bash /usr/sbin/pbuilder execute --basetgz /var/cache/pbuilder/jenki
          root     10649  0.0  0.0   9424  1256 ?        S    08:50   0:00      \_ /bin/bash -ex /runscript doc
          root     10656  0.1  1.3  75456 24148 ?        S    08:50   0:36      |   \_ python /home/rosbuild/hudson/workspace/doc-fuerte-tum/jenkins_sc
          root      9009  0.0  0.0   9768  1692 ?        S    16:55   0:00      |       \_ bash -c source /opt/ros/fuerte/setup.bash && export PYTHONPA
          root      9026  1.1  1.3  61092 22904 ?        S    16:55   0:00      |           \_ /usr/bin/python /usr/local/bin/rosdoc_lite /home/rosbuil
          root      9027 52.0 18.6 350096 327324 ?       D    16:55   0:43      |               \_ doxygen /tmp/tmpwC1LwG
          root      9365  0.0  0.0   4308   352 ?        S    16:56   0:00      \_ sleep 5s
          

          The pbuilder process was originally started as a subprocess of the script run by Jenkins.

          However, you can see that Jenkins is in a separate process tree:

          root       747  0.0  0.0  49948   244 ?        Ss   Apr16   0:01 /usr/sbin/sshd -D
          root     12760  0.0  0.0  92224   372 ?        Ss   Apr30   0:00  \_ sshd: rosbuild [priv]
          rosbuild 12904  0.0  0.0  92408   528 ?        S    Apr30   0:50  |   \_ sshd: rosbuild@notty
          rosbuild 12918  0.0  0.0  12296   228 ?        Ss   Apr30   0:00  |       \_ bash -c cd '/home/rosbuild/hudson' && java  -jar slave.jar
          rosbuild 12919  0.1  4.4 1598760 77928 ?       Sl   Apr30   1:55  |           \_ java -jar slave.jar
          root      9084  0.0  0.2  90164  4052 ?        Ss   16:56   0:00  \_ sshd: root@pts/3    
          root      9281  0.5  0.3  24056  5300 pts/3    Ss   16:56   0:00      \_ -bash
          root      9366  0.0  0.0  18128  1200 pts/3    R+   16:56   0:00          \_ ps auxf
          

          We're running a Linux master and slaves with SSH Slaves plugin 0.23.
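
The two listings can be turned into a small parent-PID table to show why a kill that only walks child processes from the agent JVM would miss the orphaned tree. This is an illustrative sketch only: the parent relationships are read off the `ps auxf` indentation, the parent of the top-level entries is assumed to be PID 1 (the listings have no PPID column), and `descendants` is a hypothetical helper, not Jenkins code.

```python
from collections import defaultdict

# pid -> parent pid, taken from the indentation of the two listings.
# ASSUMPTION: the top-level entries (747 and 32018) have parent 1.
PPID = {
    # sshd -> agent JVM (java -jar slave.jar is 12919)
    747: 1, 12760: 747, 12904: 12760, 12918: 12904, 12919: 12918,
    # orphaned pbuilder tree
    32018: 1, 32019: 32018, 10649: 32019, 9365: 32019,
    10656: 10649, 9009: 10656, 9026: 9009, 9027: 9026,
}


def descendants(ppid_of, root):
    """Return every pid reachable from root by following parent links downward."""
    children = defaultdict(list)
    for pid, ppid in ppid_of.items():
        children[ppid].append(pid)
    found, stack = [], [root]
    while stack:
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)
    return sorted(found)


print(descendants(PPID, 12919))  # walk from the agent JVM
print(descendants(PPID, 32018))  # walk from the orphaned sudo pbuilder process
```

With this table, the walk from 12919 finds nothing, while the doxygen process (9027) is reachable only from 32018. Any cleanup that depends on the parent chain alone has no path from the agent to the processes that are still running.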


            Assignee: Unassigned
            Reporter: John Burrows (jburrows)
            Votes: 15
            Watchers: 10