Jenkins no longer kills running processes after job fails

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: core
    • Environment: Jenkins 1.553 or later on CentOS 6.3 with Oracle Java 7 JDK

      Starting at version 1.553, Jenkins no longer seems to kill running processes after a build failure.

      We have several jobs that start a Tomcat instance and run various end-to-end tests. If the build fails, Jenkins doesn't execute the shutdown scripts, so we rely on the process killer to clean up the Tomcat instance.

      This can be reproduced more easily by creating a freestyle job with two shell build steps: the first runs a simple command such as "nohup sleep 10000 &" and the second runs "/bin/false". After the job exits, the sleep process is still running. Prior to version 1.553, it would have been killed.

      There are no log messages to indicate a problem.

      I can reproduce this on CentOS 6, Red Hat EL 5 and Red Hat EL 4, both with a job running on the local master, and on a slave node. Also tested with both 32-bit and 64-bit Oracle Java 7 JDKs.

      We're using the built-in Winstone container.

          [JENKINS-22641] Jenkins no longer kills running processes after job fails

          Andrei Neculau added a comment -

          I can confirm this issue with 1.558 on Debian 7.4 and Ubuntu 12.04 with OpenJDK and Oracle JDK 6 and 7, on both master and slave.

          I can also confirm that downgrading to 1.552 solved it (I didn't have time to bisect the intermediate versions).

          Thanks for reporting the issue! I thought I was misunderstanding the ProcessTreeKiller feature.

          Todd Perry added a comment -

          An update: apparently it doesn't matter whether the job fails; a failure is not necessary to reproduce the problem. It is only critical to us for failed jobs, since a failed job doesn't run the final step, which shuts down the running servlet container.

          Maciej Pasternacki added a comment - - edited

          I can confirm: this is affecting me too, with both succeeding and failing jobs. One of my jobs has been relying on the process reaping mechanism to clean up (it started a redis-server daemon before the build and relied on Jenkins killing it afterwards). That was working fine on Jenkins 1.554 and stopped working after the upgrade to 1.558; the issue still occurs on 1.559. Ubuntu 12.04, x86_64, Oracle JDK 7.

          Daniel Beck added a comment - - edited

          I was able to reproduce the issue in 1.553, and 1.552 wasn't affected. So...

          $ git bisect bad
          4c32649db54a4a4f6162793179143fb12ae9e521 is the first bad commit
          commit 4c32649db54a4a4f6162793179143fb12ae9e521
          Author: christ66 <...>
          Date:   Thu Feb 20 21:50:14 2014 -0800
          
              Upgrade to commons-io 2.4
              
              In order to maintain backwards compatibility we need to keep IOUtils the
              same as in commons-io 1.4. This code is backwards compatible, however
              most of the methods have been deprecated and should instead use the
              org.apache.commons.io.IOUtils version instead.
          
          :040000 040000 d42c82833d3b7661a5fdd449e8b436817079b889 4c49b27df1a7adcbc2c58ea9d15535936916fdaa M	core
          :100644 100644 3c88f96220b54a8690cea0d272cd4447bfa98ffb cebccfcc1a20211b732fe96aaf90725ca0194fc8 M	pom.xml
          $ git bisect log
          git bisect start
          # bad: [1f28286ee683470a0b0a15b9425046292cc7d2a5] [maven-release-plugin] prepare release jenkins-1.553
          git bisect bad 1f28286ee683470a0b0a15b9425046292cc7d2a5
          # good: [0738314cb054f287cb1c66232672a57140987f2b] [maven-release-plugin] prepare release jenkins-1.552
          git bisect good 0738314cb054f287cb1c66232672a57140987f2b
          # good: [b09f828706c9a16b8acc4158f2161117f9359047] [JENKINS-21159] Noting #663 merge.
          git bisect good b09f828706c9a16b8acc4158f2161117f9359047
          # bad: [5d91ed77f7402133fd7c9e77b1a6676c6695ec5a] Merge branch 'JENKINS-20965' of github.com:christ66/jenkins
          git bisect bad 5d91ed77f7402133fd7c9e77b1a6676c6695ec5a
          # bad: [8482756860d8cb838e20f7007c241aa77f2524b7] typo in changelog
          git bisect bad 8482756860d8cb838e20f7007c241aa77f2524b7
          # good: [1b2ac716ba14945fd0654376d44a35feef62d531] Slave started from JNLP can now install itself as systemd service.
          git bisect good 1b2ac716ba14945fd0654376d44a35feef62d531
          # bad: [3b472e5b867a4664f2cd40633cb197d09f470a16] Merge pull request #1135 from christ66/commons-io-up
          git bisect bad 3b472e5b867a4664f2cd40633cb197d09f470a16

          How I reproduced it:

          Built and ran Jenkins on OS X master: mvn -DskipTests=true clean verify
          Added an SSH slave (Linux CentOS 6)
          Created a freestyle job. Two shell build steps: one nohup sleep 10000 &, the other /bin/false.
          After every build, ps aux | grep sleep on the slave. If it's there, kill it and git bisect bad on the master, else git bisect good. Repeat.
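The check in the last step can be sketched as a small helper. This is only an illustration, not the procedure actually used above: it assumes the first build step is changed to record the PID of the background process (for example `nohup sleep 10000 & echo $! > /tmp/leftover.pid`, where the file path is hypothetical), and the function name `bisect_verdict` is made up for this sketch.

```shell
# Hypothetical helper: prints "bad" if the process with the given PID is
# still alive (Jenkins did not kill it), otherwise prints "good".
# "kill -0" sends no signal; it only tests whether the PID can be signalled.
bisect_verdict() {
    if kill -0 "$1" 2>/dev/null; then
        echo bad
    else
        echo good
    fi
}
```

Run on the node that executed the build, `bisect_verdict "$(cat /tmp/leftover.pid)"` would give the word to pass to `git bisect` on the master. A leftover process still has to be killed by hand before the next round.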

          Jan Řezníček added a comment -

          I can report the same issue. The problem did not occur in version 1.556, but after the upgrade to version 1.563 it occurs for both succeeding and failing jobs.

          Daniel Beck added a comment -

          I'm trying to find out the cause for the different versions reported here. It would help if you report all of the following in an unambiguous format:

          • last known good Jenkins version
          • first known broken Jenkins version
          • OS of the master and whether this issue appears with jobs executed there
          • OS of all relevant slaves (i.e. slaves the issue appears, or is known not to appear on), and how they're started (e.g. JNLP, SSH slaves)
          • Whether anything is reported in the logs
          • The simplest reproducible job the issue appears with. Do two shell build steps, one running nohup sleep 10000 & and the other /bin/false, reproduce it? If so, on which nodes? More or fewer than with your real issue?

          Szilard P added a comment -

          We may be experiencing the same bug in our Jenkins setup. I have seen orphaned processes left behind by Jenkins before, but now it has become a plague.

          • last known good Jenkins version:
            no idea
          • first known broken Jenkins version:
            1.554.1
          • OS of the master and whether this issue appears with jobs executed there:
            Ubuntu 12.04.4 LTS, does not run jobs
          • OS of all relevant slaves (i.e. slaves the issue appears, or is known not to appear on), and how they're started (e.g. JNLP, SSH slaves)
            Ubuntu 10.04, 12.04, 13.10; OS X 10.7 (SSH slaves)
          • Whether anything is reported in the logs
            could not find anything relevant

          Note that I have the feeling this only happens to aborted jobs, but I can't confirm it ATM.

          Steven Christou added a comment -

          I was able to reproduce this issue locally and I am working on a fix.

          Steven Christou added a comment -

          I created a pull request to resolve this issue: https://github.com/jenkinsci/jenkins/pull/1322

          Steven Christou added a comment -

          This looks to be a regression in the commons-io library. I logged IO-453 to track the change on the commons-io side.

          Daniel Beck added a comment -

          schristou: I merged your fix into a private build of 1.554.3 and verified this fixes the issue as described in my earlier comment from 09/May/14 11:42 PM. Unpatched 1.554.3 doesn't kill 'sleep' on the slave, patched does.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: christ66
          Path:
          core/src/main/java/hudson/util/ProcessTree.java
          http://jenkins-ci.org/commit/jenkins/410f06adfa798d29118c77ed01c5c02fc207cb02
          Log:
          [FIXED JENKINS-22641] FileUtils.readFileToByteArray behavior has changed in the latest version of commons-io.
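The commit log above only says that FileUtils.readFileToByteArray changed behavior. A likely explanation, offered here as an assumption rather than something stated in this thread: the newer commons-io sizes the read from the length the filesystem reports, and Linux reports a length of 0 for files under /proc, which ProcessTree presumably reads to inspect running processes. The Linux-only sketch below compares the reported size of a file with the number of bytes actually readable from it. The function names are hypothetical, and GNU stat is assumed.

```shell
# Size according to file metadata (what a length-based read would rely on).
# Assumes GNU stat; "-c %s" prints the size in bytes.
reported_size() {
    stat -c %s "$1"
}

# Number of bytes obtained by reading the file until end-of-file.
# Piping through cat forces a real read instead of a metadata lookup.
streamed_size() {
    cat "$1" | wc -c | tr -d ' '
}
```

On a typical Linux system, `reported_size /proc/self/environ` prints 0, while `streamed_size /proc/self/environ` normally prints a larger number. A reader that trusts the reported length would therefore see an empty file, which would be consistent with the silent failure described in this issue.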

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: christ66
          Path:
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/0d4c0cb6b274bfa18c810fcf761e3ad8b27ceb34
          Log:
          Add test unit for JENKINS-22641

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: christ66
          Path:
          core/src/main/java/hudson/util/ProcessTree.java
          http://jenkins-ci.org/commit/jenkins/9ac68b92dcc0bc093c7983cdb0aab72342165df4
          Log:
          Merge branch 'JENKINS-22641' of github.com:christ66/jenkins into JENKINS-22641

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          core/src/main/java/hudson/util/ProcessTree.java
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/f03e67a49bf01343635c60e894b9617483a03011
          Log:
          Merge branch 'JENKINS-22641' of github.com:christ66/jenkins

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          changelog.html
          http://jenkins-ci.org/commit/jenkins/14a147eda8eebfd78c1cc53e1d59232d720de6b7
          Log:
          JENKINS-22641 Noting merge of #1322.

          Compare: https://github.com/jenkinsci/jenkins/compare/a8756c6c0ddc...14a147eda8ee

          dogfood added a comment -

          Integrated in jenkins_main_trunk #3538
          [FIXED JENKINS-22641] FileUtils.readFileToByteArray behavior has changed in the latest version of commons-io. (Revision 410f06adfa798d29118c77ed01c5c02fc207cb02)
          Add test unit for JENKINS-22641 (Revision 0d4c0cb6b274bfa18c810fcf761e3ad8b27ceb34)
          JENKINS-22641 Noting merge of #1322. (Revision 14a147eda8eebfd78c1cc53e1d59232d720de6b7)

          Result = SUCCESS
          schristou88 : 410f06adfa798d29118c77ed01c5c02fc207cb02
          Files :

          • core/src/main/java/hudson/util/ProcessTree.java

          schristou88 : 0d4c0cb6b274bfa18c810fcf761e3ad8b27ceb34
          Files :

          • test/src/test/java/hudson/util/ProcessTreeKillerTest.java

          Jesse Glick : 14a147eda8eebfd78c1cc53e1d59232d720de6b7
          Files :

          • changelog.html

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/0f3574bbc109935e37bbd473c2a9e7b7625e3ced
          Log:
          Merge branch 'JENKINS-22641' of github.com:christ66/jenkins

          Compare: https://github.com/jenkinsci/jenkins/compare/37c5bc2fd861...0f3574bbc109

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          aggregator/src/test/java/org/jenkinsci/plugins/workflow/steps/durable_task/ShellStepTest.java
          http://jenkins-ci.org/commit/workflow-plugin/e74fd8349c15568354fc3934010713503d22761f
          Log:
          added a test bug this is blocked by JENKINS-22641

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: christ66
          Path:
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/86c44d8f5fa7d98246bda76d9fb22fd60a4f8530
          Log:
          Add test unit for JENKINS-22641

          (cherry picked from commit 0d4c0cb6b274bfa18c810fcf761e3ad8b27ceb34)

          Don Bogardus added a comment -

          This issue has returned as of version 1.587 (through current version - 1.593)

          It was fixed in 1.565.3 through 1.586

          Same behavior as original bug, leaves processes running when job is killed.

          Jesse Glick added a comment -

          dbogardus then your issue is probably a distinct bug with a similar symptom but potentially distinct preconditions and root cause. Better to file it as a new ticket, with any steps to reproduce you can muster, and mark it as “blocking” this one.

          Don Bogardus added a comment -

          Created JENKINS-26048 for similar bug returning in 1.587.

          Dilip Mahadevappa added a comment -

          Also seen in Jenkins ver. 1.580.1
          Project type: Maven

          Daniel Beck added a comment -

          Dilip M: That's probably JENKINS-26048. Maven project type works differently from freestyle.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          pom.xml
          src/test/java/org/jenkinsci/plugins/durabletask/BourneShellScriptTest.java
          http://jenkins-ci.org/commit/durable-task-plugin/fc48f447763cb69d6ebf7c67e54e476d9e903bbd
          Log:
          Added test for stop.
          JENKINS-22641 means that this does not work in 1.554.3, so need to bump up the dependency to 1.565.3.

            Assignee: Steven Christou (schristou)
            Reporter: Todd Perry (toadnik17)
            Votes: 12
            Watchers: 23