[JENKINS-28968] Aborting builds does not kill surefire sub-process

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: core, maven-plugin

      I have a test that (unfortunately) occasionally hangs waiting on an external dependency. I recently noticed that if the test is aborted, the surefire instance remains running on the slave machine!

      This does not happen when running "sleep 50000" in a command window (i.e. this is killed with the job).

          Arnaud Héritier added a comment -

          I think tibor17 worked on such issues in surefire recently.

          On Windows we have this bug, https://issues.apache.org/jira/browse/SUREFIRE-1261, but I think I saw several issues related to surefirebooter too.

          On the Jenkins side, I never succeeded in reproducing or proving it, but I'm fairly sure that this happens when there is a slave disconnection (especially with old remoting).

          Don Bogardus added a comment - - edited

          Just to make sure this is part of the discussion: the problem has come and gone across Jenkins versions. I installed and tested many versions of Jenkins:

          First noticed in 1.553

          Fixed in version 1.565.2 (bug: https://issues.jenkins-ci.org/browse/JENKINS-22641)

          Came back in 1.587, and is still present in weekly 2.66.1

          (Tested on both master and slaves, with the same results.)

          All of these tests were for the Maven/surefire scenario. I tried a few things to reproduce the bug outside of this scenario and was not able to.

           

          jchatham added a comment -

          This issue is probably related to JENKINS-28125, and I have encountered similar problems with non-surefire sub-processes not dying when a build is stopped.

          boris ivan added a comment -

          Hoping this can get some attention. We have an extensive test suite that does some destructive things while initializing a large test topology. If I realize that I need to abort the run, it would be great if it really aborted. Instead, Jenkins reports it as aborted, but the entire suite is still running in the background, and I need to scramble to remember the IP address and credentials for the slave agent so that I can connect to it via Remote Desktop, launch Task Manager, and kill all Java processes.

          This bug has existed for years.
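
The manual cleanup described above (log on to the agent, find the leftover Java processes, kill them) can be scripted. What follows is only a minimal sketch of such a workaround, not a fix for the underlying bug. It assumes Java 9 or later on the agent (for `ProcessHandle`), that the command line of the forked test JVM contains the string `surefirebooter`, and that the platform exposes the command lines of other processes at all, which is not guaranteed (on Windows in particular `commandLine()` is often empty). The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class LeftoverForkFinder {
    // Assumption: the forked surefire JVM is started from a jar whose name contains this marker.
    static final String MARKER = "surefirebooter";

    /** True if the (possibly unavailable) command line contains the marker. */
    public static boolean looksLikeSurefireFork(Optional<String> commandLine) {
        return commandLine.map(c -> c.contains(MARKER)).orElse(false);
    }

    /** Lists all visible processes, other than this one, whose command line matches. */
    public static List<ProcessHandle> findLeftoverForks() {
        long self = ProcessHandle.current().pid();
        return ProcessHandle.allProcesses()
                .filter(p -> p.pid() != self)
                .filter(p -> looksLikeSurefireFork(p.info().commandLine()))
                .collect(Collectors.toList());
    }

    /** Forcibly terminates the descendants of a process, then the process itself. */
    public static void killTree(ProcessHandle root) {
        root.descendants().forEach(ProcessHandle::destroyForcibly);
        root.destroyForcibly();
    }

    public static void main(String[] args) {
        // Without --kill the tool only reports what it found.
        boolean kill = args.length > 0 && args[0].equals("--kill");
        for (ProcessHandle p : findLeftoverForks()) {
            System.out.println(p.pid() + " " + p.info().commandLine().orElse("?"));
            if (kill) {
                killTree(p);
            }
        }
    }
}
```

Run without arguments it only lists candidate processes; passing `--kill` terminates them together with their children. Since it matches every surefire fork on the machine, it would also kill forks that belong to builds still running legitimately on the same agent.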

          Oleg Nenashev added a comment -

          borisivan Which Java do you use on the agent? If you use a 32-bit Java on a 64-bit platform, this is expected behavior. There are other cases, such as Cygwin binaries, where it is also expected.
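
To check which case applies on an agent, the bitness can be read from the JVM itself. A small sketch, assuming a HotSpot/OpenJDK-style JVM that sets the `sun.arch.data.model` property (it is not part of the Java specification), and using the Windows `PROCESSOR_ARCHITEW6432` environment variable, which is set only for 32-bit processes running on 64-bit Windows. The class name is illustrative.

```java
public class JvmBitness {
    /**
     * Summarises JVM bitness from values normally read from system
     * properties and the environment; any of them may be null.
     */
    public static String describe(String dataModel, String osArch, String wow64Arch) {
        boolean known = dataModel != null && !dataModel.equals("unknown");
        String jvm = known ? dataModel + "-bit" : "unknown";
        StringBuilder sb = new StringBuilder("JVM data model: " + jvm + ", os.arch: " + osArch);
        // PROCESSOR_ARCHITEW6432 is only present for 32-bit processes on 64-bit Windows.
        if ("32".equals(dataModel) && wow64Arch != null) {
            sb.append(" (32-bit JVM on 64-bit Windows, native arch ").append(wow64Arch).append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(
                System.getProperty("sun.arch.data.model"),
                System.getProperty("os.arch"),
                System.getenv("PROCESSOR_ARCHITEW6432")));
    }
}
```

It needs to be run with the same `java` binary that launches the agent, and again with the one used by the Maven job, since the two may differ.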

          boris ivan added a comment -

          It's 64-bit Windows (I have seen this on Windows 7 and Windows 10), with 64-bit Java run from a 64-bit PowerShell window to load slave.jar from the command line.

          As far as the Maven job goes, I think that starts with a 64-bit version of Java too, but I will try to make sure. The Jenkins slave agent itself is definitely being loaded via 64-bit Java.

          Martin Gerdes added a comment -

          We have this problem too.

          For some as yet unresolved reason, we have surefire and mavenInstallation processes which never finish (developers are still trying to determine the cause).

          Because of that, this bug is pretty terrible for us: when developers stop a job in Jenkins, the surefire and mavenInstallation processes remain, consuming memory and CPU until the server becomes unresponsive or OOM events occur.

          Environment:

          Jenkins ver. 2.73.1 running in a docker instance (jenkins/jenkins:lts, which uses Debian Version 9.1)
          Used Java:
            surefire: Java SE Development Kit 7u80 (installed from within Jenkins)
            mavenInstallation: openjdk-8-jdk:amd64           8u141-b15-1~deb9u1 (system wide java installation in the docker container)

          Oleg Nenashev added a comment -

          OK, so it happens in Linux as well. Interesting...

          Martin Gerdes added a comment -

          It also definitely is not a case of mixing 32 and 64bit java versions:

          /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -version
          openjdk version "1.8.0_141"
          OpenJDK Runtime Environment (build 1.8.0_141-8u141-b15-1~deb9u1-b15)
          OpenJDK 64-Bit Server VM (build 25.141-b15, mixed mode)

          /var/jenkins_home/tools/hudson.model.JDK/JDK-7/bin/java -version
          java version "1.7.0_80"
          Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
          Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

          Could it be because we are mixing java 7 and java 8 here?

          Any other ideas of what to change to avoid being affected by this bug?

          Nirmit Srivastava added a comment - - edited

          Is there any solution to the above problem? We are facing a similar issue where the surefire booter process keeps running on a Linux slave machine.

          Jenkins version being used: Jenkins ver. 2.136

            Assignee: Unassigned
            Reporter: Ryan Desmond (rddesmond)
            Votes: 6
            Watchers: 16