  1. Jenkins
  2. JENKINS-39231

WinSW: Automatically terminate runaway processes in Windows services

Details

    • New Feature
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • core

    Description

      In the Jenkins project, many users complain that the slave/agent is "already connected" because they have a runaway slave/agent process. It happens when WinSW gets terminated without executing the process shutdown logic (force kill) or when WinSW fails to terminate the process.

      As part of WinSW 2.0, it would be great to add logic that:

      • records the PID of the created process to disk
      • checks the status of the previously spawned process upon restart
      • terminates the runaway process if required

      This can be done via a WinSW 2 "plugin".
      Issue: https://github.com/kohsuke/winsw/issues/125
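
      The three steps listed in the description can be sketched roughly as follows. This is a minimal illustration in Python, not the WinSW implementation: the function names and the PID file layout are hypothetical, and it only sends a termination request to the recorded PID. A real implementation would also need to guard against PID reuse, for example by verifying that the process actually belongs to the service before killing it.

      ```python
      import os
      import signal
      from pathlib import Path
      from typing import Optional


      def record_pid(pid_file: Path, pid: int) -> None:
          """Step 1: persist the PID of the process the wrapper just spawned."""
          pid_file.write_text(str(pid))


      def read_pid(pid_file: Path) -> Optional[int]:
          """Return the PID left by a previous run, or None if absent or invalid."""
          try:
              pid = int(pid_file.read_text().strip())
          except (FileNotFoundError, ValueError):
              return None
          # Reject non-positive values: on POSIX they address process groups.
          return pid if pid > 0 else None


      def kill_runaway(pid_file: Path) -> bool:
          """Steps 2 and 3: on restart, terminate the previously recorded process.

          Returns True if a termination request was delivered, False otherwise.
          """
          pid = read_pid(pid_file)
          if pid is None:
              return False
          try:
              os.kill(pid, signal.SIGTERM)
              delivered = True
          except OSError:
              # The process is already gone, or we are not allowed to signal it.
              delivered = False
          # Drop the stale record so the next start begins from a clean state.
          pid_file.unlink(missing_ok=True)
          return delivered
      ```

      In this sketch a wrapper would call kill_runaway() once at startup, before spawning the new process, and then record_pid() right after the spawn.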

          Activity

            krogan mark mann added a comment -

            Jenkins master is on 2.32.1.
            Master and slaves are running Win2012.
            The symptoms sound very similar to a problem we've had: a Jenkins slave is up, then we reboot the Windows server (slave).
            When the server returns and the slave is automatically started, it hangs around for about 30 secs, then terminates the connection, which kills our job.
            We've also witnessed the hosting Windows service, WinSW 1.17 (which auto-upgrades to 1.18), bomb out but leave the java process running.
            The java process keeps the slave active to the master for an indeterminate amount of time (anywhere between 20 secs and 2 hrs) before eventually dying of its own accord, with no fresh jobs sent or interaction with the Windows service.

            oleg_nenashev Oleg Nenashev added a comment -

            krogan Smells like another issue.
            Anyway, it makes sense to retest after the WinSW 2 integration. You can install the new Windows service wrapper manually.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            src/main/resources/org/jenkinsci/modules/windows_slave_installer/jenkins-slave.xml
            http://jenkins-ci.org/commit/windows-slave-installer-module/8dcf02da16e7c95c67c7de95fd078089a8ecf8df
            Log:
            JENKINS-39231 - Integrate the Runaway Process Killer extension to terminate runaway processes

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/pom.xml
            core/src/main/resources/windows-service/jenkins-slave.xml
            core/src/main/resources/windows-service/jenkins.xml
            war/pom.xml
            http://jenkins-ci.org/commit/jenkins/e698d1de41d4311bf5f8b1d2c40b591109e696e2
            Log:
            Update Windows Agent Installer to 1.7 and WinSW to 2.0.2 (#2765)

            WinSW changes

            The update includes many fixes and improvements, the full list is provided in the [WinSW changelog](https://github.com/kohsuke/winsw/blob/master/CHANGELOG.md). There are several issues referenced in Jenkins bugtracker:

            • JENKINS-22692 (https://issues.jenkins-ci.org/browse/JENKINS-22692) - Connection reset issues when WinSW gets terminated due to the system shutdown
            • JENKINS-23487 (https://issues.jenkins-ci.org/browse/JENKINS-23487) - Support of shared directories in WinSW
            • JENKINS-39231 (https://issues.jenkins-ci.org/browse/JENKINS-39231) - Enable Runaway Process Killer by default
            • JENKINS-39237 (https://issues.jenkins-ci.org/browse/JENKINS-39237) - Auto-upgrade of JNLP agent versions on the slaves

            Windows Agent Installer changes

            • Adapt the default configurations to pick fixes above
            • Slave => Agent renaming where possible

            Jenkins core changes

            • Modify the configuration template, reference advanced options
            • Enable Runaway Process Killer by default
            • Update Windows Agent Installer to 1.7
            • Remove the obsolete jenkins-slave.xml file from the core. Now it is within windows-slave-installer
            • Use the deployed Snapshot for CI
            • Pick the release version of windows-slave-installer-1.7
            oleg_nenashev Oleg Nenashev added a comment -

            Released in Jenkins 2.50. See the upgrade guidelines for more info

            People

              oleg_nenashev Oleg Nenashev
              oleg_nenashev Oleg Nenashev
              Votes: 4
              Watchers: 7
