  JENKINS-22692

Jenkins Windows-Slave throwing exception on shutdown causes connection reset issues

Details

    Description

      Using the most recent build of Jenkins, I have been seeing connection issues after issuing reboot commands to target machines. The logs suggest that the slave service on Windows does not shut down cleanly and therefore never disconnects from the Jenkins server. I suspect this is what causes the reconnect failures I see after the machine comes back online. Below are the failure as reported by the Jenkins server and excerpts from the log files on the slave machine, taken from a run where the issue reproduced.

      Error as reported from Jenkins Server:
      FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      at hudson.remoting.Request.call(Request.java:174)
      at hudson.remoting.Channel.call(Channel.java:722)
      at hudson.FilePath.act(FilePath.java:1009)
      at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
      at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
      at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
      at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:575)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:481)
      at hudson.model.Run.execute(Run.java:1700)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:231)
      Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      at hudson.remoting.Request.abort(Request.java:299)
      at hudson.remoting.Channel.terminate(Channel.java:782)
      at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      Caused by: java.net.SocketException: Connection reset
      at java.net.SocketInputStream.read(SocketInputStream.java:185)
      at java.io.FilterInputStream.read(FilterInputStream.java:133)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
      at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:77)
      at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
      at hudson.remoting.Command.readFrom(Command.java:92)
      at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:71)
      at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

      From jenkins-slave.err.log:

      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: <Slave Machine>
      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among http://<Jenkins Server>/
      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to <Jenkins Server>
      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Apr 19, 2014 11:52:09 PM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: <Slave Machine> is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: <Slave Machine> is already connected to this master. Rejecting this connection.
      at hudson.remoting.Engine.onConnectionRejected(Engine.java:303)
      at hudson.remoting.Engine.run(Engine.java:276)

      From jenkins-slave.wrapper.log:

      2014-04-19 23:52:14 - Stopping jenkinsslave-C__Jenkins
      2014-04-19 23:52:14 - ProcessKill 3088
      2014-04-19 23:52:15 - Shutdown exception
      Message:A system shutdown is in progress. (Exception from HRESULT: 0x8007045B)
      Stacktrace: at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
      at System.Management.ManagementScope.InitializeGuts(Object o)
      at System.Management.ManagementScope.Initialize()
      at System.Management.ManagementObjectSearcher.Initialize()
      at System.Management.ManagementObjectSearcher.Get()
      at winsw.WrapperService.StopProcessAndChildren(Int32 pid)
      at winsw.WrapperService.StopIt()
      at winsw.WrapperService.OnShutdown()
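
      For illustration only: the trace above suggests that the exception is thrown while winsw.WrapperService.StopProcessAndChildren enumerates child processes through System.Management, so the stop sequence is interrupted. Below is a minimal Java sketch of one defensive pattern, in which a failed child lookup is recorded but the process itself is still killed. It is not the WinSW code and not the fix that was eventually shipped; ProcessTreeStopper, ChildLookup and Killer are hypothetical names introduced for this example.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a shutdown-tolerant process tree stop; not WinSW code. */
final class ProcessTreeStopper {

    /** Looks up direct child PIDs; may throw, for example while the OS is shutting down. */
    interface ChildLookup {
        List<Integer> childrenOf(int pid) throws Exception;
    }

    /** Kills a single process by PID. */
    interface Killer {
        void kill(int pid);
    }

    private final ChildLookup lookup;
    private final Killer killer;

    ProcessTreeStopper(ChildLookup lookup, Killer killer) {
        this.lookup = lookup;
        this.killer = killer;
    }

    /**
     * Stops all descendants first, then the process itself.
     * Returns false if any child lookup failed, meaning some descendants
     * may have been missed; the given process is killed either way.
     */
    boolean stopProcessAndChildren(int pid) {
        boolean complete = true;
        List<Integer> children;
        try {
            children = lookup.childrenOf(pid);
        } catch (Exception e) {
            // Assumption: a failed lookup is treated as "no known children"
            // so that the stop sequence can continue.
            children = Collections.emptyList();
            complete = false;
        }
        for (int child : children) {
            boolean childComplete = stopProcessAndChildren(child);
            complete = complete && childComplete;
        }
        killer.kill(pid);
        return complete;
    }
}
```

      In this sketch the caller can log a warning when the method returns false instead of letting the exception abort the shutdown handler.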

          Activity

            oleg_nenashev Oleg Nenashev added a comment -

            The change is going to be integrated soon: https://github.com/jenkinsci/windows-slave-installer-module/pull/5

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            pom.xml
            http://jenkins-ci.org/commit/windows-slave-installer-module/e7e5cfb57e7289376e542d680004a763be55033b
            Log:
            Update Windows Service Wrapper from 1.18 to 2.0.1

            Addresses JENKINS-22692, JENKINS-23487 and several others.
            Full changelog: https://github.com/kohsuke/winsw/blob/master/CHANGELOG.md

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/pom.xml
            core/src/main/resources/windows-service/jenkins-slave.xml
            core/src/main/resources/windows-service/jenkins.xml
            war/pom.xml
            http://jenkins-ci.org/commit/jenkins/e698d1de41d4311bf5f8b1d2c40b591109e696e2
            Log:
            Update Windows Agent Installer to 1.7 and WinSW to 2.0.2 (#2765)

            WinSW changes

            The update includes many fixes and improvements, the full list is provided in the [WinSW changelog](https://github.com/kohsuke/winsw/blob/master/CHANGELOG.md). There are several issues referenced in Jenkins bugtracker:
            • JENKINS-22692 (https://issues.jenkins-ci.org/browse/JENKINS-22692) - Connection reset issues when WinSW gets terminated due to the system shutdown
            • JENKINS-23487 (https://issues.jenkins-ci.org/browse/JENKINS-23487) - Support of shared directories in WinSW
            • JENKINS-39231 (https://issues.jenkins-ci.org/browse/JENKINS-39231) - Enable Runaway Process Killer by default
            • JENKINS-39237 (https://issues.jenkins-ci.org/browse/JENKINS-39237) - Auto-upgrade of JNLP agent versions on the slaves

            Windows Agent Installer changes
            • Adapt the default configurations to pick fixes above
            • Slave => Agent renaming where possible

            Jenkins core changes
            • Modify the configuration template, reference advanced options
            • Enable Runaway Process Killer by default
            • Update Windows Agent Installer to 1.7
            • Remove the obsolete jenkins-slave.xml file from the core. Now it is within windows-slave-installer
            • Use the deployed Snapshot for CI
            • Pick the release version of windows-slave-installer-1.7

            oleg_nenashev Oleg Nenashev added a comment -

            The fix has been released in Jenkins 2.50. Since it is a big chunk of changes in WinSW, I suppose there is no plan to backport this fix to 2.46.x.

            Marking as Release-candidate in order to discuss with olivergondza

            oleg_nenashev Oleg Nenashev added a comment -

            Rejecting from LTS 2.46.1 due to the JENKINS-42744 regression reported to 2.50

            People

              oleg_nenashev Oleg Nenashev
              ryan_croom Ryan Croom