JENKINS-50219

Jenkins agent windows service fails to restart with an unhandled COMException in WinSw's log

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Labels:
    • Environment:
      Jenkins 2.9.84
      windows-slave-installer 1.9.2
      jenkins-slave.exe running on Windows 10 with JRE 1.8.0_161 under local administrator credentials.

      Description

      We have noticed that our Windows 10 nodes intermittently fail to come back online after the Jenkins master restarts.

      The following entries appear in WinSw's Jenkins-slave.wrapper.log file at the time of the failures. In particular, the unhandled COMException appears to prevent the service from restarting successfully.

      2018-03-16 12:26:38,011 DEBUG - Starting ServiceWrapper in the CLI mode
      2018-03-16 12:26:38,589 INFO  - Restarting the service with id 'jenkinsslave-c__jenkins'
      2018-03-16 12:26:38,682 DEBUG - Completed. Exit code is 0
      2018-03-16 12:26:38,870 DEBUG - Starting ServiceWrapper in the CLI mode
      2018-03-16 12:26:39,151 INFO  - Restarting the service with id 'jenkinsslave-c__jenkins'
      2018-03-16 12:26:39,276 INFO  - Stopping jenkinsslave-c__jenkins
      2018-03-16 12:26:39,292 DEBUG - ProcessKill 332
      2018-03-16 12:26:39,386 INFO  - Found child process: 420 Name: conhost.exe
      2018-03-16 12:26:39,433 INFO  - Stopping process 420
      2018-03-16 12:26:39,448 INFO  - Send SIGINT 420
      2018-03-16 12:26:39,464 WARN  - SIGINT to 420 failed - Killing as fallback
      2018-03-16 12:26:39,464 INFO  - Stopping process 332
      2018-03-16 12:26:39,479 INFO  - Send SIGINT 332
      2018-03-16 12:26:39,495 WARN  - SIGINT to 332 failed - Killing as fallback
      2018-03-16 12:26:39,511 INFO  - Finished jenkinsslave-c__jenkins
      2018-03-16 12:26:39,511 DEBUG - Completed. Exit code is 0
      2018-03-16 12:26:40,342 FATAL - Unhandled exception
      System.Runtime.InteropServices.COMException (0x80040150): Could not read key from registry (Exception from HRESULT: 0x80040150 (REGDB_E_READREGDB))
         at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
         at System.Management.ManagementObjectSearcher.Get()
         at WMI.WmiRoot.ClassHandler.Invoke(Object proxy, MethodInfo method, Object[] args)
         at winsw.WrapperService.Run(String[] _args, ServiceDescriptor descriptor)
         at winsw.WrapperService.Main(String[] args)
      

      After inspecting WinSw's source code, we determined that the exception is thrown from the line "s = svc.Select(d.Id);" in the following snippet, found in the "Run" method in "src/Core/ServiceWrapper/Main.cs" (lines 687-704). This occurs when Jenkins runs a secondary WinSw process with the "restart!" parameter.

      if (args[0] == "restart")
      {
          Log.Info("Restarting the service with id '" + d.Id + "'");
          if (s == null) 
              ThrowNoSuchService();
          if(s.Started)
              s.StopService();
          while (s.Started)
          {
              Thread.Sleep(1000);
              s = svc.Select(d.Id);
          }
          s.StartService();
          return;
      }
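
      For discussion, here is a minimal sketch of how the polling loop above could tolerate a failed lookup instead of letting the COMException escape. It assumes the failure is transient, which this report suggests (the problem is intermittent) but does not confirm. IServiceHandle, RestartSketch, FlakyService and the retry limit are hypothetical names introduced for illustration; they are not part of WinSw. The "select" delegate stands in for svc.Select(d.Id).

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical stand-in for the WMI service proxy; not WinSw's real type.
public interface IServiceHandle
{
    bool Started { get; }
    void StartService();
}

public static class RestartSketch
{
    // Polls until the service reports that it has stopped, then starts it.
    // A lookup that throws COMException is retried up to maxLookupFailures
    // times (an assumed policy) before the exception is rethrown.
    // Returns the number of lookup failures that were tolerated.
    public static int RestartWhenStopped(
        Func<IServiceHandle> select, int maxLookupFailures, TimeSpan pollInterval)
    {
        int failures = 0;
        IServiceHandle s = null;
        while (true)
        {
            try
            {
                s = select();
                if (s == null)
                    throw new InvalidOperationException("No such service");
                if (!s.Started)
                    break; // the service has stopped, safe to start it again
            }
            catch (COMException)
            {
                failures++;
                if (failures > maxLookupFailures)
                    throw; // give up and surface the original error
            }
            Thread.Sleep(pollInterval);
        }
        s.StartService();
        return failures;
    }
}

// Fake service used only to exercise the sketch.
public sealed class FlakyService : IServiceHandle
{
    public bool Started { get; private set; }
    public void StartService() { Started = true; }
}

public static class Program
{
    public static void Main()
    {
        var service = new FlakyService(); // already stopped
        int calls = 0;
        Func<IServiceHandle> select = () =>
        {
            calls++;
            // Simulate two failed lookups with the HRESULT from the log.
            if (calls <= 2)
                throw new COMException(
                    "Could not read key from registry", unchecked((int)0x80040150));
            return service;
        };

        int failures = RestartSketch.RestartWhenStopped(
            select, 5, TimeSpan.FromMilliseconds(10));
        Console.WriteLine(
            "lookup failures tolerated: " + failures + ", started: " + service.Started);
    }
}
```

      With the fake service above, the program prints "lookup failures tolerated: 2, started: True". Whether retrying is appropriate for REGDB_E_READREGDB in practice would need to be verified on an affected node.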

      We are currently able to work around the problem by restarting the service manually on affected nodes.

              People

              Assignee:
              oleg_nenashev Oleg Nenashev
              Reporter:
              tom_m_third_dimension Tom Manning
              Votes:
              12
              Watchers:
              15
