-
Bug
-
Resolution: Fixed
-
Critical
-
https://wiki.jenkins-ci.org/display/JENKINS/Amazon+EC2+Plugin
I followed the guidelines for setting up a Windows AMI. I can see that the EC2 instance is launched in AWS, but it is not able to connect to the master as a slave. The logs show something like
'waiting for Windows RM ... going to sleep ..'
Notes for Windows AMI:
EC2 Windows slaves are accessed with CIFS (to send the initial Jenkins slave.jar) and WinRM (to launch and connect to the slave afterward). The Windows AMI must be configured with:
a security group allowing SMB over TCP (incoming TCP port 445) and WinRM (incoming TCP port 5985)
Windows Firewall allowing incoming SMB over TCP
Java installed and available in the %PATH%
WinRM should be enabled with the following commands (for more information see: Microsoft article 555966):
winrm quickconfig
winrm set winrm/config/service/Auth @{Basic="true"}
winrm set winrm/config/service @{AllowUnencrypted="true"}
winrm set winrm/config/winrs @{MaxMemoryPerShellMB="1024"}
[JENKINS-25385] Jenkins EC2 plugin is not able to launch Windows Slaves in AWS
I see the same problem. Tried to debug it with tcpdump (see attached file, win-dump.txt), and have the following findings:
1. The TCP connection is set up, so there are no firewall issues
2. The initial HTTP request from the Jenkins server to the newly launched Windows node does not carry any username/password for authentication
3. The Windows node replies with HTTP/1.1 401 (Unauthorized)
The node log shows the following:
Node Windows build machine (i-7665db97)(i-7665db97) is still stopping, waiting 5s
[above line repeated while node boots up]
Node Windows build machine (i-7665db97)(i-7665db97) is ready
Windows build machine (i-7665db97) booted at 1415975164000
Connecting to ip-10-124-9-246.release.in.here.com(10.124.9.246) with WinRM as Administrator
Waiting for WinRM to come up. Sleeping 10s.
[above two lines repeated indefinitely]
For people seeing this issue, please enable a custom logger for hudson.plugins.ec2.win.winrm so that we can see how it is failing to connect.
I was getting this issue too, and just a sec ago I managed to successfully start windows slave (without https support). I did two things:
1) The help states that the winrm/config/service entries should be configured, but I had to run
winrm set winrm/config/client @{AllowUnencrypted="true"}
winrm set winrm/config/client/Auth @{Basic="true"}
to allow external connections. Before that I wasn't able to telnet to port 5985; afterwards it was possible.
2) In WinRMClient.buildHTTPClient() I removed the unregister call for AuthPolicy.SPNEGO and added unregister calls for KERBEROS, DIGEST and NTLM. Before this change there was an exception about an unsupported auth method, but it was getting lost somewhere in the code flow.
NOTE: I do not know what I am doing. I have very little idea about how exactly WinRM works; I am just gathering info from random pages and trying to apply it. But I hope it will help to fix this properly.
@kohsuke
I can manually telnet into the windows ec2 instance. However, Jenkins cannot seem to add it as a slave. I see the following error in the winrm logger
Request:
POST http://xxx.xxx.xx.xx:5985/wsman
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd"><env:Header><a:To>http://xxx.xxx.xx.xx:5985/wsman</a:To><a:ReplyTo><a:Address mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address></a:ReplyTo><w:MaxEnvelopeSize mustUnderstand="true">153600</w:MaxEnvelopeSize><a:MessageID>uuid:AC60C672-A7F9-4283-B161-17B5A37A9F63</a:MessageID><w:Locale mustUnderstand="false" xml:lang="en-US"/><p:DataLocale mustUnderstand="false" xml:lang="en-US"/><w:OperationTimeout>PT60S</w:OperationTimeout><a:Action mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/09/transfer/Create</a:Action><w:ResourceURI mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</w:ResourceURI><w:OptionSet><w:Option Name="WINRS_NOPROFILE">FALSE</w:Option><w:Option Name="WINRS_CODEPAGE">437</w:Option></w:OptionSet></env:Header><env:Body><rsp:Shell><rsp:InputStreams>stdin</rsp:InputStreams><rsp:OutputStreams>stdout stderr</rsp:OutputStreams></rsp:Shell></env:Body></env:Envelope>
Dec 05, 2014 8:46:40 PM SEVERE hudson.plugins.ec2.win.winrm.WinRMClient sendRequest
I/O Exception in HTTP POST
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:117)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:178)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:244)
at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:215)
at hudson.plugins.ec2.win.winrm.WinRMClient.openShell(WinRMClient.java:94)
at hudson.plugins.ec2.win.winrm.WinRM.ping(WinRM.java:29)
at hudson.plugins.ec2.win.WinConnection.ping(WinConnection.java:117)
at hudson.plugins.ec2.win.EC2WindowsLauncher.connectToWinRM(EC2WindowsLauncher.java:118)
at hudson.plugins.ec2.win.EC2WindowsLauncher.launch(EC2WindowsLauncher.java:29)
at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:101)
at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:241)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I did what Tomasz suggested:
2) In WinRMClient.buildHTTPClient() I removed the unregister call for AuthPolicy.SPNEGO and added unregister calls for KERBEROS, DIGEST and NTLM. Before this change there was an exception about an unsupported auth method, but it was getting lost somewhere in the code flow.
private DefaultHttpClient buildHTTPClient() {
    DefaultHttpClient httpclient = new DefaultHttpClient();
    //httpclient.getAuthSchemes().unregister(AuthPolicy.SPNEGO);
    httpclient.getAuthSchemes().unregister(AuthPolicy.KERBEROS);
    httpclient.getAuthSchemes().unregister(AuthPolicy.DIGEST);
    httpclient.getAuthSchemes().unregister(AuthPolicy.NTLM);
Code changed in jenkins
User: Jason Mittertreiner
Path:
src/main/java/hudson/plugins/ec2/win/EC2WindowsLauncher.java
src/main/java/hudson/plugins/ec2/win/winrm/WinRMClient.java
http://jenkins-ci.org/commit/ec2-plugin/0e840f7129b91af5101cb8f08f938743dc188ff9
Log:
JENKINS-27260 SPNEGO for Windows in EC2 Plugin
Fixed the Windows temp directory getting set to ""
Enabled SPNEGO authentication
JENKINS-25385 and JENKINS-4995 both have comments complaining about
infinite loops when creating Windows slaves. Because SPNEGO is
unregistered, the httpclient throws an exception that is silently
caught and causes the infinite loop.
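The failure mode in this commit log, an exception swallowed inside a wait loop, can be illustrated with a small sketch. It is written in Python for brevity and is not the plugin's code: `wait_for_service`, its parameters and the `ping` callback are hypothetical names chosen for this example. The idea is only that each failure is logged with its cause and the loop is bounded, so an "unsupported auth method" error would surface instead of producing an endless "Sleeping 10s" sequence.

```python
import logging
import time

log = logging.getLogger("winrm_wait_sketch")


def wait_for_service(ping, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call ping() until it succeeds or the attempts run out.

    Hypothetical helper, not part of the EC2 plugin. Every failure is logged
    together with its exception, and the last one is chained to the final
    TimeoutError so the root cause is never discarded.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            ping()
            return attempt  # number of attempts that were needed
        except Exception as exc:  # broad on purpose: report whatever went wrong
            last_error = exc
            log.warning("attempt %d/%d failed: %r", attempt, attempts, exc)
            if attempt < attempts:
                sleep(delay_seconds)
    raise TimeoutError(f"gave up after {attempts} attempts") from last_error
```

A caller would pass the real connectivity check as `ping`; the `sleep` parameter exists only so the delay can be replaced in tests.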
Hi guys.
So is this working or not? I just tried it and it just sits there waiting:
Waiting for WinRM to come up. Sleeping 10s.
Connecting to ec2-52-17-36-66.eu-west-1.compute.amazonaws.com(52.17.36.66) with WinRM as
Waiting for WinRM to come up. Sleeping 10s.
Connecting to ec2-52-17-36-66.eu-west-1.compute.amazonaws.com(52.17.36.66) with WinRM as
Waiting for WinRM to come up. Sleeping 10s.
I'm having the same problem. I've set up WinRM as specified in pull request #67, and also added the setup reported by Tomasz. I'm using version 1.28, which to my understanding (after reviewing the GitHub commits) includes the SPNEGO negotiation.
I've enabled logging, and the log (after a lot of failures during initiation, which is expected) is now simply reporting:
Jun 14, 2015 4:20:46 PM FINEST hudson.plugins.ec2.win.winrm.WinRMClient
Request:
POST http://172.16.0.252:5985/wsman
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd"><env:Header><a:To>http://172.16.0.252:5985/wsman</a:To><a:ReplyTo><a:Address mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address></a:ReplyTo><w:MaxEnvelopeSize mustUnderstand="true">153600</w:MaxEnvelopeSize><a:MessageID>uuid:24BD995E-DFCC-4B42-AF00-E5C85B66445B</a:MessageID><w:Locale mustUnderstand="false" xml:lang="en-US"/><p:DataLocale mustUnderstand="false" xml:lang="en-US"/><w:OperationTimeout>PT60S</w:OperationTimeout><a:Action mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/09/transfer/Create</a:Action><w:ResourceURI mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</w:ResourceURI><w:OptionSet><w:Option Name="WINRS_NOPROFILE">FALSE</w:Option><w:Option Name="WINRS_CODEPAGE">437</w:Option></w:OptionSet></env:Header><env:Body><rsp:Shell><rsp:InputStreams>stdin</rsp:InputStreams><rsp:OutputStreams>stdout stderr</rsp:OutputStreams></rsp:Shell></env:Body></env:Envelope>
Jun 14, 2015 4:20:46 PM WARNING hudson.plugins.ec2.win.winrm.WinRMClient sendRequest
winrm returned 401 - shouldn't happen though - retrying in 2 minutes
I don't see the authentication header in the log.
Running the Ruby WinRM gem, everything seems correct:
$ irb -r winrm
> puts WinRM::WinRMWebService.new("http://172.16.0.252:5985/wsman", :plaintext, user: "Administrator", pass: "XHNyRGud.K", basic_auth_only: true).cmd("dir")[:data].collect{ |r| r[:stdout] }
 Volume in drive C has no label.
 Volume Serial Number is 12A7-BAEB
...
I've made a TCP dump of the request, and it looks like the HTTP client sends the credentials incorrectly. Here's a tshark network analysis dump:
Hypertext Transfer Protocol
POST /wsman HTTP/1.1\r\n
[Expert Info (Chat/Sequence): POST /wsman HTTP/1.1\r\n]
[Message: POST /wsman HTTP/1.1\r\n]
[Severity level: Chat]
[Group: Sequence]
Request Method: POST
Request URI: /wsman
Request Version: HTTP/1.1
Content-Length: 1320\r\n
[Content length: 1320]
Content-Type: application/soap+xml; charset=UTF-8\r\n
Host: 172.16.0.252:5985\r\n
Connection: Keep-Alive\r\n
User-Agent: Apache-HttpClient/4.3 (java 1.5)\r\n
Authorization: Basic OjEyMzQ1Ngo=\r\n
Credentials: :123456
\r\n
[Full request URI: http://172.16.0.252:5985/wsman]
[HTTP request 1/1]
It looks like the user name is completely missing from the Authorization header.
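The observation can be checked by decoding the header value from the capture. The snippet below is a minimal Python sketch using only the standard library; `decode_basic_auth` is a hypothetical helper name introduced for this illustration, and the only input is the `Authorization` value shown in the tshark output.

```python
import base64


def decode_basic_auth(header_value):
    """Split an HTTP Basic Authorization header value into (user, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Basic credentials are "user:password"; split on the first colon only.
    user, _, password = decoded.partition(":")
    return user, password


user, password = decode_basic_auth("Basic OjEyMzQ1Ngo=")
print(repr(user), repr(password))  # prints '' '123456\n'
```

The user part decodes to an empty string, which matches the `Credentials: :123456` line that tshark displays; the decoded password also ends with a newline character.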
OK, the above problem was on my side: the EC2 plugin still needs to have the user specified in the AMI configuration, even though this is a Windows system. Kind of makes sense, I guess.
After I fixed the permissions, I still can't get the plugin to work - the node log shows
Connecting to ip-172-16-0-107.us-west-2.compute.internal(172.16.0.107) with WinRM as Administrator
Waiting for WinRM to come up. Sleeping 10s.
looping until I give up.
The winrm system log has this to say:
Jun 15, 2015 8:52:19 AM FINE hudson.plugins.ec2.win.winrm.WinRMClient
opening winrm shell to: http://172.16.0.107:5985/wsman
Jun 15, 2015 8:52:19 AM FINEST hudson.plugins.ec2.win.winrm.WinRMClient
Request:
POST http://172.16.0.107:5985/wsman
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd"><env:Header><a:To>http://172.16.0.107:5985/wsman</a:To><a:ReplyTo><a:Address mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address></a:ReplyTo><w:MaxEnvelopeSize mustUnderstand="true">153600</w:MaxEnvelopeSize><a:MessageID>uuid:7CC37EC1-0EBF-4D8F-8347-CA90977880E9</a:MessageID><w:Locale mustUnderstand="false" xml:lang="en-US"/><p:DataLocale mustUnderstand="false" xml:lang="en-US"/><w:OperationTimeout>PT60S</w:OperationTimeout><a:Action mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/09/transfer/Create</a:Action><w:ResourceURI mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</w:ResourceURI><w:OptionSet><w:Option Name="WINRS_NOPROFILE">FALSE</w:Option><w:Option Name="WINRS_CODEPAGE">437</w:Option></w:OptionSet></env:Header><env:Body><rsp:Shell><rsp:InputStreams>stdin</rsp:InputStreams><rsp:OutputStreams>stdout stderr</rsp:OutputStreams></rsp:Shell></env:Body></env:Envelope>
Jun 15, 2015 8:52:19 AM FINEST hudson.plugins.ec2.win.winrm.WinRMClient
Response:
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:x="http://schemas.xmlsoap.org/ws/2004/09/transfer" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell" xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd" xml:lang="en-US"><s:Header><a:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/CreateResponse</a:Action><a:MessageID>uuid:D729BFE3-6F66-4343-8516-DC023A81CADE</a:MessageID><a:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:To><a:RelatesTo>uuid:7CC37EC1-0EBF-4D8F-8347-CA90977880E9</a:RelatesTo></s:Header><s:Body><x:ResourceCreated><a:Address>http://172.16.0.107:5985/wsman</a:Address><a:ReferenceParameters><w:ResourceURI>http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</w:ResourceURI><w:SelectorSet><w:Selector Name="ShellId">75644E5E-AFFA-429A-AD47-E05819E470F2</w:Selector></w:SelectorSet></a:ReferenceParameters></x:ResourceCreated><rsp:Shell><rsp:ShellId>75644E5E-AFFA-429A-AD47-E05819E470F2</rsp:ShellId><rsp:ResourceUri>http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</rsp:ResourceUri><rsp:Owner>Administrator</rsp:Owner><rsp:ClientIP>172.16.0.12</rsp:ClientIP><rsp:IdleTimeOut>PT7200.000S</rsp:IdleTimeOut><rsp:InputStreams>stdin</rsp:InputStreams><rsp:OutputStreams>stdout stderr</rsp:OutputStreams><rsp:ShellRunTime>P0DT0H0M0S</rsp:ShellRunTime><rsp:ShellInactivity>P0DT0H0M0S</rsp:ShellInactivity></rsp:Shell></s:Body></s:Envelope>
Jun 15, 2015 8:52:19 AM FINER hudson.plugins.ec2.win.winrm.WinRMClient
shellid: 75644E5E-AFFA-429A-AD47-E05819E470F2
Jun 15, 2015 8:52:19 AM FINE hudson.plugins.ec2.win.winrm.WinRMClient
closing winrm shell 75644E5E-AFFA-429A-AD47-E05819E470F2
Jun 15, 2015 8:52:19 AM FINEST hudson.plugins.ec2.win.winrm.WinRMClient
Request:
POST http://172.16.0.107:5985/wsman
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd"><env:Header><a:To>http://172.16.0.107:5985/wsman</a:To><a:ReplyTo><a:Address mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address></a:ReplyTo><w:MaxEnvelopeSize mustUnderstand="true">153600</w:MaxEnvelopeSize><a:MessageID>uuid:41DD51F9-4B36-437E-ABD0-7F60DAC94394</a:MessageID><w:Locale mustUnderstand="false" xml:lang="en-US"/><p:DataLocale mustUnderstand="false" xml:lang="en-US"/><w:OperationTimeout>PT60S</w:OperationTimeout><a:Action mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/09/transfer/Delete</a:Action><w:SelectorSet><w:Selector Name="ShellId">75644E5E-AFFA-429A-AD47-E05819E470F2</w:Selector></w:SelectorSet><w:ResourceURI mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd</w:ResourceURI></env:Header><env:Body/></env:Envelope>
Jun 15, 2015 8:52:19 AM FINEST hudson.plugins.ec2.win.winrm.WinRMClient
Response:
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd" xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd" xml:lang="en-US"><s:Header><a:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/DeleteResponse</a:Action><a:MessageID>uuid:6BC43E0E-F959-41D8-B313-5A915F4EB9C8</a:MessageID><a:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:To><a:RelatesTo>uuid:41DD51F9-4B36-437E-ABD0-7F60DAC94394</a:RelatesTo></s:Header><s:Body/></s:Envelope>
And this goes on forever.
Even though the above exchange seems successful to me (all I can see is WinRMClient opens a session successfully and then immediately closes it), the node manager shows the new node as being "offline", no details are populated in the node table and the "Response Time" column only says "Time out for last 1 try".
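One way to confirm from a captured log that the Create step really succeeded is to look for the `rsp:ShellId` element in the response body. The following Python sketch does that with the standard library; `extract_shell_id` is a hypothetical helper name, and `SAMPLE` is a trimmed-down stand-in for the CreateResponse logged above, not the full message.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "rsp" prefix in the logged envelopes.
RSP_NS = "http://schemas.microsoft.com/wbem/wsman/1/windows/shell"


def extract_shell_id(response_xml):
    """Return the rsp:ShellId text from a WS-Man Create response, or None."""
    root = ET.fromstring(response_xml)
    node = root.find(f".//{{{RSP_NS}}}ShellId")
    return node.text if node is not None else None


# Trimmed-down stand-in for the logged CreateResponse (illustration only).
SAMPLE = (
    '<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" '
    f'xmlns:rsp="{RSP_NS}"><s:Body><rsp:Shell>'
    "<rsp:ShellId>75644E5E-AFFA-429A-AD47-E05819E470F2</rsp:ShellId>"
    "</rsp:Shell></s:Body></s:Envelope>"
)

print(extract_shell_id(SAMPLE))  # prints 75644E5E-AFFA-429A-AD47-E05819E470F2
```

A non-empty shell id followed by a DeleteResponse suggests that WinRM itself is answering, so the cause of the "offline" node probably lies in a later step.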
I found that my problem was that the SMB access was not configured correctly (some bad interaction between the master security group, slave security group, and the routing in VPC through public IP addresses). I found that out after modifying the EC2 plugin with a lot of debug logging.
I've created a pull request ( https://github.com/jenkinsci/ec2-plugin/pull/152 ) that adds logging at the point where the official release of the plugin throws away the most relevant information about the problem I had.
If you're interested in checking whether this modification provides the missing information for your use case, the changed plugin can be downloaded from: https://jenkins.ci.cloudbees.com/job/plugins/job/ec2-plugin/277/org.jenkins-ci.plugins$ec2/
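Since the plugin needs both SMB (TCP 445) and WinRM (TCP 5985), a quick reachability check from the Jenkins master can help rule out security group or routing problems like the one described above. This is a minimal Python sketch with hypothetical helper names (`port_is_open`, `check_slave_ports`); it only tests that a TCP connection can be opened and says nothing about authentication or share permissions.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_slave_ports(host, ports=(445, 5985)):
    """Map each required port (SMB 445, WinRM 5985) to its reachability."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # Example address taken from the log above; replace with your slave's IP.
    print(check_slave_ports("172.16.0.107"))
```

The check is only meaningful when run on the master itself, because security group rules are evaluated against the source address of the connection.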
tcpdump output