Type: Bug
Resolution: Unresolved
Priority: Major
Windows Server 2003, 1 vCPU, 4GB RAM (32-bit) or 8GB RAM (64-bit), 50GB virtual disk, VMware hypervisor.
Windows slaves randomly disconnect while idle. This appears to be caused by the Free Swap Space and Free Temp Space monitoring threads being stuck or still running when the next check starts, which results in the SSH connection being terminated and the connection being reestablished.
I am not exactly sure what the expected behavior is for the low-level handling and communication. However, at a high level, the expected behavior is for the slave connections to persist and for the channel pinger not to cause a reset.
jenkins.log
Nov 4, 2011 8:34:48 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in progress. Interrupting
Nov 4, 2011 8:40:18 AM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel.
Exception in thread "Monitoring w64-09 for Free Swap Space" hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.call(Request.java:149)
    at hudson.remoting.Channel.call(Channel.java:660)
    at hudson.node_monitors.SwapSpaceMonitor$1.monitor(SwapSpaceMonitor.java:83)
    at hudson.node_monitors.SwapSpaceMonitor$1.monitor(SwapSpaceMonitor.java:81)
    at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:202)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.abort(Request.java:269)
    at hudson.remoting.Channel.terminate(Channel.java:711)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:794)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1024)
Caused by: hudson.remoting.Channel$OrderlyShutdown
    ... 2 more
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:62)
    at hudson.remoting.Command.<init>(Command.java:47)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel.close(Channel.java:835)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:793)
    ... 1 more
Exception in thread "Monitoring w64-09 for Free Temp Space" hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.call(Request.java:149)
    at hudson.remoting.Channel.call(Channel.java:660)
    at hudson.FilePath.act(FilePath.java:745)
    at hudson.FilePath.act(FilePath.java:738)
    at hudson.node_monitors.TemporarySpaceMonitor$1.getFreeSpace(TemporarySpaceMonitor.java:73)
    at hudson.node_monitors.DiskSpaceMonitorDescriptor.monitor(DiskSpaceMonitorDescriptor.java:135)
    at hudson.node_monitors.DiskSpaceMonitorDescriptor.monitor(DiskSpaceMonitorDescriptor.java:49)
    at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:202)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.abort(Request.java:269)
    at hudson.remoting.Channel.terminate(Channel.java:711)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:794)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1024)
Caused by: hudson.remoting.Channel$OrderlyShutdown
    ... 2 more
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:62)
    at hudson.remoting.Command.<init>(Command.java:47)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel.close(Channel.java:835)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:793)
    ... 1 more
Nov 4, 2011 8:40:57 AM hudson.slaves.SlaveComputer tryReconnect
INFO: Attempting to reconnect w64-09
Nov 4, 2011 9:34:48 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in progress. Interrupting
Nov 4, 2011 9:34:48 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Temp Space monitoring activity still in progress. Interrupting
Nov 4, 2011 9:40:18 AM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel.
Exception in thread "Monitoring w64-09 for Free Swap Space" hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.call(Request.java:149)
    at hudson.remoting.Channel.call(Channel.java:660)
    at hudson.node_monitors.SwapSpaceMonitor$1.monitor(SwapSpaceMonitor.java:83)
    at hudson.node_monitors.SwapSpaceMonitor$1.monitor(SwapSpaceMonitor.java:81)
    at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:202)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.abort(Request.java:269)
    at hudson.remoting.Channel.terminate(Channel.java:711)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:794)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1024)
Caused by: hudson.remoting.Channel$OrderlyShutdown
    ... 2 more
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:62)
    at hudson.remoting.Command.<init>(Command.java:47)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:790)
    at hudson.remoting.Channel.close(Channel.java:835)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:793)
    ... 1 more
Nov 4, 2011 9:40:57 AM hudson.slaves.SlaveComputer tryReconnect
INFO: Attempting to reconnect w64-09
Please note, this issue can be mitigated by disabling the Free Swap Space check for all slaves. However, this is a less than optimal solution.
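For anyone trying to confirm the same pattern on their own master, the following is a rough sketch of how jenkins.log could be scanned to check whether each "Ping failed" entry follows an interrupted monitor. It is not part of Jenkins. The function names `extract_events` and `delays_before_ping_failure` are made up for illustration, and the script assumes the timestamp and message formats shown in the log excerpt above, parsed under an English locale. It only measures the time between the two kinds of entries and does not prove that one causes the other.

```python
import re
from datetime import datetime

# Formats taken from the log excerpt in this report; other Jenkins versions
# or locales may log differently (assumption).
TIMESTAMP = re.compile(r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M")
INTERRUPT = re.compile(r"Previous (.+?) monitoring activity still in progress")
PING_FAILED = re.compile(r"Ping failed\. Terminating the channel")


def extract_events(log_text):
    """Return a time-ordered list of (timestamp, kind, detail) tuples."""
    stamps = [
        (m.start(), datetime.strptime(m.group(), "%b %d, %Y %I:%M:%S %p"))
        for m in TIMESTAMP.finditer(log_text)
    ]

    def stamp_before(pos):
        # Most recent record header that appears before this message.
        earlier = [ts for start, ts in stamps if start < pos]
        return earlier[-1] if earlier else None

    events = []
    for m in INTERRUPT.finditer(log_text):
        events.append((stamp_before(m.start()), "interrupt", m.group(1)))
    for m in PING_FAILED.finditer(log_text):
        events.append((stamp_before(m.start()), "ping_failed", ""))

    # Drop messages with no preceding timestamp, then order by time.
    events = [e for e in events if e[0] is not None]
    events.sort(key=lambda e: e[0])
    return events


def delays_before_ping_failure(events):
    """For each ping failure, seconds since the latest monitor interrupt."""
    delays = []
    last_interrupt = None
    for ts, kind, _detail in events:
        if kind == "interrupt":
            last_interrupt = ts
        elif kind == "ping_failed" and last_interrupt is not None:
            delays.append((ts - last_interrupt).total_seconds())
    return delays


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py /path/to/jenkins.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for delay in delays_before_ping_failure(extract_events(fh.read())):
            print(f"ping failed {delay:.0f}s after a monitor interrupt")
```

In the excerpt above, both ping failures (8:40:18 and 9:40:18) are logged 5 minutes 30 seconds after the corresponding interrupt warnings (8:34:48 and 9:34:48), so a consistent delay in the output would match what is reported here.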