-
Bug
-
Resolution: Not A Defect
-
Blocker
-
None
-
Jenkins 2.187
Amazon EC2 1.44.1
Swarm 3.13
I have set up the connection between Jenkins and AWS via the Amazon EC2 plugin. Jenkins master cloud config: see the attached screenshots (Capture.PNG, Capture2.PNG, Capture3.PNG).
The node first connects via the Amazon EC2 plugin and then opens a second connection via the Swarm plugin; the job ends up running on the Swarm connection. (This is because my jobs use TestComplete and FlaUI, and WinRM is not well suited to their requirements.)
Jobs that take under 25 min run successfully; anything that runs longer than 25-26 min fails with the following:
Slave log:
12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected.
12:49:46 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
12:49:46 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
12:49:46 at com.sun.proxy.$Proxy89.isAlive(Unknown Source)
12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)
12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)
12:49:46 at hudson.Launcher$ProcStarter.join(Launcher.java:492)
12:49:46 at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
12:49:46 at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
12:49:46 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
12:49:46 at hudson.model.Build$BuildExecution.build(Build.java:206)
12:49:46 at hudson.model.Build$BuildExecution.doRun(Build.java:163)
12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
12:49:46 at hudson.model.Run.execute(Run.java:1815)
12:49:46 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
12:49:46 at hudson.model.ResourceController.execute(ResourceController.java:97)
12:49:46 at hudson.model.Executor.run(Executor.java:429)
12:49:46 Caused by: java.nio.channels.ClosedChannelException
12:49:46 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
12:49:46 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
12:49:46 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
12:49:46 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
12:49:46 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
12:49:46 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
12:49:46 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
12:49:46 at java.lang.Thread.run(Thread.java:748)
On the master's log I can see:
Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING
Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
EC2 instance idle time expired: i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Terminated EC2 instance (terminated): i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException
Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute
aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Removed EC2 instance from jenkins master: i-000908b57bb5d82a7
Once the idle timeout expires, the EC2 plugin terminates the instance and the slave is disconnected, even though the build was still running on it. Any help in tracking down the problem is much appreciated!
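To check whether other failed builds follow the same pattern, the master log can be scanned for a build failure logged within a few seconds of an EC2 idle-timeout message. The following is only a rough sketch: the sample lines are copied from the excerpt above, while the function names, the regular expressions, and the 5-second window are my own assumptions and not part of Jenkins or the EC2 plugin. It also assumes an English locale for parsing month names.

```python
import re
from datetime import datetime

# Sample lines copied from the master log excerpt in this report.
MASTER_LOG = """\
Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
EC2 instance idle time expired: i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Terminated EC2 instance (terminated): i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute
aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE
"""

# Header line: "<date> <time> AM|PM <LEVEL> <class> <method>"
HEADER = re.compile(
    r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (\w+) (\S+) (\S+)$"
)
IDLE = re.compile(r"^EC2 instance idle time expired: (i-[0-9a-f]+)$")
FAILED = re.compile(r"^(.+ #\d+) main build action completed: FAILURE$")


def parse_events(text):
    """Yield (timestamp, kind, detail) for idle-timeout and build-failure messages."""
    stamp = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # Remember the timestamp; the message follows on the next line.
            stamp = datetime.strptime(header.group(1), "%b %d, %Y %I:%M:%S %p")
            continue
        for kind, pattern in (("idle_timeout", IDLE), ("build_failure", FAILED)):
            match = pattern.match(line)
            if match and stamp is not None:
                yield stamp, kind, match.group(1)


def failures_after_idle_timeout(text, window_seconds=5):
    """Return (instance_id, build) pairs where a build failed shortly after an idle timeout."""
    events = list(parse_events(text))
    timeouts = [(t, d) for t, k, d in events if k == "idle_timeout"]
    failures = [(t, d) for t, k, d in events if k == "build_failure"]
    pairs = []
    for failed_at, build in failures:
        for expired_at, instance in timeouts:
            if 0 <= (failed_at - expired_at).total_seconds() <= window_seconds:
                pairs.append((instance, build))
    return pairs


if __name__ == "__main__":
    for instance, build in failures_after_idle_timeout(MASTER_LOG):
        print(f"{build} failed right after idle timeout of {instance}")
```

On the excerpt above this pairs build #22 with instance i-000908b57bb5d82a7, which matches what the two logs show: the build did not fail on its own, it was cut off when the instance was terminated.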
[JENKINS-59579] EC2 Plugin stops slave when build is running
Summary | Original: Amazon nodes automatically disconnect from Master after a period of time | New: Amazon slave idle timeout, where to find it and how to increase it? |
Description |
Original:
I have set up the connection between Jenkins and AWS via Amazon EC2 plugin. Jenkins master cloud config: !Capture.PNG!!Capture2.PNG!!Capture3.PNG|width=1016,height=736! The node connects via the Amazon plugin and then creates a new connection via Swarm plugin and the job ends up running on the connection made through swarm. (This is because my jobs include TestComplete & FlaUI and winRM is not quite suited for their requirements). Jobs that take under 25 min run successfully, anything that goes over 25-26 min fails with the following: {code:java} 12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected. 12:49:46 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) 12:49:46 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) 12:49:46 at com.sun.proxy.$Proxy89.isAlive(Unknown Source) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164) 12:49:46 at hudson.Launcher$ProcStarter.join(Launcher.java:492) 12:49:46 at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333) 12:49:46 at hudson.plugins.gradle.Gradle.perform(Gradle.java:225) 12:49:46 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741) 12:49:46 at hudson.model.Build$BuildExecution.build(Build.java:206) 12:49:46 at hudson.model.Build$BuildExecution.doRun(Build.java:163) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) 12:49:46 at hudson.model.Run.execute(Run.java:1815) 12:49:46 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) 12:49:46 at hudson.model.ResourceController.execute(ResourceController.java:97) 12:49:46 at hudson.model.Executor.run(Executor.java:429) 12:49:46 Caused by: 
java.nio.channels.ClosedChannelException 12:49:46 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154) 12:49:46 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179) 12:49:46 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795) 12:49:46 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) 12:49:46 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) 12:49:46 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 12:49:46 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 12:49:46 at java.lang.Thread.run(Thread.java:748) {code} On the master's log I can see: {code:java} ouch, stdout exception for java -jar C:\Windows\Temp\remoting.jar -workDir C:\Users\clujouch, stdout exception for java -jar C:\Windows\Temp\remoting.jar -workDir C:\Users\clujjava.lang.NumberFormatException: For input string: "4294967295" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:583) at java.lang.Integer.parseInt(Integer.java:615) at hudson.plugins.ec2.win.winrm.WinRMClient.slurpOutput(WinRMClient.java:151) at hudson.plugins.ec2.win.winrm.WindowsProcess$1.run(WindowsProcess.java:99) Sep 27, 2019 10:45:27 AM INFO hudson.remoting.SynchronousCommandTransport$ReaderThread runI/O error in channel EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-03c88c3229acf1e7a)java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358) at 
hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49) at hudson.remoting.Command.readFrom(Command.java:140) at hudson.remoting.Command.readFrom(Command.java:126) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)Caused: java.io.IOException: Unexpected termination of the channel at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77) Sep 27, 2019 10:45:27 AM SEVERE hudson.plugins.ec2.win.winrm.WinRMClient sendRequestI/O Exception in HTTP POSTjava.io.IOException: Attempted read from closed stream. at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:165) at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.Reader.read(Reader.java:140) at org.apache.http.util.EntityUtils.toString(EntityUtils.java:227) at org.apache.http.util.EntityUtils.toString(EntityUtils.java:308) at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:261) at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:188) at hudson.plugins.ec2.win.winrm.WinRMClient.sendInput(WinRMClient.java:120) at hudson.plugins.ec2.win.winrm.WindowsProcess$2.run(WindowsProcess.java:134) Sep 27, 2019 10:45:27 AM WARNING hudson.plugins.ec2.win.winrm.WindowsProcess$2 runouch, STDIN exception for java -jar C:\Windows\Temp\remoting.jar -workDir C:\Users\clujjava.io.IOException: Attempted read from closed stream. 
at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:165) at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.Reader.read(Reader.java:140) at org.apache.http.util.EntityUtils.toString(EntityUtils.java:227) at org.apache.http.util.EntityUtils.toString(EntityUtils.java:308) at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:261)Caused: hudson.plugins.ec2.win.winrm.RuntimeIOException: I/O Exception Attempted read from closed stream. at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:276) at hudson.plugins.ec2.win.winrm.WinRMClient.sendRequest(WinRMClient.java:188) at hudson.plugins.ec2.win.winrm.WinRMClient.sendInput(WinRMClient.java:120) at hudson.plugins.ec2.win.winrm.WindowsProcess$2.run(WindowsProcess.java:134) {code} Any help in tracking down the problem is much appreciated. |
New:
I have set up the connection between Jenkins and AWS via Amazon EC2 plugin. Jenkins master cloud config: !Capture.PNG!!Capture2.PNG!!Capture3.PNG|width=1016,height=736! The node connects via the Amazon plugin and then creates a new connection via Swarm plugin and the job ends up running on the connection made through swarm. (This is because my jobs include TestComplete & FlaUI and winRM is not quite suited for their requirements). Jobs that take under 25 min run successfully, anything that goes over 25-26 min fails with the following: {code:java} 12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected. 12:49:46 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) 12:49:46 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) 12:49:46 at com.sun.proxy.$Proxy89.isAlive(Unknown Source) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164) 12:49:46 at hudson.Launcher$ProcStarter.join(Launcher.java:492) 12:49:46 at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333) 12:49:46 at hudson.plugins.gradle.Gradle.perform(Gradle.java:225) 12:49:46 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741) 12:49:46 at hudson.model.Build$BuildExecution.build(Build.java:206) 12:49:46 at hudson.model.Build$BuildExecution.doRun(Build.java:163) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) 12:49:46 at hudson.model.Run.execute(Run.java:1815) 12:49:46 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) 12:49:46 at hudson.model.ResourceController.execute(ResourceController.java:97) 12:49:46 at hudson.model.Executor.run(Executor.java:429) 12:49:46 Caused by: 
java.nio.channels.ClosedChannelException 12:49:46 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154) 12:49:46 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179) 12:49:46 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795) 12:49:46 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) 12:49:46 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) 12:49:46 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 12:49:46 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 12:49:46 at java.lang.Thread.run(Thread.java:748) {code} On the master's log I can see: {code:java} Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout EC2 instance idle time expired: i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Terminated EC2 instance (terminated): i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Removed EC2 instance from jenkins master: i-000908b57bb5d82a7 {code} Any help in tracking down the problem is much appreciated. |
Issue Type | Original: Bug [ 1 ] | New: New Feature [ 2 ] |
Priority | Original: Blocker [ 1 ] | New: Major [ 3 ] |
Description |
Original:
I have set up the connection between Jenkins and AWS via Amazon EC2 plugin. Jenkins master cloud config: !Capture.PNG!!Capture2.PNG!!Capture3.PNG|width=1016,height=736! The node connects via the Amazon plugin and then creates a new connection via Swarm plugin and the job ends up running on the connection made through swarm. (This is because my jobs include TestComplete & FlaUI and winRM is not quite suited for their requirements). Jobs that take under 25 min run successfully, anything that goes over 25-26 min fails with the following: {code:java} 12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected. 12:49:46 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) 12:49:46 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) 12:49:46 at com.sun.proxy.$Proxy89.isAlive(Unknown Source) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164) 12:49:46 at hudson.Launcher$ProcStarter.join(Launcher.java:492) 12:49:46 at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333) 12:49:46 at hudson.plugins.gradle.Gradle.perform(Gradle.java:225) 12:49:46 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741) 12:49:46 at hudson.model.Build$BuildExecution.build(Build.java:206) 12:49:46 at hudson.model.Build$BuildExecution.doRun(Build.java:163) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) 12:49:46 at hudson.model.Run.execute(Run.java:1815) 12:49:46 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) 12:49:46 at hudson.model.ResourceController.execute(ResourceController.java:97) 12:49:46 at hudson.model.Executor.run(Executor.java:429) 12:49:46 Caused by: 
java.nio.channels.ClosedChannelException 12:49:46 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154) 12:49:46 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179) 12:49:46 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795) 12:49:46 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) 12:49:46 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) 12:49:46 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 12:49:46 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 12:49:46 at java.lang.Thread.run(Thread.java:748) {code} On the master's log I can see: {code:java} Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout EC2 instance idle time expired: i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Terminated EC2 instance (terminated): i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Removed EC2 instance from jenkins master: i-000908b57bb5d82a7 {code} Any help in tracking down the problem is much appreciated. |
New:
I have set up the connection between Jenkins and AWS via Amazon EC2 plugin. Jenkins master cloud config: !Capture.PNG!!Capture2.PNG!!Capture3.PNG|width=1016,height=736! The node connects via the Amazon plugin and then creates a new connection via Swarm plugin and the job ends up running on the connection made through swarm. (This is because my jobs include TestComplete & FlaUI and winRM is not quite suited for their requirements). Jobs that take under 25 min run successfully, anything that goes over 25-26 min fails with the following: Slave log: {code:java} 12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected. 12:49:46 at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) 12:49:46 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) 12:49:46 at com.sun.proxy.$Proxy89.isAlive(Unknown Source) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172) 12:49:46 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164) 12:49:46 at hudson.Launcher$ProcStarter.join(Launcher.java:492) 12:49:46 at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333) 12:49:46 at hudson.plugins.gradle.Gradle.perform(Gradle.java:225) 12:49:46 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741) 12:49:46 at hudson.model.Build$BuildExecution.build(Build.java:206) 12:49:46 at hudson.model.Build$BuildExecution.doRun(Build.java:163) 12:49:46 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) 12:49:46 at hudson.model.Run.execute(Run.java:1815) 12:49:46 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) 12:49:46 at hudson.model.ResourceController.execute(ResourceController.java:97) 12:49:46 at hudson.model.Executor.run(Executor.java:429) 12:49:46 Caused by: 
java.nio.channels.ClosedChannelException 12:49:46 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154) 12:49:46 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179) 12:49:46 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795) 12:49:46 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) 12:49:46 at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59) 12:49:46 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 12:49:46 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 12:49:46 at java.lang.Thread.run(Thread.java:748) {code} On the master's log I can see: {code:java} Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout EC2 instance idle time expired: i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Terminated EC2 instance (terminated): i-000908b57bb5d82a7 Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate Removed EC2 instance from jenkins master: i-000908b57bb5d82a7 {code} After that period of time the slave is disconnected even though the build was running on it, resulting in: *15:40:46* FATAL: command 
execution failed*15:40:46* java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.176/10.230.0.176:49733' is disconnected.*15:40:46* at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)*15:40:46* at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)*15:40:46* at com.sun.proxy.$Proxy89.isAlive(Unknown Source)*15:40:46* at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)*15:40:46* at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)*15:40:46* at hudson.Launcher$ProcStarter.join(Launcher.java:492)*15:40:46* at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)*15:40:46* at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)*15:40:46* at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)*15:40:46* at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)*15:40:46* at hudson.model.Build$BuildExecution.build(Build.java:206)*15:40:46* at hudson.model.Build$BuildExecution.doRun(Build.java:163)*15:40:46* at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)*15:40:46* at hudson.model.Run.execute(Run.java:1815)*15:40:46* at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)*15:40:46* at hudson.model.ResourceController.execute(ResourceController.java:97)*15:40:46* at hudson.model.Executor.run(Executor.java:429)*15:40:46* Caused by: java.nio.channels.ClosedChannelException*15:40:46* at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)*15:40:46* at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)*15:40:46* at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)*15:40:46* at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)*15:40:46* at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)*15:40:46* at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*15:40:46* at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*15:40:46* at java.lang.Thread.run(Thread.java:748) |
Issue Type | Original: New Feature [ 2 ] | New: Bug [ 1 ] |
Priority | Original: Major [ 3 ] | New: Blocker [ 1 ] |
Summary | Original: Amazon slave idle timeout, where to find it and how to increase it? | New: EC2 Plugin stops slave when build is running |
Resolution | Original: | New: Not A Defect [ 7 ] |
Status | Original: Open [ 1 ] | New: Closed [ 6 ] |