Bug
Resolution: Unresolved
Major
Jenkins host on Ubuntu 20.04, Jenkins 2.335, OpenJDK 64-Bit Server VM (build 11.0.14+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)
Jenkins agent on Windows 10, amd64, Agent and remoting v4.11.2, Eclipse OpenJ9 VM 11.0.13.0 (build openj9-0.29.0, JRE 11 Windows 10 amd64-64-Bit Compressed References 20211022_218 (JIT enabled, AOT enabled)
In our setup we run a Jenkins 2.335 host on Ubuntu 20.04. Jenkins nodes and Windows VMs (VirtualBox, running on the Jenkins host machine) are created dynamically during the build; the Windows VMs run the agent, connect to the host, do their job, and are deleted afterwards.
The Jenkins host is reverse-proxied by nginx, but everything runs on the same machine.
All Jenkins agents connect successfully, but some of them occasionally fail during the build. On the host, this shows up in the build log as:
11:17:06 Cannot contact DnmcNd_NODENAME: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@6e67bc5a:DnmcNd_NODENAME": Remote call on DnmcNd_NODENAME failed. The channel is closing down or has closed down
On the agent (started with
java -jar agent.jar -jnlpUrl https://CIHOST/computer/DnmcNd_NODENAME/jenkins-agent.jnlp -secret NODESECRET
), the log shows:
Mar 14, 2022 3:15:39 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: DnmcNd_NODENAME
Mar 14, 2022 3:15:39 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Mar 14, 2022 3:15:39 AM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.11.2
Mar 14, 2022 3:15:39 AM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: C:\Users\Win10 amd64\.jenkins\cache\jars
Mar 14, 2022 3:15:40 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: WebSocket connection open
Mar 14, 2022 3:15:40 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Mar 14, 2022 3:17:06 AM io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.ClientFilter processError
SEVERE: Connection error has occurred
javax.net.ssl.SSLHandshakeException: Received fatal alert: protocol_version
    at java.base/sun.security.ssl.Alert.createSSLException(Unknown Source)
    at java.base/sun.security.ssl.Alert.createSSLException(Unknown Source)
    at java.base/sun.security.ssl.TransportContext.fatal(Unknown Source)
    at java.base/sun.security.ssl.Alert$AlertConsumer.consume(Unknown Source)
    at java.base/sun.security.ssl.TransportContext.dispatch(Unknown Source)
    at java.base/sun.security.ssl.SSLTransport.decode(Unknown Source)
    at java.base/sun.security.ssl.SSLEngineImpl.decode(Unknown Source)
    at java.base/sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source)
    at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
    at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
    at java.base/javax.net.ssl.SSLEngine.unwrap(Unknown Source)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.SslFilter.handleRead(SslFilter.java:365)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.SslFilter.processRead(SslFilter.java:347)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.onRead(Filter.java:111)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.onRead(Filter.java:113)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter$4.completed(TransportFilter.java:294)
    at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter$4.completed(TransportFilter.java:278)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Unknown Source)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Unknown Source)
    at java.base/sun.nio.ch.WindowsAsynchronousSocketChannelImpl$ReadTask.completed(Unknown Source)
    at java.base/sun.nio.ch.Iocp$EventHandlerTask.run(Unknown Source)
    at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
I'm a bit puzzled that there seems to be a protocol_version problem when the initial connection worked without any issue. I also cannot see any problems or errors in nginx's log files. Everything is on the same machine, so I would not expect packet loss or latency.
This would not bother us much if there were a way to recover. I would expect the agent either to reconnect automatically after some time or to exit, so that I could wrap it in a restart loop at the cmd level.
Instead, it just sits there and blocks the build. If I connect to that VM, kill the agent, and restart it, the build continues just fine.
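As a stopgap, the manual "kill the agent and restart it" step could probably be automated with a small supervisor script on the VM. The following is only a rough sketch, not a tested fix: it assumes that Python is available on the Windows VM, that the agent writes its log to the console, and that the text "Connection error has occurred" (taken from the log above) is a reliable sign of a dead connection. The names FATAL_MARKERS, is_fatal, run_once and supervise are made up for this example.

```python
import subprocess
import sys
import time

# Assumption: this text, copied from the agent log above, signals a dead connection.
FATAL_MARKERS = ("Connection error has occurred",)


def is_fatal(line):
    """Return True if a log line contains one of the fatal markers."""
    return any(marker in line for marker in FATAL_MARKERS)


def run_once(cmd):
    """Run the agent once and echo its output.

    Returns True if the process was killed because a fatal marker appeared,
    False if the process ended on its own.
    """
    killed = False
    # stderr is merged into stdout so that both streams are scanned.
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        errors="replace",
    ) as proc:
        for line in proc.stdout:
            sys.stdout.write(line)
            if is_fatal(line):
                proc.kill()
                killed = True
                break
    return killed


def supervise(cmd, delay_seconds=10.0, max_runs=None):
    """Restart the agent whenever it exits or is killed by run_once.

    max_runs=None loops forever; a number limits the runs (useful for testing).
    Returns the number of runs performed.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        run_once(cmd)
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(delay_seconds)
    return runs
```

It would be started with the same command line as above, for example supervise(["java", "-jar", "agent.jar", "-jnlpUrl", "https://CIHOST/computer/DnmcNd_NODENAME/jenkins-agent.jnlp", "-secret", "NODESECRET"]). This only works around the hang; it does not explain the protocol_version alert.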