JENKINS-68029

WebSocket agent on Windows disconnects while running job with javax.net.ssl.SSLHandshakeException


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: remoting

      In our setup we run a Jenkins 2.335 host on Ubuntu 20.04. Jenkins nodes and Windows VMs (VirtualBox, running on the Jenkins host machine) are created dynamically during the build. The Windows VMs run the agent, connect to the host, do their job, and are deleted afterwards.

      The Jenkins host is reverse-proxied by nginx, but everything runs on the same machine.

      All Jenkins agents connect successfully, but they sometimes fail during the build. The host reports this in the build log as:

      11:17:06  Cannot contact DnmcNd_NODENAME: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@6e67bc5a:DnmcNd_NODENAME": Remote call on DnmcNd_NODENAME failed. The channel is closing down or has closed down

      The agent is started with:

      java -jar agent.jar -jnlpUrl https://CIHOST/computer/DnmcNd_NODENAME/jenkins-agent.jnlp -secret NODESECRET

      Its log shows:

      Mar 14, 2022 3:15:39 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: DnmcNd_NODENAME
      Mar 14, 2022 3:15:39 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Mar 14, 2022 3:15:39 AM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 4.11.2
      Mar 14, 2022 3:15:39 AM hudson.remoting.Engine startEngine
      WARNING: No Working Directory. Using the legacy JAR Cache location: C:\Users\Win10 amd64\.jenkins\cache\jars
      Mar 14, 2022 3:15:40 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: WebSocket connection open
      Mar 14, 2022 3:15:40 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Mar 14, 2022 3:17:06 AM io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.ClientFilter processError
      SEVERE: Connection error has occurred
      javax.net.ssl.SSLHandshakeException: Received fatal alert: protocol_version
              at java.base/sun.security.ssl.Alert.createSSLException(Unknown Source)
              at java.base/sun.security.ssl.Alert.createSSLException(Unknown Source)
              at java.base/sun.security.ssl.TransportContext.fatal(Unknown Source)
              at java.base/sun.security.ssl.Alert$AlertConsumer.consume(Unknown Source)
              at java.base/sun.security.ssl.TransportContext.dispatch(Unknown Source)
              at java.base/sun.security.ssl.SSLTransport.decode(Unknown Source)
              at java.base/sun.security.ssl.SSLEngineImpl.decode(Unknown Source)
              at java.base/sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source)
              at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
              at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
              at java.base/javax.net.ssl.SSLEngine.unwrap(Unknown Source)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.SslFilter.handleRead(SslFilter.java:365)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.SslFilter.processRead(SslFilter.java:347)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.onRead(Filter.java:111)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.onRead(Filter.java:113)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter$4.completed(TransportFilter.java:294)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter$4.completed(TransportFilter.java:278)
              at java.base/sun.nio.ch.Invoker.invokeUnchecked(Unknown Source)
              at java.base/sun.nio.ch.Invoker.invokeUnchecked(Unknown Source)
              at java.base/sun.nio.ch.WindowsAsynchronousSocketChannelImpl$ReadTask.completed(Unknown Source)
              at java.base/sun.nio.ch.Iocp$EventHandlerTask.run(Unknown Source)
              at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(Unknown Source)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
              at java.base/java.lang.Thread.run(Unknown Source)

      I'm a bit puzzled that there seems to be a protocol_version problem when the initial connection worked without any trouble. I also cannot see any problems or errors in nginx's log files. Everything runs on the same machine, so I would not expect packet loss or latency to be a factor.
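      One way to narrow down a protocol_version alert might be to check which TLS versions the nginx front end accepts. The following is only a sketch, assuming Python 3 is available on the VM; `pinned_context` and `probe` are hypothetical helper names, and CIHOST is the same placeholder as in the agent command. Since the alert appeared mid-session rather than on connect, a clean result here would not rule out the problem.

```python
import socket
import ssl


def pinned_context(version):
    """Build a client context that offers exactly one TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def probe(host, port=443,
          versions=(ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3),
          timeout=5.0):
    """Try one handshake per TLS version and record the outcome of each."""
    results = {}
    for version in versions:
        try:
            with socket.create_connection((host, port), timeout=timeout) as raw:
                ctx = pinned_context(version)
                with ctx.wrap_socket(raw, server_hostname=host) as tls:
                    # tls.version() reports what was actually negotiated.
                    results[version.name] = tls.version()
        except OSError as exc:
            # ssl.SSLError is a subclass of OSError, so handshake failures,
            # certificate errors, refused connections and timeouts land here.
            results[version.name] = "failed: {}".format(exc)
    return results


# Example (CIHOST is a placeholder):
# print(probe("CIHOST"))
```

      A "failed" entry for one version only would suggest that the proxy and the agent's JVM disagree on the allowed protocol range. A certificate error would show up as a failure for every version and would point elsewhere.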

      This would not bother us much if there were a way to recover. I would expect the agent either to reconnect automatically after some time or to quit, so that I could wrap it in a restart loop at the cmd level.

      However, instead of that it just sits there blocking the build. If I connect to that VM, kill the agent and restart it, the build continues just fine.
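      Because the agent neither reconnects nor exits, a plain restart loop is not enough. The wrapper would also have to notice the error and kill the process, which is what the manual workaround does. A minimal sketch of such a watchdog follows, assuming Python 3 is installed on the VM. The names `FATAL_MARKERS`, `is_fatal`, `run_once` and `supervise` are hypothetical; the marker strings are copied from the log above, and treating every such line as fatal is an assumption.

```python
import subprocess
import sys
import time

# Assumption: any log line containing one of these strings means the
# channel is dead and the agent will not recover on its own.
FATAL_MARKERS = (
    "javax.net.ssl.SSLHandshakeException",
    "SEVERE: Connection error has occurred",
)


def is_fatal(line):
    """Return True if a log line matches one of the fatal markers."""
    return any(marker in line for marker in FATAL_MARKERS)


def run_once(cmd):
    """Run the agent once; return True if it was killed after a fatal line."""
    killed = False
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the agent log goes to stderr by default
        text=True,
    ) as proc:
        for line in proc.stdout:
            sys.stdout.write(line)  # pass the agent log through
            if is_fatal(line):
                proc.kill()
                killed = True
                break
    return killed


def supervise(cmd, max_restarts=None, delay_seconds=5.0):
    """Restart the agent whenever it ends; return the number of restarts."""
    restarts = 0
    while True:
        run_once(cmd)
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_seconds)


# Example (placeholders as in the agent command above):
# supervise(["java", "-jar", "agent.jar",
#            "-jnlpUrl", "https://CIHOST/computer/DnmcNd_NODENAME/jenkins-agent.jnlp",
#            "-secret", "NODESECRET"])
```

      With `max_restarts=None` the loop runs until the VM is deleted. The loop also restarts the agent when it exits by itself, which covers the "quit the process" behavior asked for above.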

            Assignee: Unassigned
            Reporter: Jannis Achstetter (jachstetsea)
            Votes: 1
            Watchers: 4
