[JENKINS-64598] Jenkins agent disconnects on k8s with SIGHUP / ClosedChannelException - Jenkins Jira

Type: Bug
Resolution: Not A Defect
Priority: Major
Component/s: core, kubernetes-plugin, remoting
Labels:
- websocket
Environment:
jenkins instance:
jenkins core 2.263.1
CentOS Linux 7 (Core)
kubernetes plugin 1.28.4

jenkins agent remoting VERSION=4.6
-websocket flag passed to jenkins agent

Similar Issues:
Powered by SuggestiMate

Show

I get intermittent agent disconnects while build is running. I'll try to provide as much info, let me know what else I can check.

Jenkins master java version 11 (java-11-openjdk-11.0.5.10) started with hudson.slaves.ChannelPinger.pingIntervalSeconds 30 in order to avoid disconnects
Nginx reverse proxy in use and ssl timeout is 5 minutes, which was too close to the default hudson.slaves.ChannelPinger.pingIntervalSeconds, so was reduced to 30 seconds with good results, and reduced the number of disconnects per day (stack trace was different and did not show a SIGHUP)
jenkins masters are on premise
jenkins agents are in GKE GCP kubernetes version 1.16.5
jenkins agent container image has default java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-8u232-b09-1~deb9u1-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
remoting VERSION=4.6
-websocket flag passed to jenkins agent via the k8s plugin extra cli command, I noticed afterwards there is a checkbox for websocket (in the kubernetes plugin config), but couldn't find docs to go with it, should I switch to using that?
In terms of sizing, we peak to about 400 jenkins-agents / pods connected at a time, the limit is set to 500 in the jenkins kubernetes plugin configuration
The issue happens even when load is low

The connection is established fine, but intermittently gets disconnected. Let me know what else I can look at.

Stack trace:

SignalException: SIGHUP
FATAL: command execution failed
java.nio.channels.ClosedChannelException
	at jenkins.agents.WebSocketAgents$Session.closed(WebSocketAgents.java:141)
	at jenkins.websocket.WebSocketSession.onWebSocketSomething(WebSocketSession.java:91)
	at com.sun.proxy.$Proxy105.onWebSocketClose(Unknown Source)
	at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onClose(JettyListenerEventDriver.java:149)
	at org.eclipse.jetty.websocket.common.WebSocketSession.callApplicationOnClose(WebSocketSession.java:394)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.disconnect(AbstractWebSocketConnection.java:316)
	at org.eclipse.jetty.websocket.common.io.DisconnectCallback.succeeded(DisconnectCallback.java:42)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection$CallbackBridge.writeSuccess(AbstractWebSocketConnection.java:86)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.notifyCallbackSuccess(FrameFlusher.java:359)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.succeedEntries(FrameFlusher.java:288)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.succeeded(FrameFlusher.java:280)
	at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:293)
	at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.flush(FrameFlusher.java:264)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.process(FrameFlusher.java:193)
	at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
	at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:581)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.close(AbstractWebSocketConnection.java:181)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:510)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:440)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
	at java.base/java.lang.Thread.run(Thread.java:834)

relates to

JENKINS-62576 Websockets connection unstable since remoting 4.2.1 (LTS 2.222.4)

Open

Details

Description

Attachments

Issue Links

Activity

People

Dates