-
Bug
-
Resolution: Fixed
-
Critical
-
Jenkins: 2.303.1
jenkins k8s controller (master) pod version: jenkins/jenkins:2.289.3-jdk11
jenkins k8s agent (slave) pod version: jenkins/inbound-agent:4.9-1-jdk11
Kubernetes plugin: 1.30.1
-
-
Jenkins 2.338, Remoting 4.13, 2.332.2
I'm running the Jenkins controller (master) from the Jenkins Helm chart with a persistent EFS volume, together with the Kubernetes plugin, which launches inbound-agent (jdk11) pods as agents (slaves) over a WebSocket connection.
It seems that the agent (slave) pods are unable to reconnect to the controller (master) after the controller (master) is restarted.
It seems that the agent process exits with exit code 0 for some reason instead of reconnecting.
jenkins controller (master) log
2021-08-24 13:26:33.976+0000 [id=64] INFO o.c.j.p.k.KubernetesLauncher#launch: Agent has already been launched, activating: jenkins-agent-3q3j5
jenkins agent (slave) log:
Aug 24, 2021 1:23:56 PM hudson.remoting.Engine lambda$new$1
SEVERE: Uncaught exception in Engine thread Thread[Thread-0,5,main]
java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
    at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:91)
    at hudson.remoting.EngineListenerSplitter.onReconnect(EngineListenerSplitter.java:54)
    at hudson.remoting.Engine.runWebSocket(Engine.java:687)
    at hudson.remoting.Engine.run(Engine.java:496)
Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
    at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:215)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    ... 4 more
jenkins agent (slave) pod status:
NAME                  READY   STATUS     RESTARTS   AGE
jenkins-agent-3q3j5   1/2     NotReady   0          15m

lastState: {}
name: jnlp
ready: false
restartCount: 0
started: false
state:
  terminated:
    containerID: docker://9b490eafd5078fa95cbd915bed16bcee767c1005a7a96158a2017d1551fba87b
    exitCode: 0
    finishedAt: "2021-09-08T15:00:07Z"
    reason: Completed
    startedAt: "2021-09-08T14:53:19Z"
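The symptom in the pod status above (the jnlp container terminated with exitCode 0 and restartCount 0) can be checked programmatically. The following is a minimal sketch, assuming the container statuses have already been loaded into plain Python dictionaries, for example from `kubectl get pod -o json`. The function name `exited_cleanly` is hypothetical and not part of any Jenkins or Kubernetes API.

```python
from typing import Any, Dict, List


def exited_cleanly(container_statuses: List[Dict[str, Any]], name: str = "jnlp") -> bool:
    """Return True if the named container is terminated with exit code 0.

    Hypothetical helper for illustration only; it inspects the same fields
    that appear in the pod status quoted in this report.
    """
    for status in container_statuses:
        if status.get("name") != name:
            continue
        terminated = status.get("state", {}).get("terminated")
        return terminated is not None and terminated.get("exitCode") == 0
    # No container with that name was found.
    return False


# Container status copied from the report (containerID omitted for brevity).
reported = [{
    "name": "jnlp",
    "ready": False,
    "restartCount": 0,
    "started": False,
    "lastState": {},
    "state": {
        "terminated": {
            "exitCode": 0,
            "reason": "Completed",
            "startedAt": "2021-09-08T14:53:19Z",
            "finishedAt": "2021-09-08T15:00:07Z",
        }
    },
}]

if __name__ == "__main__":
    print(exited_cleanly(reported))
```

A clean exit (code 0, reason Completed) is what makes this failure easy to miss: the container does not look crashed, yet the agent never reconnects.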
inbound-agent conf
- name: JENKINS_SECRET
  value: 271983729187392173921798379281739821793b
- name: JENKINS_AGENT_NAME
  value: jenkins-agent-3q3j5
- name: DOCKER_HOST
  value: tcp://localhost:2375
- name: JENKINS_WEB_SOCKET
  value: "true"
- name: JAVA_OPTS
  value: -Xms512m -Xmx1500m
- name: JENKINS_NAME
  value: jenkins-agent-3q3j5
- name: JENKINS_AGENT_WORKDIR
  value: /home/jenkins/agent
- name: JENKINS_URL
  value: http://jenkins:8080/
- name: AWS_DEFAULT_REGION
  value: us-east-1
- name: AWS_REGION
  value: us-east-1
- name: AWS_ROLE_ARN
  value: arn:aws:iam::123456789:role/jenkins-agent-pod
- name: AWS_WEB_IDENTITY_TOKEN_FILE
  value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
jenkins agent (slave) pod process
/opt/java/openjdk/bin/java -Xms512m -Xmx1500m -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -url http://jenkins:8080/ -workDir /home/jenkins/agent -webSocket 271983729187392173921798379281739821793b jenkins-agent-3q3j5
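To make the relationship between the inbound-agent environment variables and the resulting agent process explicit, here is a rough sketch that rebuilds the command line from the environment. This is not the image's actual entrypoint script; the mapping is inferred only from the two listings in this report, the function name `build_agent_command` is hypothetical, and the Java and agent.jar paths are taken from the process line.

```python
import shlex
from typing import Dict, List


def build_agent_command(env: Dict[str, str]) -> List[str]:
    """Rebuild the agent command line from the pod environment.

    Illustrative only: the variable-to-flag mapping is an assumption
    inferred from the configuration and process listing in this report.
    """
    cmd = ["/opt/java/openjdk/bin/java"]
    cmd += shlex.split(env.get("JAVA_OPTS", ""))
    cmd += ["-cp", "/usr/share/jenkins/agent.jar", "hudson.remoting.jnlp.Main", "-headless"]
    cmd += ["-url", env["JENKINS_URL"]]
    cmd += ["-workDir", env["JENKINS_AGENT_WORKDIR"]]
    if env.get("JENKINS_WEB_SOCKET") == "true":
        cmd.append("-webSocket")
    # The secret and the agent name are passed as positional arguments.
    cmd += [env["JENKINS_SECRET"], env["JENKINS_AGENT_NAME"]]
    return cmd


# Values copied from the inbound-agent configuration in this report.
reported_env = {
    "JENKINS_SECRET": "271983729187392173921798379281739821793b",
    "JENKINS_AGENT_NAME": "jenkins-agent-3q3j5",
    "JENKINS_WEB_SOCKET": "true",
    "JAVA_OPTS": "-Xms512m -Xmx1500m",
    "JENKINS_AGENT_WORKDIR": "/home/jenkins/agent",
    "JENKINS_URL": "http://jenkins:8080/",
}

if __name__ == "__main__":
    print(" ".join(build_agent_command(reported_env)))
```

The `-webSocket` flag is the relevant part for this issue: the stack trace shows the failure on the `Engine.runWebSocket` reconnect path.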
The same behavior occurs with:
jenkins/jenkins:2.303.1-jdk11
jenkins/inbound-agent:4.10-2-jdk11
It seems that this issue also happens on Windows agents (slaves).
Might be related: https://issues.jenkins.io/browse/JENKINS-59910
- is duplicated by
-
JENKINS-67062 Jenkins fails to resume builds during restarts when the Agent is connected with WebSockets
- Closed
-
JENKINS-52283 Jenkins Slaves Not Communicated w/ Master After restart
- Closed
-
JENKINS-50458 JNLP agent died while reconnecting to master with java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
- Closed
- relates to
-
JENKINS-19055 In case of connection loss, slave JVM should restart itself if it can
- Resolved