[JENKINS-67062] Jenkins fails to resume builds during restarts when the Agent is connected with WebSockets - Jenkins Jira

Type: Bug
Resolution: Duplicate
Priority: Critical
Component/s: core, kubernetes-plugin, remoting
Labels:
None
Environment:
Jenkins: 2.303.3 JDK11 (latest LTS to date)
Kubernetes Plugin: 1.30.6 (latest to date)
jenkins/inbound-agent:4.11-1 (latest to date)

Released As:
2.338

No error is shown in the Jenkins controller logs themselves, but after the controller restarts the agent fails to reconnect with:

❯ kubectl logs -f default-728vq
Warning: SECRET is defined twice in command-line arguments and the environment variable
Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
Nov 04, 2021 7:45:59 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: default-728vq
Nov 04, 2021 7:45:59 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Nov 04, 2021 7:45:59 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.11
Nov 04, 2021 7:45:59 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Nov 04, 2021 7:45:59 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Nov 04, 2021 7:46:00 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: WebSocket connection open
Nov 04, 2021 7:46:00 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Write side closed
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Read side closed
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Read side closed
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Read side closed
Nov 04, 2021 7:46:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: http://jenkins.default.svc.cluster.local:8080/login is not ready: 503
Nov 04, 2021 7:46:38 PM hudson.remoting.Engine lambda$new$1
SEVERE: Uncaught exception in Engine thread Thread[Thread-0,5,main]
java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
        at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:91)
        at hudson.remoting.EngineListenerSplitter.onReconnect(EngineListenerSplitter.java:54)
        at hudson.remoting.Engine.runWebSocket(Engine.java:687)
        at hudson.remoting.Engine.run(Engine.java:496)
Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
        at java.base/java.net.URLClassLoader.findClass(Unknown Source)
        at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:215)
        at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
        at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
        ... 4 more

The issue can be easily reproduced in any environment:

$ kind create cluster

$ helm repo add jenkins https://charts.jenkins.io

$ helm repo update

$ helm upgrade jenkins jenkins/jenkins --install --wait --debug -f- <<'EOF'
controller:
  adminPassword: admin
  agentListenerEnabled: false
  # specifying plugins without version makes sure to use the latest
  installPlugins:
    - kubernetes
    - workflow-aggregator
    - git
    - configuration-as-code
    - job-dsl
    - saferestart
  JCasC:
    configScripts:
      my-jobs: |
        jobs:
          - script: |
              pipelineJob('testjob') {
                definition {
                  cps {
                    script("""\
                      pipeline {
                        agent any
                        stages {
                          stage ('test') {
                            steps {
                              sleep 1000
                            }
                          }
                        }
                      }""".stripIndent())
                    sandbox()
                  }
                }
              }
agent:
  websocket: true
  tag: 4.11-1
EOF

$ echo http://127.0.0.1:8080 && kubectl --namespace default port-forward svc/jenkins 8080:8080

Then:

1. Go to the Jenkins UI at http://127.0.0.1:8080
2. Log in with "admin" as both user name and password
3. Trigger a build of "testjob"
4. Wait for a pod to be assigned to the build and for the sleep step to start running
5. Start following the pod logs with kubectl logs -f <name-of-pod>
6. Go to the Jenkins home page, click Restart Safely, and confirm
7. Watch the pod logs: the agent fails with the stack trace shown above, and the build fails as well.
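
To confirm a reproduction without reading the whole log by eye, the agent output can be saved to a file and searched for the failure signature from the stack trace above. The following is only a minimal sketch: the function name has_restarter_failure and the file name agent.log are hypothetical and not part of Jenkins or kubectl.

```shell
#!/bin/sh
# Hypothetical helper (not part of Jenkins or kubectl): succeeds when the
# given log file contains the reconnect failure described in this report.
has_restarter_failure() {
    log_file="$1"
    # -q: print nothing, only set the exit status
    # -F: treat the pattern as a fixed string rather than a regex
    grep -qF 'java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller' "$log_file"
}
```

For example, after saving the output with kubectl logs <name-of-pod> > agent.log, running has_restarter_failure agent.log && echo "reproduced" prints "reproduced" only if the error is present in the saved log.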

Duplicates: JENKINS-66446 WebSocket agent does not reconnect: ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller (Closed)
