Type: Bug
Resolution: Fixed
Priority: Major
Component/s: kubernetes-plugin, remoting
Labels:
None
Environment:
Jenkins v2.89.2
Kubernetes Plugin v1.3.3

Released As:
Remoting 3.28

While provisioning slaves from a private Kubernetes instance, we've found that a lot of slaves terminate with the following (or similar) stack trace on the slave's side:

INFO: Setting up slave: kube1-medium-r9zf4
Apr 10, 2018 11:02:05 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Apr 10, 2018 11:02:05 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/<user>/workDir/remoting as a remoting work directory
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server ...
Apr 10, 2018 11:02:06 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, CLI2-connect, JNLP-connect, Ping, CLI-connect, JNLP2-connect]
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful <...>
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to <Jenkins Master>
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Apr 10, 2018 11:02:07 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: <...>
Apr 10, 2018 11:02:07 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Apr 10, 2018 11:02:14 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Apr 10, 2018 11:02:14 AM hudson.remoting.UserRequest perform
WARNING: LinkageError while performing UserRequest:jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2@3e708317
java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller$2$1
        at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call(JnlpSlaveRestarterInstaller.java:71)
        at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call(JnlpSlaveRestarterInstaller.java:53)
        at hudson.remoting.UserRequest.perform(UserRequest.java:207)
        at hudson.remoting.UserRequest.perform(UserRequest.java:53)
        at hudson.remoting.Request$2.run(Request.java:358)
        at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at hudson.remoting.Engine$1$1.run(Engine.java:98)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1
        at java.net.URLClassLoader.findClass(Unknown Source)
        at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:159)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        ... 11 more

The class reported as missing is not consistently the same. I've also seen `FilePathFilter`, `LaunchConfiguration`, `StringBuilderWriter`, and some others. Sometimes there are also exceptions related to `JarCacheSupport` not being able to resolve jars (I don't have the exact stack trace at hand; I'll post it if I find it again).
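
Since the missing class varies from run to run, it may help to tally which classes show up across collected agent logs. The following is only a minimal triage sketch, assuming the agent logs are available as plain text; the function name `tally_missing_classes` is hypothetical and not part of Jenkins or Remoting.

```python
import re
from collections import Counter

# Matches the lines that name the class the JVM could not load, e.g.
#   java.lang.NoClassDefFoundError: jenkins/slaves/restarter/...
#   Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter...
_MISSING = re.compile(
    r"java\.lang\.(?:NoClassDefFoundError|ClassNotFoundException):\s*(\S+)"
)


def tally_missing_classes(log_text: str) -> Counter:
    """Count how often each class is reported missing in an agent log.

    Hypothetical helper for triage only; it just scans text.
    """
    counts = Counter()
    for match in _MISSING.finditer(log_text):
        # NoClassDefFoundError prints slashes, ClassNotFoundException prints
        # dots; normalise so both forms count as the same class.
        counts[match.group(1).replace("/", ".")] += 1
    return counts
```

Run over the stack trace above, this counts `jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1` twice: once for the `NoClassDefFoundError` line and once for its `ClassNotFoundException` cause.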

On the master's side, these failures generally manifest as `ChannelClosedException`s, or as pipeline branches that fail without reporting any exception.

ERROR: Issue with creating launcher for agent kube1-medium-r9zf4. The agent has not been fully initialized yet

ERROR: Issue with creating launcher for agent kube1-medium-r9zf4. The agent has not been fully initialized yet

remote file operation failed: /home/<user>/workspace/<job_name> at hudson.remoting.Channel@6639429c:JNLP4-connect connection from <some-host-name>/<ip-address>:60326: hudson.remoting.ChannelClosedException: Remote call on JNLP4-connect connection from <some-host-name>/<ip-address>:60326 failed. The channel is closing down or has closed down

I haven't been able to reproduce the error consistently, but it occurs often enough to cause major pain for users (especially since we make extensive use of pipelines with a large number of parallel nodes, and a failure in any one node causes the entire pipeline to fail).

Is related to: JENKINS-52283 Jenkins Slaves Not Communicated w/ Master After restart (Closed)
Links to: CloudBees Internal FNDN-235
