Type: Bug
Resolution: Fixed
Priority: Major
Labels: None
Environment:
Jenkins v2.89.2
Kubernetes Plugin v1.3.3
Remoting 3.28
While provisioning slaves from a private Kubernetes instance, we've found that a lot of slaves terminate with the following (or similar) stack trace on the slave's side:
INFO: Setting up slave: kube1-medium-r9zf4
Apr 10, 2018 11:02:05 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Apr 10, 2018 11:02:05 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/<user>/workDir/remoting as a remoting work directory
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server ...
Apr 10, 2018 11:02:06 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, CLI2-connect, JNLP-connect, Ping, CLI-connect, JNLP2-connect]
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
<...>
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to <Jenkins Master>
Apr 10, 2018 11:02:06 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Apr 10, 2018 11:02:07 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: <...>
Apr 10, 2018 11:02:07 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Apr 10, 2018 11:02:14 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Apr 10, 2018 11:02:14 AM hudson.remoting.UserRequest perform
WARNING: LinkageError while performing UserRequest:jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2@3e708317
java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller$2$1
    at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call(JnlpSlaveRestarterInstaller.java:71)
    at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call(JnlpSlaveRestarterInstaller.java:53)
    at hudson.remoting.UserRequest.perform(UserRequest.java:207)
    at hudson.remoting.UserRequest.perform(UserRequest.java:53)
    at hudson.remoting.Request$2.run(Request.java:358)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at hudson.remoting.Engine$1$1.run(Engine.java:98)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1
    at java.net.URLClassLoader.findClass(Unknown Source)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:159)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    ... 11 more
The class reported as missing is not consistently the same. I've seen `FilePathFilter`, `LaunchConfiguration`, `StringBuilderWriter`, and some others reported as well. Sometimes there are also exceptions related to `JarCacheSupport` not being able to resolve jars (I don't have the exact stack trace at hand; I will post it if I find it again).
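Since the missing class varies from run to run, it may help to tally which classes show up across collected agent logs. The following is a minimal sketch of such a triage helper, not part of Jenkins or Remoting: the function name `tally_missing_classes` and the sample text are hypothetical, and it assumes the logs contain the standard `java.lang.ClassNotFoundException: <class>` line seen in the trace above.

```python
import re
from collections import Counter

# Matches the class name that follows a ClassNotFoundException message.
# Binary class names may contain dots and '$' (nested/anonymous classes).
_CNFE = re.compile(r"java\.lang\.ClassNotFoundException: ([\w.$]+)")


def tally_missing_classes(log_text):
    """Count how often each class name appears in ClassNotFoundException lines.

    Hypothetical helper for triage; log_text is the raw text of one or more
    agent logs concatenated together.
    """
    return Counter(_CNFE.findall(log_text))


# Sample input taken from the stack trace in this report.
sample = (
    "Caused by: java.lang.ClassNotFoundException: "
    "jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1\n"
    "    at java.net.URLClassLoader.findClass(Unknown Source)\n"
)

counts = tally_missing_classes(sample)
for name, n in counts.most_common():
    print(n, name)
```

Running this over the logs of all failed agents would show whether a handful of classes dominate or whether the failures are spread evenly, which is a hint about whether the problem is tied to specific jars or to the channel in general.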
On the master's side, these failures generally manifest as `ChannelClosedException`s, or as pipeline branches that fail without any exception being reported.
ERROR: Issue with creating launcher for agent kube1-medium-r9zf4. The agent has not been fully initialized yet
ERROR: Issue with creating launcher for agent kube1-medium-r9zf4. The agent has not been fully initialized yet
remote file operation failed: /home/<user>/workspace/<job_name> at hudson.remoting.Channel@6639429c:JNLP4-connect connection from <some-host-name>/<ip-address>:60326: hudson.remoting.ChannelClosedException: Remote call on JNLP4-connect connection from <some-host-name>/<ip-address>:60326 failed. The channel is closing down or has closed down
I haven't been able to reproduce the error consistently, but it occurs often enough to cause major pain to users, especially since we make extensive use of pipelines with a large number of parallel nodes, and a failure in any one node causes the entire pipeline to fail.
- is related to: JENKINS-52283 Jenkins Slaves Not Communicated w/ Master After restart (Closed)