  2. JENKINS-50458

JNLP agent died while reconnecting to master with java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Minor
    • Component: core

      First, the agent starts correctly and is identified on the master:

      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Locating server among [http://xxxxxxxxxx:8080/]
      mars 22, 2018 5:40:04 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFOS: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Agent discovery successful
        Agent address: xxxxxxxxxx
        Agent port:    9999
        Identity:      xxxxxxxxxx
      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Handshaking
      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Connecting to topvm09.sesame.infotel.com:9999
      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Trying protocol: JNLP4-connect
      mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Remote identity confirmed: xxxxxxxxxx
      mars 22, 2018 5:40:05 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Connected
      mars 22, 2018 5:40:06 PM com.youdevise.hudson.slavestatus.SlaveListener call
      INFOS: Slave-status listener starting
      mars 22, 2018 5:40:06 PM com.youdevise.hudson.slavestatus.SocketHTTPListener waitForConnection
      INFOS: Slave-status listener ready on port 3141
      

      Then the master became unavailable (many OutOfMemory errors) and was restarted.

      In the meantime, the JNLP agent keeps trying to reconnect to the master until the connection succeeds:

      mars 28, 2018 1:49:25 PM hudson.slaves.ChannelPinger$1 onDead
      INFOS: Ping failed. Terminating the channel JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999.
      java.util.concurrent.TimeoutException: Ping started at 1522237525477 hasn't completed by 1522237765505
          at hudson.remoting.PingThread.ping(PingThread.java:134)
          at hudson.remoting.PingThread.run(PingThread.java:90)
      
      [... Repeated multiple times...]
      
      mars 28, 2018 2:26:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Terminated
      mars 28, 2018 2:26:45 PM hudson.util.ProcessTree getKillers
      AVERTISSEMENT: Failed to obtain killers
      hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999 failed. The channel is closing down or has closed down
          at hudson.remoting.Channel.call(Channel.java:945)
          at hudson.util.ProcessTree.getKillers(ProcessTree.java:159)
          at hudson.util.ProcessTree$OSProcess.killByKiller(ProcessTree.java:220)
          at hudson.util.ProcessTree$WindowsOSProcess.killRecursively(ProcessTree.java:436)
          at hudson.util.ProcessTree.killAll(ProcessTree.java:146)
          at hudson.Proc$LocalProc.destroy(Proc.java:384)
          at hudson.Proc$LocalProc.join(Proc.java:357)
          at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:1304)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:927)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:901)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:850)
          at hudson.remoting.UserRequest.perform(UserRequest.java:210)
          at hudson.remoting.UserRequest.perform(UserRequest.java:53)
          at hudson.remoting.Request$2.run(Request.java:364)
          at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at hudson.remoting.Engine$1$1.run(Engine.java:94)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
          at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
          ... 4 more
      
      [... Repeated multiple times...]
      
      mars 28, 2018 2:26:46 PM hudson.remoting.Request$2 run
      AVERTISSEMENT: Failed to send back a reply to the request hudson.remoting.Request$2@34a893f6
      hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@71af25fc:JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999": channel is already closed
          at hudson.remoting.Channel.send(Channel.java:715)
          at hudson.remoting.Request$2.run(Request.java:377)
          at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at hudson.remoting.Engine$1$1.run(Engine.java:94)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
          at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
          ... 4 more
      
      mars 28, 2018 2:27:00 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      
      [... Repeated multiple times...]
      
      mars 28, 2018 2:31:49 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
      mars 28, 2018 2:32:00 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
      mars 28, 2018 2:32:15 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException Read timed out
      mars 28, 2018 2:32:30 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException Read timed out
      mars 28, 2018 2:32:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
      
      

       

      But when the master came back, the agent died with the following stack trace:

      mars 28, 2018 2:32:50 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
      mars 28, 2018 2:33:01 PM hudson.remoting.jnlp.Main$CuiListener error
      GRAVE: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
      java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
          at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:97)
          at hudson.remoting.EngineListenerSplitter.onReconnect(EngineListenerSplitter.java:49)
          at hudson.remoting.Engine.innerRun(Engine.java:662)
          at hudson.remoting.Engine.run(Engine.java:469)
      Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
          at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
          at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:171)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
          ... 4 more
      
      

      Please note that the 2.112 changelog says Remoting was updated to 3.18, and I am using an earlier version of the agent.

      If the agent version mismatch is the root cause, I would expect Jenkins to complain about the outdated agent version.

      PS: I don't know whether this is a "core" component issue.

          [JENKINS-50458] JNLP agent died while reconnecting to master with java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller

          Philipp Garbe added a comment -

          Had a similar issue and updating the agent (to 3.19) fixed it.


          Régis Maura added a comment -

          pgarbe Thank you for the feedback. I have updated the agent but can't test the fix right now.

          However, it would be smart to warn the administrator when an agent runs a lower version than the master's version requires.
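
          The check suggested above could look roughly like the following. This is only a minimal sketch, not how Jenkins implements anything: the agent inventory, the function names `parse_version` and `outdated_agents`, and the choice of 3.18 as the minimum are all assumptions made for illustration (the 2.112 changelog only says Remoting was updated to 3.18, not that older agents are rejected).

          ```python
          import re


          def parse_version(text):
              """Turn a dotted version such as '3.19' into a tuple of ints: (3, 19)."""
              return tuple(int(part) for part in re.findall(r"\d+", text))


          def outdated_agents(agent_versions, minimum):
              """Return the sorted names of agents whose version is lower than `minimum`."""
              required = parse_version(minimum)
              return sorted(
                  name
                  for name, version in agent_versions.items()
                  if parse_version(version) < required
              )


          # Hypothetical inventory: agent name -> Remoting version it reports.
          agents = {"agent-a": "3.17", "agent-b": "3.19", "agent-c": "3.18"}

          # Assumed minimum, taken from the Remoting version named in the 2.112 changelog.
          for name in outdated_agents(agents, "3.18"):
              print(f"WARNING: {name} runs Remoting {agents[name]}, older than 3.18")
          ```

          Comparing tuples of integers rather than raw strings matters here: as strings, "3.9" would sort after "3.18", while as tuples (3, 9) is correctly treated as the older version.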


          Jon Tancer added a comment - - edited

          I see the same error as the original poster, although I am using agent version 3.19.  Jenkins slaves connected via JNLP agent are unable to reconnect to Jenkins after the Jenkins web app reboots.  Rebooting, then reconnecting the slaves fixes the error... temporarily.  The error is logged below.

          My build slaves have a boot-up script which always pulls down the latest agent file before establishing the connection to Jenkins.

          For this reason, I should never have an issue relating to a version mismatch, because all the slaves reboot once daily.

          Apr 25, 2018 11:54:01 AM hudson.remoting.jnlp.Main$CuiListener error
          SEVERE: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
          java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
              at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:97)
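
          For readers unfamiliar with the kind of boot-up script described in this comment, here is a rough sketch of the idea. It is not the commenter's actual script. It assumes the master serves the agent JAR at `/jnlpJars/agent.jar` and the agent's JNLP file at `/computer/<name>/slave-agent.jnlp` (both paths can differ between Jenkins versions), and the helper names, the host, and the secret are hypothetical.

          ```python
          import subprocess
          import urllib.parse
          import urllib.request


          def agent_jar_url(jenkins_url):
              """Build the download URL for the agent JAR (path is an assumption)."""
              return jenkins_url.rstrip("/") + "/jnlpJars/agent.jar"


          def launch_command(jar_path, jenkins_url, agent_name, secret):
              """Build the java command line that starts the agent with -jnlpUrl and -secret."""
              name = urllib.parse.quote(agent_name)  # agent names may contain spaces
              jnlp_url = f"{jenkins_url.rstrip('/')}/computer/{name}/slave-agent.jnlp"
              return ["java", "-jar", jar_path, "-jnlpUrl", jnlp_url, "-secret", secret]


          def refresh_and_launch(jenkins_url, agent_name, secret, jar_path="agent.jar"):
              """Download a fresh agent JAR from the master, then run the agent until it exits."""
              urllib.request.urlretrieve(agent_jar_url(jenkins_url), jar_path)
              return subprocess.call(launch_command(jar_path, jenkins_url, agent_name, secret))
          ```

          A boot script would then call something like `refresh_and_launch("http://jenkins.example:8080", "build-1", "<secret>")`. Fetching the JAR from the master on every boot keeps the agent's Remoting version in step with the master, which is why a version mismatch seemed an unlikely cause in this case.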

           


          Jon Tancer added a comment -

          I downgraded the JDK on my machine from 10 to 8 and this problem went away.


          Oleg Nenashev added a comment -

          rmaura pgarbe Are you also using Java 10? If so, it is not supported. Jenkins will run reliably only on Java 8.


          Régis Maura added a comment -

          oleg_nenashev We are using Java 8 for both master and agent.
          Note: I have not tried to reproduce the bug since updating the agent to 3.19.


          Jeff Thompson added a comment -

          rmaura, it looks like this has been working fine for you so we should probably just close it.

          From the provided information, I don't have enough to figure out what is going on. Particularly without any steps to reproduce and with the reported variability.

          I see a couple of other similar reports, JENKINS-50730 and JENKINS-52283, but certainly no indication that it is a widespread problem. There might be some similarities with Cloud or particularly Kubernetes environments.

          In some cases the causes appear to be environment or version related. Getting the correct Remoting, Jenkins, or Java versions seems to have resolved it in some cases. In one case it appears to have been due to memory issues.


          Ashton Treadway added a comment -

          Per oleg_nenashev, closing as resolved with no response from submitter.

          Klaus added a comment - - edited

          More than 3 years on, this is marked as fixed but unreleased.

          I recently hit this error in the Docker agent with JDK 11 (Alpine base) running over WebSocket.
          From time to time a VPN tunnel drops, and the reconnect attempt immediately shows this error.

          Nov 16, 2021 9:42:27 AM hudson.remoting.Engine lambda$new$1
          SEVERE: Uncaught exception in Engine thread Thread[Thread-0,5,main]
          java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
                  at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:92)
                  at hudson.remoting.EngineListenerSplitter.onReconnect(EngineListenerSplitter.java:54)
                  at hudson.remoting.Engine.runWebSocket(Engine.java:687)
                  at hudson.remoting.Engine.run(Engine.java:496)
          Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
                  at java.base/java.net.URLClassLoader.findClass(Unknown Source)
                  at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:215)
                  at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
                  at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
                  ... 4 more
          

          Since the agent project https://github.com/jenkinsci/docker-inbound-agent ships with Java 11 by default, I assumed there would be no compatibility error or class(path) issue.

          Any recommendations?


          Basil Crow added a comment -

          Duplicates JENKINS-66446.


            jthompson Jeff Thompson
            rmaura Régis Maura
            Votes: 2
            Watchers: 12
