
Nodes (Slaves) are vanishing (disappearing) and reappearing on their own

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • Component: core
    • Environment: Jenkins 2.121.2 (Runs on Windows Server 2012R2)

      Hello,

      We have started to see a major issue where Nodes (Slaves) are vanishing (disappearing) and reappearing on their own.

       

      The strange thing is that the issue occurs only on nodes that run parallel pipelines.

      During execution of a parallel job spanning multiple pipelines, we see Nodes (Slaves) vanishing (disappearing).

       

      On the Jenkins side, when we try to access the node, we receive:

      HTTP ERROR 404 Problem accessing /computer/Computer-2021. Reason: Not Found

       

      On the client side, the agent thinks it is still connected, but when we disconnect the agent and try to connect again, we receive:

      Failing to obtain http://jenkins:8080/computer/Computer-8021/slave-agent.jnlp
      java.io.IOException: Failed to load http://jenkins:8080/computer/Computer-8021/slave-agent.jnlp: 404 Not Found
          at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:496)
          at hudson.remoting.Launcher.run(Launcher.java:322)
          at hudson.remoting.Launcher.main(Launcher.java:283)
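For anyone trying to narrow this down, the two 404 symptoms above can be checked in bulk with a small script. This is only a sketch under assumptions: the base URL and node names are whatever applies to your instance, `probe_status` and `find_vanished` are hypothetical helper names, and an instance with security enabled may answer 403 instead of 404 for anonymous requests.

```python
import urllib.error
import urllib.parse
import urllib.request


def probe_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except urllib.error.URLError:
        return None


def find_vanished(base_url, node_names, probe=probe_status):
    """Return the node names whose /computer/<name>/ page answers 404.

    `probe` is injectable so the logic can be exercised without a live server.
    """
    vanished = []
    for name in node_names:
        url = "{}/computer/{}/".format(base_url.rstrip("/"), urllib.parse.quote(name))
        if probe(url) == 404:
            vanished.append(name)
    return vanished
```

Running `find_vanished("http://jenkins:8080", [...])` periodically while the parallel job is active would show when each node drops out of the master's view.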

       

      Our observations:

      1) Some time after the parallel pipeline job has ended (about 1~24 hours), all of the missing nodes reappear.

      2) Reloading the configuration from disk brings all the vanished slaves back instantly.
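Observation 2 suggests the node definitions are still present on disk while the running master has lost them. A rough way to confirm that is to compare the two views. The sketch below assumes that each agent is stored as `nodes/<name>/config.xml` under the Jenkins home directory and that the JSON returned by `/computer/api/json` contains a `computer` list whose entries carry a `displayName`; the function names are hypothetical.

```python
import json
import os


def nodes_on_disk(jenkins_home):
    """Return node names that have a nodes/<name>/config.xml under jenkins_home."""
    nodes_dir = os.path.join(jenkins_home, "nodes")
    if not os.path.isdir(nodes_dir):
        return set()
    return {
        entry
        for entry in os.listdir(nodes_dir)
        if os.path.isfile(os.path.join(nodes_dir, entry, "config.xml"))
    }


def nodes_in_memory(computer_api_json):
    """Extract node names from the text of a /computer/api/json response."""
    data = json.loads(computer_api_json)
    return {computer["displayName"] for computer in data.get("computer", [])}


def missing_from_memory(disk, memory):
    """Nodes defined on disk that the running master does not currently list."""
    return sorted(disk - memory)
```

If `missing_from_memory` returns the vanished agents while the job is running and an empty list after a reload from disk, that would support the idea that only the in-memory node list is affected.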

       

       

      More information:

      Jenkins server version: 2.212.2 (Runs on Windows Server 2012R2)

      All Slaves are JNLP agents running on Windows 7.

       

      Please help us...

          [JENKINS-62316] Nodes (Slaves) are vanishing (disappearing) and reappearing on their own

          Oleg Nenashev added a comment -

          Likely it is related to the Slave=>agent renaming cleanup in recent Core versions; I have not checked the code.


          Yaniv Koval added a comment - - edited

          oleg_nenashev,

          Please note that our Jenkins instance is an old version, 2.121.2.

          What do you mean by Remoting is 20+ versions behind?

          This is what the Jenkins master server gives me.

          How do I update it?

          What are your suggestions?

           

           


          Daniel Beck added a comment -

          > 2.212.2.

          Do you mean 2.121.2?


          Yaniv Koval added a comment -

          Hi danielbeck,

          Yes, 2.121.2.

          Sorry for the typo.

           


          Oleg Nenashev added a comment -

          ykovalx076982 I suggest you update the Jenkins master and agent (remoting.jar) to recent versions. The versions you use are 2 years old, and they lack many stability and diagnosability improvements.


          Daniel Beck added a comment -

          Needs to happen on a recent release of Jenkins for anyone to care.


          Yaniv Koval added a comment -

          oleg_nenashev,

          Our Jenkins is very sensitive to changes at the moment because of the custom plugins and flows we have built.

          Any upgrade will require testing and validation.

           

          Do you think it is possible that updating the remoting.jar will solve the issue we are facing?

          If so, can you please elaborate on how I can update the remoting.jar and where to get it?

           


          Oleg Nenashev added a comment -

          ykovalx076982 I have created https://github.com/jenkinsci/remoting/pull/380 with links to the archive downloads

          > Do you think it is possible that updating the remoting.jar will solve the issue we are facing?

          Maybe. Such a configuration is not tested, and I cannot guarantee its stability. Please consult the Remoting changelog.


          Yaniv Koval added a comment -

          Hello,

          After updating Remoting to the latest version, 4.3, the same thing happened again.

          Please advise ...


          Daniel Beck added a comment -

          This bug report isn't useful, as you're ~115 weekly releases or ~25 LTS releases behind. So this is now just asking for help, which the issue tracker is the wrong environment for. Ask in chat or on the Jenkins users mailing list.


            Assignee: Unassigned
            Reporter: Yaniv Koval (ykovalx076982)
            Votes: 0
            Watchers: 3
