• Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component: remoting

      Hello folks,

      Recently I found that the JNLP agent is not very good at reconnecting to a known Jenkins master. If the port is not available, it exits with an exception instead of retrying again and again, even though it potentially holds the execution context (as far as I understand, that matters for plugins like git). A screenshot of the error is attached.

      So my question is: why doesn't the Jenkins agent retry forever if this is not its first connection? The Jenkins master could restart, or the network could have routing errors (when packets are merely dropped, it reconnects just fine). Why does the agent die and lose its state? Or maybe I'm missing something important, and the Jenkins agent will always pick up the state after a restart?
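      To illustrate the behavior I would expect, here is a minimal sketch of a "retry forever with capped backoff" loop. It is not the actual remoting code: `ReconnectLoop`, `connectWithRetry` and `backoff` are hypothetical names, and the connect attempt is modeled as a plain `Callable` that throws `IOException` while the port is unavailable.

```java
import java.io.IOException;
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical sketch, not the Jenkins remoting implementation. */
final class ReconnectLoop {

    /** Delay before the next attempt: initial * 2^(attempt - 1), capped at max. */
    static Duration backoff(int attempt, Duration initial, Duration max) {
        long millis = initial.toMillis();
        for (int i = 1; i < attempt && millis < max.toMillis(); i++) {
            millis *= 2;
        }
        return Duration.ofMillis(Math.min(millis, max.toMillis()));
    }

    /**
     * Calls connect until it succeeds. A maxAttempts value <= 0 means
     * "retry forever"; otherwise the last IOException is rethrown once
     * maxAttempts attempts have failed.
     */
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts,
                                  Duration initial, Duration max) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return connect.call();
            } catch (IOException e) {
                if (maxAttempts > 0 && attempt >= maxAttempts) {
                    throw e;
                }
                // Wait before the next attempt so the master is not hammered.
                Thread.sleep(backoff(attempt, initial, max).toMillis());
            }
        }
    }
}
```

      With `maxAttempts` set to 0 the loop never gives up, which is what I would expect from an agent that has already been connected once and may still hold build state.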

      Additionally, I tested a simple pipeline with the git plugin: it gives the agent no chance to reconnect and fails the build. So if the connectivity between the Jenkins master and agent is unstable or overloaded during a git checkout, is failing without continuing the only possible outcome? How could that work in the real world, where network connections are so unreliable?

      Thank you

          [JENKINS-73209] Jenkins agent reliable connection

          Sergei added a comment -

          In short, the issue is: how can the JNLP connection be made more reliable, so that plugins do not fail immediately but instead wait for the agent to reconnect, indefinitely or until the pipeline timeout or similar?


          Sergei added a comment -

          Checked the remoting side of it, where the error comes from:

          hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@4f08d9c2:JNLP4-connect connection from <agent_ip>/<agent_ip>:49985": Remote call on JNLP4-connect connection from <agent_ip>/<agent_ip>:49985 failed. The channel is closing down or has closed down
          

          It seems the logic returns as soon as it sees that the channel is gone. Is there a way to wait for a good channel (once the agent has reconnected) and continue? I'm quite sure that will not be simple, so what are the options to make it reliable?
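          To make the idea of "waiting for a good channel" concrete, here is a rough sketch. It does not use the real hudson.remoting API: `ChannelRetry`, `RemoteChannel` and `callWithReconnect` are hypothetical stand-ins, a closed channel is modeled as a plain `IOException`, and the caller is assumed to supply a way to look up the agent's current channel.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch, not the hudson.remoting API. */
final class ChannelRetry {

    /** Stand-in for a remote channel; the real hudson.remoting.Channel differs. */
    interface RemoteChannel {
        String call(String command) throws IOException;
    }

    /**
     * Runs the command on the current channel. If the call fails with an
     * IOException, polls currentChannel until a different channel appears
     * (the agent has reconnected) and retries there. The last failure is
     * rethrown once timeoutMillis has elapsed.
     */
    static String callWithReconnect(Supplier<RemoteChannel> currentChannel, String command,
                                    long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        IOException lastFailure = new IOException("no channel available");
        RemoteChannel failed = null; // channel that already failed; skipped until replaced
        while (true) {
            RemoteChannel channel = currentChannel.get();
            if (channel != null && channel != failed) {
                try {
                    return channel.call(command);
                } catch (IOException e) {
                    lastFailure = e;
                    failed = channel;
                }
            }
            if (System.nanoTime() - deadline >= 0) {
                throw lastFailure;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

          The obvious catch is that blindly repeating a call is only safe when the command is idempotent. Something like a half-finished git checkout would need real state tracking on the agent side, which is probably why this is not simple.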


          Sergei added a comment -

          I checked hudson.remoting.Channel and how it's used. Yes, it seems this has little to do with JNLP itself. What options do we have?

          • WebSockets or Kafka? Are they stable enough? Would they give us any additional reliability?
          • Some third-party proxy/VPN system? We probably don't even need encryption or auth, just a reliability layer. For example, we could set up a server on the Jenkins node, run its client on the Jenkins agent node, and use the client channel of this proxy/VPN to make the connection more reliable?
          • Tune the agent/master kernel settings to increase TCP timeouts and retries? I'm worried that this will not be as easy on macOS/Windows as it is on Linux.
          • Maybe something else?


            jthompson Jeff Thompson
            sparshev2 Sergei