[JENKINS-57713] Cloud agent provisioning errors due to Remoting behaviour change

    • Remoting 3.33, jenkins-2.182

      This change (https://github.com/jenkinsci/remoting/pull/193/commits/69f9ebe72c608e14745fb3fc5f8d9e4c65758d89) modifies the behaviour of the agent connection cycle.

      Retrying the connection even when the agent has not yet been created on the master was something a cloud implementation could rely on (one of our proprietary implementations does).

      IMHO this was a breaking change. There should be at least one way to keep the previous behaviour (with a new parameter, or maybe by reusing the existing "noReconnect").
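To make the two behaviours concrete, here is a minimal sketch of a connection loop with an opt-in switch. It is not Remoting code: `ConnectLoopSketch`, `Master`, `Reply` and the `retryUnknownAgent` parameter are all hypothetical names invented for illustration, and the fake master simply starts accepting the agent after a fixed number of handshakes.

```java
// Hypothetical sketch: none of these names exist in Jenkins Remoting.
final class ConnectLoopSketch {

    /** What the master answers when the agent asks to connect. */
    enum Reply { ACCEPTED, UNKNOWN_AGENT, UNREACHABLE }

    /** Stand-in for the real handshake with the master. */
    interface Master {
        Reply handshake(String agentName);
    }

    /**
     * Tries to connect up to maxAttempts times.
     *
     * retryUnknownAgent == true  models the previous behaviour: keep retrying
     *                            while the master does not know the agent yet.
     * retryUnknownAgent == false models the changed behaviour: give up at once.
     *
     * Returns the attempt number that succeeded, or -1 if the loop gave up.
     */
    static int connect(Master master, String agentName,
                       boolean retryUnknownAgent, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Reply reply = master.handshake(agentName);
            if (reply == Reply.ACCEPTED) {
                return attempt;
            }
            if (reply == Reply.UNKNOWN_AGENT && !retryUnknownAgent) {
                return -1; // agent not registered: exit instead of retrying
            }
            // UNREACHABLE, or UNKNOWN_AGENT with retries enabled: try again.
            // A real loop would sleep between attempts.
        }
        return -1;
    }

    /** Fake master that only knows the agent after the given number of handshakes. */
    static Master registersAfter(int handshakes) {
        int[] seen = {0};
        return name -> ++seen[0] > handshakes ? Reply.ACCEPTED : Reply.UNKNOWN_AGENT;
    }

    public static void main(String[] args) {
        // Cloud creates the agent on the master shortly after starting the process.
        System.out.println(connect(registersAfter(2), "cloud-agent-1", true, 5));
        System.out.println(connect(registersAfter(2), "cloud-agent-1", false, 5));
    }
}
```

With retries enabled the agent connects on the third attempt; with them disabled the first "unknown agent" reply ends the loop, which is the provisioning failure described above.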

          Jesse Glick added a comment -

          CC witokondoria

          Javier Delgado added a comment -

          Certainly, trying to connect before the agent is created will cause the remoting attempt to fail and exit. The reconnect flag won't cover the original bug, since it is useful to keep reconnecting while the agent is still registered (but to stop doing so if the agent gets removed, as happened after a master restart).

          Is there some rule or recommendation about the order of "agent endpoint register" and "remoting start"? It seems odd to start a process that will have to retry until an endpoint becomes available.
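The ordering question can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ProvisionOrderSketch`, `Registry` and `connectOnce` are hypothetical names, the registry stands in for the master's list of known agents, and a connection attempt is reduced to a single lookup with no retry.

```java
// Hypothetical model of the two provisioning orders; not Jenkins API.
final class ProvisionOrderSketch {

    /** In-memory stand-in for the master's list of known agents. */
    static final class Registry {
        private final java.util.Set<String> agents = new java.util.HashSet<>();
        void register(String name) { agents.add(name); }
        void remove(String name) { agents.remove(name); }
        boolean knows(String name) { return agents.contains(name); }
    }

    /** A single connection attempt without retry: fails if the agent is unknown. */
    static boolean connectOnce(Registry master, String name) {
        return master.knows(name);
    }

    /** Register the endpoint first, then start remoting. */
    static boolean registerThenLaunch(Registry master, String name) {
        master.register(name);
        return connectOnce(master, name);
    }

    /** Start remoting first, register the endpoint afterwards. */
    static boolean launchThenRegister(Registry master, String name) {
        boolean connected = connectOnce(master, name); // nothing to connect to yet
        master.register(name);
        return connected;
    }

    public static void main(String[] args) {
        System.out.println(registerThenLaunch(new Registry(), "agent-a")); // true
        System.out.println(launchThenRegister(new Registry(), "agent-b")); // false

        // An agent removed from the master (for example after a restart)
        // gets the same "unknown" answer on its next attempt.
        Registry master = new Registry();
        master.register("agent-c");
        master.remove("agent-c");
        System.out.println(connectOnce(master, "agent-c")); // false
    }
}
```

In this model a launcher that registers first never needs the retry, while one that launches first depends on it; an agent that was removed looks the same to the connecting process as one that was never created.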

          Oliver Gondža added a comment -

          The revert of the regression (JENKINS-57759) is landing in the next Weekly and 1.176.1

          Oleg Nenashev added a comment -

          The weekly revert was done in Jenkins 2.180.

          Jeff Thompson added a comment -

          It's not clear to me how to preserve the sequencing behavior described here and also achieve the sequencing change witokondoria desired in JENKINS-46515. Different scenarios or implementations rely on, or are affected by, the sequence differently. There might be some way to add a flag that invokes the new behavior without affecting the existing one, but I'm not sure that could meet the needs. Unfortunately, the sequencing isn't well specified or clarified.

          I'll reference this Jira issue to remove the changes for JENKINS-46515 / jenkinsci/remoting#193 from the Remoting master branch. Once done, I'll release a new Remoting version with the other changes.

          Oliver Gondža added a comment -

          jthompson, I agree that a switch to opt in to or out of the behavior is about the only thing to do, as far as I can see. I do not think it is sane to rely on any assumptions regarding the sequence of actions on master/agent during cloud provisioning.

          Jeff Thompson added a comment -

          Removed the problematic change from Remoting in version 3.31.

            jthompson Jeff Thompson
            amuniz Antonio Muñiz