JENKINS-20272: Disconnected nodes don't need to be checked for response time

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Component: core
    • Environment: 1.536 on OS X

      From the Jenkins log of a non-productive instance I set up to test something:

      Okt 25, 2013 11:26:43 AM hudson.WebAppMain$3 run
      INFO: Jenkins is fully up and running
      Okt 25, 2013 3:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      Okt 25, 2013 4:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      Okt 25, 2013 5:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      Okt 25, 2013 6:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      Okt 25, 2013 7:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      Okt 25, 2013 8:26:36 PM hudson.node_monitors.ResponseTimeMonitor$1 monitor
      WARNING: Making ffoooo offline because it’s not responding
      

      The node in question is a dumb slave configured to connect via JNLP. It was never connected.

      There should be no warnings about responsiveness for nodes that aren't connected.


          peter_schuetze added a comment - edited

          I have lots of nodes that are only used for deployments. They are only brought online when needed and disconnect after 15 minutes of inactivity. It looks very silly to have all of them marked as unresponsive just because they are offline. In addition, I am a fan of "only report what's an issue".

          David Aldrich added a comment -

          If I mark a node as offline, I notice that Jenkins still tries to connect to it, fails, and logs a warning. Is that the same point as stated above? I agree that Jenkins shouldn't try to connect to nodes marked offline; it fills the log with pointless warning messages.

          Daniel Beck added a comment -

          davida2009: Your slave retention strategy is "Keep online as much as possible", and that is what Jenkins does. "Temporarily mark offline" just makes it unavailable for building jobs. See also JENKINS-13140.

          Workaround: Build and install https://github.com/daniel-beck/jenkins-keep-slave-disconnected-plugin and configure the retention strategy provided by it.


          David Aldrich added a comment -

          Hi Daniel,
          Thanks for your reply, I understand now. I would like to try your plugin. Are there instructions for how to build it?

          Could you be persuaded to publish a pre-built version please?

          Best regards
          David


          David Aldrich added a comment -

          Hi Daniel

          Thanks for making the 'Keep Offline Slaves Disconnected Retention Strategy Plugin' available through plugin manager. I have installed the plugin, restarted and set the Availability for nodes I have taken offline to:

          Keep this slave online as much as possible, but don't reconnect if temporarily marked offline by the user

          I no longer see messages in the log of type:

          Attempting to reconnect xxxx

          But I do still see messages of type:

          Making xxxx offline because it’s not responding

          Is that what you would expect?

          Best regards

          David


          Daniel Beck added a comment -

          Hi David,

          that was all Nicolas – thank him.

          Making xxxx offline because it’s not responding

          That's a known issue in the response time monitor which doesn't care that a node is disconnected: JENKINS-20272


          David Aldrich added a comment -

          Well, thank you Nicolas, if you are watching this issue.

          Thanks for your reply Daniel. I guess you don't mean JENKINS-20272 because that is this issue?


          Daniel Beck added a comment -

          Uh, no, given your response I thought this was the reconnecting issue. This issue is still unresolved and it happens for all disconnected nodes, so yes, I'd expect this to happen.


          David Aldrich added a comment -

          Hi Daniel

          It's been a few years since we discussed this. It seems that the ResponseTimeMonitor still warns that it is marking disconnected nodes offline. We often have a number of nodes that are intentionally taken offline, so our system log is crowded with these warning messages. Could they be disabled?

          Best regards

          David

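As a stopgap for the question above, the warnings could in principle be filtered at the logging layer rather than in the monitor itself. The following is a minimal sketch using only `java.util.logging`. It assumes the messages go through a logger named `hudson.node_monitors.ResponseTimeMonitor` (inferred from the class name in the log excerpt, not confirmed in this thread), and the class and method names are hypothetical. How to apply this to a running Jenkins instance is not covered here.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietResponseTimeMonitor {
    // Assumption: the logger is named after the class that appears in the log output.
    static final String LOGGER_NAME = "hudson.node_monitors.ResponseTimeMonitor";

    // Hold a strong reference so the configured logger is not garbage collected.
    static final Logger LOGGER = Logger.getLogger(LOGGER_NAME);

    /** Raises the threshold so WARNING records are dropped while SEVERE still passes. */
    static void silenceWarnings() {
        LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        silenceWarnings();
        System.out.println("WARNING loggable: " + LOGGER.isLoggable(Level.WARNING));
        System.out.println("SEVERE loggable: " + LOGGER.isLoggable(Level.SEVERE));
    }
}
```

Note that this only hides the symptom: it would also suppress the same warning for connected nodes that genuinely stop responding.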

          Andrew Bayer added a comment -

          PR up at https://github.com/jenkinsci/jenkins/pull/2911

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Andrew Bayer
          Path:
          core/src/main/java/hudson/node_monitors/ResponseTimeMonitor.java
          test/src/test/java/hudson/node_monitors/ResponseTimeMonitorTest.java
          http://jenkins-ci.org/commit/jenkins/5f125d110eb9f65ec6bd6030466df846c4c96f34
          Log:
          [FIXED JENKINS-20272] Don't monitor response on offline agents (#2911)

          • [FIXED JENKINS-20272] Don't monitor response on offline agents
          • Updating to only not check if channel is null.
          • Fix broken test.
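The idea named in the commit log above (do not check an agent whose channel is null) can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the code from the commit: `Agent` and `agentsToMonitor` are hypothetical names, and a null `channel` field simply models "no connection established".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipDisconnectedSketch {
    /** Hypothetical stand-in for a Jenkins agent; not the real Jenkins class. */
    public static final class Agent {
        final String name;
        final Object channel; // null models "never connected" or "disconnected"

        public Agent(String name, Object channel) {
            this.name = name;
            this.channel = channel;
        }
    }

    /** Returns the names of the agents whose response time should be measured. */
    public static List<String> agentsToMonitor(List<Agent> agents) {
        List<String> monitored = new ArrayList<>();
        for (Agent agent : agents) {
            if (agent.channel == null) {
                continue; // nothing to ping, so nothing to warn about
            }
            monitored.add(agent.name);
        }
        return monitored;
    }

    public static void main(String[] args) {
        List<Agent> agents = Arrays.asList(
                new Agent("ffoooo", null),            // the never-connected node from the report
                new Agent("builder-1", new Object())  // hypothetical connected node
        );
        System.out.println(agentsToMonitor(agents));
    }
}
```

With this filter in place, a node such as `ffoooo` never reaches the response-time check, so no "not responding" warning would be produced for it.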

          Arnaud Héritier added a comment -

          abayer danielbeck: Should this be marked as resolved? It was merged in 2.72 but, as far as I know, not backported to 2.60 LTS.

          Oliver Gondža added a comment -

          I suspect the fix is insufficient. The test attached there was passing before the fix, I still see the machine disconnected in an improved test [1] and, most important of all, I see provisioned computers in the openstack-cloud plugin turned offline before the launcher kicks in.

          Hence I am reopening and self-assigning this.

          [1] https://github.com/olivergondza/jenkins/commit/a19980b577129d707106b6650305dd49aac5ca8a

          Oliver Gondža added a comment - Fix proposed: https://github.com/jenkinsci/jenkins/pull/3453

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oliver Gondža
          Path:
          core/src/main/java/hudson/node_monitors/AbstractAsyncNodeMonitorDescriptor.java
          core/src/main/java/hudson/node_monitors/ResponseTimeMonitor.java
          test/src/test/java/hudson/node_monitors/ResponseTimeMonitorTest.java
          http://jenkins-ci.org/commit/jenkins/8872b44cc8823c2106dfc5ba9a344fd1d82e4823
          Log:
          JENKINS-20272 - Disconnected nodes should not be disconnected repeatedly (#3453)

          • JENKINS-20272 Reproduce in test
          • [FIX JENKINS-20272] Do not disconnect skipped computers
          • JENKINS-20272 Clean the API a bit

          NOTE: This service has been marked for deprecation: https://developer.github.com/changes/2018-04-25-github-services-deprecation/

          Functionality will be removed from GitHub.com on January 31st, 2019.
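One plausible reading of "Do not disconnect skipped computers" is that the monitor needs to tell apart an agent that was skipped (never checked) from one that was checked and timed out, because both would otherwise look like "no data". The sketch below models that distinction in isolation. The `Outcome` enum and `agentsToTakeOffline` method are hypothetical names for illustration and are not taken from the Jenkins sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SkippedVsUnresponsiveSketch {
    /** Hypothetical per-agent monitoring outcome. */
    public enum Outcome { RESPONDED, TIMED_OUT, SKIPPED }

    /** Only agents that were actually checked and timed out are taken offline. */
    public static Set<String> agentsToTakeOffline(Map<String, Outcome> outcomes) {
        Set<String> offline = new TreeSet<>();
        for (Map.Entry<String, Outcome> entry : outcomes.entrySet()) {
            if (entry.getValue() == Outcome.TIMED_OUT) {
                offline.add(entry.getKey());
            }
            // SKIPPED agents are left untouched: they were never asked to respond.
        }
        return offline;
    }

    public static void main(String[] args) {
        Map<String, Outcome> outcomes = new LinkedHashMap<>();
        outcomes.put("healthy-agent", Outcome.RESPONDED);
        outcomes.put("slow-agent", Outcome.TIMED_OUT);
        outcomes.put("ffoooo", Outcome.SKIPPED); // disconnected node from the report
        System.out.println(agentsToTakeOffline(outcomes));
    }
}
```

Keeping `SKIPPED` as its own outcome means a freshly provisioned computer that has not launched yet is not mistaken for an unresponsive one.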

          Oleg Nenashev added a comment -

          Extra fix has been applied in 2.126


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oliver Gondža
          Path:
          core/src/main/java/hudson/node_monitors/AbstractAsyncNodeMonitorDescriptor.java
          core/src/main/java/hudson/node_monitors/ResponseTimeMonitor.java
          test/src/test/java/hudson/node_monitors/ResponseTimeMonitorTest.java
          http://jenkins-ci.org/commit/jenkins/527141e5e8dafe09e4d1c68f10f582243c53c456
          Log:
          JENKINS-20272 - Disconnected nodes should not be disconnected repeatedly (#3453)

          • JENKINS-20272 Reproduce in test
          • [FIX JENKINS-20272] Do not disconnect skipped computers
          • JENKINS-20272 Clean the API a bit

          (cherry picked from commit 8872b44cc8823c2106dfc5ba9a344fd1d82e4823)

            olivergondza Oliver Gondža
            danielbeck Daniel Beck
            Votes: 7
            Watchers: 11