Jenkins / JENKINS-18438

Node monitoring should run in parallel




      As of 1.520, AbstractNodeMonitorDescriptor monitors nodes sequentially. As the number of slaves goes up, this takes a long time to complete, and it also makes the monitoring susceptible to hangs.

      While a ping thread exists to detect unresponsive nodes, its interval is 10 minutes and its timeout is 4 minutes, so a few unresponsive nodes can quickly push the total running time of node monitoring beyond the default monitoring cycle of 1 hour.

      A better approach is to make asynchronous remoting calls to all the slaves at once, then wait for the results to come back. This way, we still get results back from the nodes that are functioning, even if others hang.
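
      The idea can be sketched roughly as follows. This is only an illustration built on plain java.util.concurrent, not the actual Jenkins change: the class ParallelNodeMonitor, the method monitorAll, and the node names are hypothetical, and each Callable merely stands in for a remoting call to one slave. All probes are submitted at once, and a shared deadline bounds the total wait, so a hung node costs at most the timeout rather than blocking the nodes after it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not part of Jenkins. Each Callable stands in for a remoting call.
public class ParallelNodeMonitor {

    /**
     * Runs every probe concurrently and waits at most the given timeout for all of them.
     * Nodes whose probe fails or does not finish in time are mapped to null.
     */
    public static <T> Map<String, T> monitorAll(Map<String, Callable<T>> probes,
                                                long timeout, TimeUnit unit)
            throws InterruptedException {
        Map<String, T> results = new LinkedHashMap<>();
        if (probes.isEmpty()) {
            return results; // a fixed thread pool cannot be created with zero threads
        }
        List<String> names = new ArrayList<>(probes.keySet());
        List<Callable<T>> tasks = new ArrayList<>();
        for (String name : names) {
            tasks.add(probes.get(name));
        }
        // One thread per node so that a hung probe never delays the others.
        ExecutorService pool = Executors.newFixedThreadPool(names.size());
        try {
            // invokeAll returns once all tasks finish or the timeout expires;
            // unfinished tasks are cancelled, and futures keep the order of 'tasks'.
            List<Future<T>> futures = pool.invokeAll(tasks, timeout, unit);
            for (int i = 0; i < names.size(); i++) {
                try {
                    results.put(names.get(i), futures.get(i).get());
                } catch (CancellationException | ExecutionException e) {
                    results.put(names.get(i), null); // timed out or failed
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Callable<Integer>> probes = new LinkedHashMap<>();
        probes.put("node-a", () -> 42);                  // responsive node
        probes.put("node-b", () -> {                     // simulated hung node
            Thread.sleep(60_000);
            return 0;
        });
        System.out.println(monitorAll(probes, 1, TimeUnit.SECONDS));
    }
}
```

      In this sketch the responsive node reports its value while the hung node is recorded as null after the one-second deadline, so the total running time is bounded by the timeout instead of growing with the number of unresponsive nodes.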




            kohsuke Kohsuke Kawaguchi created issue -
            kohsuke Kohsuke Kawaguchi made changes -
            Field Original Value New Value
            Link This issue is related to JENKINS-18152 [ JENKINS-18152 ]
            scm_issue_link SCM/JIRA link daemon made changes -
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]
            zfil Philippe Jandot made changes -
            Assignee Kohsuke Kawaguchi [ kohsuke ]
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            sogabe sogabe made changes -
            Link This issue is related to JENKINS-18671 [ JENKINS-18671 ]
            jglick Jesse Glick made changes -
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 149745 ] JNJira + In-Review [ 193273 ]


              kohsuke Kohsuke Kawaguchi
              kohsuke Kohsuke Kawaguchi