[JENKINS-68954] Agent with remoting 4.14+ hangs or shows response time >5s

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component/s: remoting
    • None

      After upgrading remoting to 4.14, we observed agent response times of more than 3s on the node monitoring board, agents hanging on large pipelines, and monitoring timeouts in the agent logs.

      Reproduced with Kubernetes agent provisioning.

      Agent successfully connected and online
      ERROR: Failed to monitor for Clock Difference
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for JVM Version
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Response Time
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.ResponseTimeMonitor$1.monitor(ResponseTimeMonitor.java:57)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Temp Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Swap Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Disk Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Architecture
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Clock Difference
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Architecture
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Temp Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Disk Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Free Swap Space
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for JVM Version
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305)
      ERROR: Failed to monitor for Response Time
      java.util.concurrent.TimeoutException
      	at hudson.remoting.Request$1.get(Request.java:321)
      	at hudson.remoting.Request$1.get(Request.java:240)
      	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:66)
      	at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:112)
      	at hudson.node_monitors.ResponseTimeMonitor$1.monitor(ResponseTimeMonitor.java:57)
      	at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305) 


          Konstantin Bulanov created issue -
          Konstantin Bulanov made changes -
          Attachment New: image-2022-07-07-18-49-35-738.png [ 58433 ]
          Priority Original: Blocker [ 1 ] New: Critical [ 2 ]
          Konstantin Bulanov made changes -
          Attachment New: image-2022-07-07-19-12-01-853.png [ 58434 ]
          Konstantin Bulanov made changes -
          Attachment New: image-2022-07-07-19-12-16-868.png [ 58435 ]

          Konstantin Bulanov added a comment - - edited

          Introduced by pull request https://github.com/jenkinsci/remoting/pull/523

          After replacing the line at https://github.com/jenkinsci/remoting/pull/523/files#diff-df5e02eb0e94358acfe92fb427f75ea298f7e8a58857efb4fcff7c9f602974d6R54 with

          private static final ExecutorService THREAD_POOL = Executors.newFixedThreadPool(1, new NamingThreadFactory(Executors.defaultThreadFactory(), AnonymousClassWarnings.class.getSimpleName()));

          the response time dropped to 50-300 ms.
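
          For readers unfamiliar with the pattern in the replacement line, here is a minimal JDK-only sketch of a fixed pool of size 1 whose single worker thread gets a fixed name. It is not the remoting code: the class `SingleThreadPoolSketch` and the helper `naming(...)` are hypothetical, and `naming(...)` only approximates what `NamingThreadFactory` is assumed to do (delegate thread creation, then rename the thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

class SingleThreadPoolSketch {

    /** Hypothetical stand-in for a naming thread factory: delegate creation, then set the name. */
    static ThreadFactory naming(ThreadFactory delegate, String name) {
        return runnable -> {
            Thread t = delegate.newThread(runnable);
            t.setName(name);
            t.setDaemon(true); // assumption for this sketch: do not keep the JVM alive
            return t;
        };
    }

    /** Submits taskCount trivial tasks to a pool of size 1 and returns the names of the threads that ran them. */
    static Set<String> workerNames(int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                1, naming(Executors.defaultThreadFactory(), "AnonymousClassWarnings"));
        Set<String> names = ConcurrentHashMap.newKeySet();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                // Each task records which thread executed it.
                futures.add(pool.submit(() -> { names.add(Thread.currentThread().getName()); }));
            }
            for (Future<?> f : futures) {
                f.get(5, TimeUnit.SECONDS); // bounded wait so the sketch cannot hang
            }
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(workerNames(10));
    }
}
```

          With a pool of size 1, every submitted task is queued and executed by the same worker, so the returned set holds a single thread name regardless of how many tasks are submitted.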

          Konstantin Bulanov made changes -
          Remote Link New: This issue links to "Proposed PR (Web Link)" [ 27936 ]
          Konstantin Bulanov made changes -
          Attachment Original: image-2022-07-07-19-12-16-868.png [ 58435 ]
          Jesse Glick made changes -
          Assignee Original: Jeff Thompson [ jthompson ] New: Konstantin Bulanov [ bulanovk ]
          Jesse Glick made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Jesse Glick made changes -
          Status Original: In Progress [ 3 ] New: In Review [ 10005 ]

            jglick Jesse Glick
            bulanovk Konstantin Bulanov