Possible thread leak 'QueueSubTaskMetrics' in metrics - Allow finishing builds when a SubTask fails badly


    • jenkins-2.205

      In one large instance we can see 3000 threads with this shape:


      "QueueSubTaskMetrics [#11342]" #5508106 daemon prio=5 os_prio=0 tid=0x00007efcf085a800 nid=0x52c7 in Object.wait() [0x00007efbccb32000]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:502)
              at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:75)
              - locked <0x0000000512c01d10> (a hudson.model.queue.FutureImpl)
              at jenkins.metrics.impl.JenkinsMetricProviderImpl.lambda$asSupplier$3(JenkinsMetricProviderImpl.java:1142)
              at jenkins.metrics.impl.JenkinsMetricProviderImpl$$Lambda$388/1851215464.get(Unknown Source)
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)

         Locked ownable synchronizers:
              - <0x00000004dc401fb0> (a java.util.concurrent.ThreadPoolExecutor$Worker)


      Far fewer jobs are running or waiting (130), and the number of these threads never decreases.

      After investigating the code:

      • The Metrics plugin listens to the queue with a QueueListener, which creates threads named QueueSubTaskMetrics.
      • The 3000 threads shown are all waiting for a future to be completed.

      I think this could be avoided by calling the get method with a timeout instead of blocking there indefinitely: https://github.com/jenkinsci/metrics-plugin/commit/e803bc3b82b54bfe27e66a6393dedf53bdf1896e#diff-b02885f3ba6b4982b5322b73e664c0b6R1049
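
      To illustrate the idea, here is a minimal sketch of a bounded wait using only the standard java.util.concurrent API. It is not the plugin's code: the class name BoundedFutureWait, the signature of this asSupplier helper, and the timeout values are assumptions chosen for the example. The sketch gives up waiting after the timeout so the worker thread can return to its pool, and it deliberately does not cancel the future, since the waiting code does not own it.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class BoundedFutureWait {

    /**
     * Hypothetical helper (not the plugin's actual method): adapts a Future
     * to a Supplier that waits at most the given time for a result.
     * Returns Optional.empty() on timeout, failure, or interruption
     * instead of blocking the calling thread forever.
     */
    public static <T> Supplier<Optional<T>> asSupplier(Future<T> future, long timeout, TimeUnit unit) {
        return () -> {
            try {
                // Future.get(long, TimeUnit) throws TimeoutException once the wait expires.
                return Optional.ofNullable(future.get(timeout, unit));
            } catch (TimeoutException e) {
                // Stop waiting, but leave the future itself untouched.
                return Optional.empty();
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the pool can see it.
                Thread.currentThread().interrupt();
                return Optional.empty();
            } catch (ExecutionException e) {
                // The task behind the future failed; there is no value to report.
                return Optional.empty();
            }
        };
    }

    public static void main(String[] args) {
        // A future that is never completed stands in for a sub-task that failed badly.
        CompletableFuture<String> neverDone = new CompletableFuture<>();
        Supplier<Optional<String>> supplier = asSupplier(neverDone, 100, TimeUnit.MILLISECONDS);

        // Run the wait on another thread, as the stack trace above suggests the plugin does.
        Optional<String> result = CompletableFuture.supplyAsync(supplier).join();
        System.out.println("got a value: " + result.isPresent());
    }
}
```

      With an unbounded get(), the supplier in main would never return and its thread would stay parked, which matches the shape of the threads in the dump. A suitable timeout value would depend on how long sub-tasks are legitimately expected to run, which this sketch does not try to determine.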



