Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master

      I tried to check the usage for a slave on builds.apache.org, and the build history page had no builds on it. I eventually noticed the "Calculation in progress" indicator and thought, "Oh, OK, I'll leave this open and check again later." That was a mistake. Now there are 30+ threads on the master like the ones in https://gist.github.com/abayer/88e390e3f0859f8b64e2, i.e., a large number of HTTP POST requests to /computer/foo/timeline/data, all but one blocked on the one that's running, and the running one takes a long time to finish.

      This means (a) that the build history page for a slave is useless and (b) that we're churning CPU/IO and, I'm guessing, doing so repeatedly without caching, since when I check it now, even an hour and a half later, there's no data on the page.
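
If the guess about missing caching is right, one general way to keep repeated polls from redoing the same work is to let every request for a node share a single in-flight computation. The following is a minimal sketch of that idea only; it is not Jenkins code, and `TimelineCacheSketch`, `dataFor`, and the `loader` function are hypothetical names introduced for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: share one timeline computation per node. Not Jenkins code. */
public class TimelineCacheSketch {
    // One future per node name; later callers reuse the future already stored.
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache =
            new ConcurrentHashMap<>();
    // Stand-in for the expensive "walk the build records" step (an assumption).
    private final Function<String, String> loader;

    public TimelineCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Returns the shared future for this node, starting the load only once. */
    public CompletableFuture<String> dataFor(String node) {
        return cache.computeIfAbsent(
                node, n -> CompletableFuture.supplyAsync(() -> loader.apply(n)));
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        TimelineCacheSketch sketch = new TimelineCacheSketch(node -> {
            loads.incrementAndGet();
            return "timeline for " + node;
        });
        // Simulate 30 polling requests for the same node.
        for (int i = 0; i < 30; i++) {
            sketch.dataFor("foo");
        }
        System.out.println(sketch.dataFor("foo").join()); // prints "timeline for foo"
        System.out.println("loads = " + loads.get());     // prints "loads = 1"
    }
}
```

The sketch never expires entries, so a real cache would also need an invalidation rule, for example when a new build finishes on the node.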

          [JENKINS-23244] Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master

          Andrew Bayer created issue -

          Kohsuke Kawaguchi added a comment -

          Adjusting the priority since it only affects relatively unvisited pages of large deployments.
          Kohsuke Kawaguchi made changes -
          Priority Original: Critical [ 2 ] New: Major [ 3 ]

          Kohsuke Kawaguchi added a comment -

          Looking at the thread dump, the call stack indicates this call resulted in loading all the build records (via AbstractLazyLoadRunMap.all), which looks suspicious.
          I'd think this operation would only require walking newer build records.

          "Handling POST /computer/hadoop4/timeline/data/ : http-bio-8090-exec-895" Id=22685 Group=main RUNNABLE
          	at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
          	at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
          	at java.io.File.exists(File.java:813)
          	at hudson.model.RunMap.retrieve(RunMap.java:219)
          	at hudson.model.RunMap.retrieve(RunMap.java:59)
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:687)
          	-  locked hudson.model.RunMap@3fa6ce65
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:649)
          	-  locked hudson.model.RunMap@3fa6ce65
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:381)
          	at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:219)
          	at hudson.tasks.Fingerprinter$FingerprintAction.compact(Fingerprinter.java:360)
          	at hudson.tasks.Fingerprinter$FingerprintAction.onLoad(Fingerprinter.java:349)
          	at hudson.model.Run.onLoad(Run.java:337)
          	at hudson.model.RunMap.retrieve(RunMap.java:223)
          	at hudson.model.RunMap.retrieve(RunMap.java:59)
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:687)
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:670)
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.all(AbstractLazyLoadRunMap.java:622)
          	-  locked hudson.model.RunMap@3fa6ce65
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.entrySet(AbstractLazyLoadRunMap.java:277)
          	at java.util.AbstractMap$2$1.<init>(AbstractMap.java:378)
          	at java.util.AbstractMap$2.iterator(AbstractMap.java:377)
          	at hudson.util.RunList.iterator(RunList.java:97)
          	at com.google.common.collect.Iterables$15.apply(Iterables.java:1128)
          	at com.google.common.collect.Iterables$15.apply(Iterables.java:1125)
          	at com.google.common.collect.Iterators$8.next(Iterators.java:812)
          	at com.google.common.collect.Iterators$MergingIterator.<init>(Iterators.java:1306)
          	at com.google.common.collect.Iterators.mergeSorted(Iterators.java:1274)
          	at com.google.common.collect.Iterables$14.iterator(Iterables.java:1113)
          	at com.google.common.collect.Iterables$UnmodifiableIterable.iterator(Iterables.java:94)
          	at com.google.common.collect.Iterables$6.iterator(Iterables.java:585)
          	at hudson.util.RunList$2.iterator(RunList.java:210)
          	at hudson.util.RunList$2.iterator(RunList.java:210)
          	at com.google.common.collect.Iterables$6.iterator(Iterables.java:585)
          	at hudson.util.RunList.iterator(RunList.java:97)
          	at hudson.model.BuildTimelineWidget.doData(BuildTimelineWidget.java:63)
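
To illustrate the point about only walking newer build records, here is a minimal sketch of a newest-first walk that stops at a time cutoff instead of loading every record. It is not the Jenkins implementation: `Build`, `BuildStore`, `newestFirst`, and `since` are hypothetical stand-ins, and the sketch assumes that start times decrease as build numbers decrease.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch of a lazy newest-first walk. Not Jenkins code. */
public class NewestFirstWalkSketch {
    /** Stand-in for a build record; not hudson.model.Run. */
    static final class Build {
        final int number;
        final long startMillis;
        Build(int number, long startMillis) {
            this.number = number;
            this.startMillis = startMillis;
        }
    }

    /** Stand-in store that counts how many records it had to load. */
    static final class BuildStore {
        private final int newest;
        int loads = 0;
        BuildStore(int newest) { this.newest = newest; }

        // Pretend this reads one record from disk (assumption: 1 s between builds).
        Build load(int number) {
            loads++;
            return new Build(number, number * 1000L);
        }

        /** Loads one record per next() call, newest build first. */
        Iterator<Build> newestFirst() {
            return new Iterator<Build>() {
                private int cursor = newest;
                @Override public boolean hasNext() { return cursor >= 1; }
                @Override public Build next() {
                    if (cursor < 1) throw new NoSuchElementException();
                    return load(cursor--);
                }
            };
        }
    }

    /** Collects builds started at or after the cutoff, stopping at the first older one. */
    static List<Build> since(BuildStore store, long cutoffMillis) {
        List<Build> out = new ArrayList<>();
        Iterator<Build> it = store.newestFirst();
        while (it.hasNext()) {
            Build b = it.next();
            if (b.startMillis < cutoffMillis) break; // everything older can be skipped
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        BuildStore store = new BuildStore(10_000);
        List<Build> recent = since(store, 9_990_000L);
        // prints "11 builds kept, 12 records loaded"
        System.out.println(recent.size() + " builds kept, " + store.loads + " records loaded");
    }
}
```

With 10,000 records in the hypothetical store, the walk touches only the 11 builds inside the window plus the one that ends it, whereas an eager load such as the `all` call in the dump above would touch every record.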

          Andrew Bayer added a comment -

          fwiw, it's now looking a lot better - no blocked threads, build history's showing up for all slaves now, so far as I can tell.


          Daniel Beck added a comment -

          abayer: What changed?


          Andrew Bayer added a comment -

          Nothing - just time after startup and first attempt to load it.


          Jesse Glick added a comment -

          Yup.

          Jesse Glick made changes -
          Link New: This issue duplicates JENKINS-18065 [ JENKINS-18065 ]
          Jesse Glick made changes -
          Resolution New: Duplicate [ 3 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

            jimilian Alexander A
            abayer Andrew Bayer
            Votes: 8
            Watchers: 18