
Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master

      So I went to try to see the usage for a slave on builds.apache.org, and the page had no builds on it. I eventually noticed the "Calculation in progress" bit and thought "Oh, ok, I'll leave this up and check again later". That was a mistake. Now there are 30+ threads on the master like the ones in https://gist.github.com/abayer/88e390e3f0859f8b64e2 - i.e., a whole ton of HTTP POST requests to /computer/foo/timeline/data, all but one blocking on the one that's running, and the one that's running takes a long time to finish.

      This means (a) that the build history page for a slave is useless and (b) that we're churning CPU/IO and, I'm guessing, doing so repeatedly without caching, since when I check it now, even an hour and a half later, there's no data on the page.

          [JENKINS-23244] Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master

          Andrew Bayer added a comment -

          fwiw, it's now looking a lot better - no blocked threads, build history's showing up for all slaves now, so far as I can tell.


          Daniel Beck added a comment -

          abayer: What changed?


          Andrew Bayer added a comment -

          Nothing - just time after startup and first attempt to load it.


          Jesse Glick added a comment -

          Yup.


          Ivan Kalinin added a comment -

          We are still experiencing a great deal of trouble with the slave build history page.

          I just tried to open it for one slave and the whole Jenkins master locked up on the UI side.

          The thread that calls `AbstractLazyLoadRunMap.load` goes on forever (yes, we have a great many builds), but somehow other threads from the UI pool keep getting locked. Eventually, Jenkins became unresponsive altogether, but the jobs were still running.

          Maybe we could use a separate thread pool for this kind of work so it won't lock all the UI threads?

          BTW, we are running current LTS
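
          The separate-thread-pool idea suggested in this comment could be sketched roughly as below. This is only an illustration, not Jenkins code: the class name `HistoryComputationPool`, the pool size of 2 and the queue length of 8 are all assumptions. Note that a dedicated pool only limits how many threads the expensive computation can occupy; if other requests block on the same lock inside `AbstractLazyLoadRunMap.load`, a separate pool alone would probably not unblock them.

```java
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs expensive history computations off the request threads. */
final class HistoryComputationPool {
    // Small fixed pool with a bounded queue; the sizes are illustrative, not tuned.
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(8),
            r -> {
                // Named daemon threads are easy to spot in a thread dump
                // and do not keep the JVM alive on shutdown.
                Thread t = new Thread(r, "history-computation");
                t.setDaemon(true);
                return t;
            },
            new ThreadPoolExecutor.AbortPolicy());

    /**
     * Submits the task if there is capacity. Returns an empty Optional when the
     * pool and its queue are full, so the caller can answer "still busy"
     * instead of tying up another thread.
     */
    <T> Optional<Future<T>> trySubmit(Callable<T> task) {
        try {
            return Optional.of(executor.submit(task));
        } catch (RejectedExecutionException e) {
            return Optional.empty();
        }
    }

    /** Stops the pool and interrupts any running tasks. */
    void shutdown() {
        executor.shutdownNow();
    }
}
```

          With something like this, a request handler would call `trySubmit(...)` and return immediately with a "calculation in progress" response when the result is empty or the future is not done yet.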


          Lukasz Karnasiewicz added a comment -

          Steps to reproduce:
          1. Display the slave build history page. Wait for it to render; there should be a small progress bar with a "Computation in progress" hint.
          2. Request any other page (e.g. the main page) - it will hang.

          A sample thread dump illustrating the problem is attached.
          Thread 30745 is processing the request for the slave build history (http://jenkins/computer/slave_name/builds).
          All other requests now hang on jenkins.model.lazy.AbstractLazyLoadRunMap.load for up to 2 minutes in our case.

          Matthew Mitchell added a comment -

          I'm seeing this in our installation. It severely impacts the responsiveness of the system.

          Matthew Mitchell added a comment -

          (FYI this installation is around 6-7k builds a day)

          Even in the case of walking only newer builds, it seems like this would be super expensive. Maybe it's better to keep an index of build name/number to machine to avoid loading the metadata at all?
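
          The index idea from this comment might look something like the sketch below. It is hypothetical and not part of Jenkins: `NodeBuildIndex`, `BuildRef`, `record` and `recentBuilds` are made-up names, and the sketch keeps everything in memory, whereas a real index would also have to be persisted and kept in sync when builds are deleted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical index: node name -> builds that ran there, without loading build records. */
final class NodeBuildIndex {
    /** Minimal reference to a build: job name plus build number. */
    static final class BuildRef {
        final String jobName;
        final int number;

        BuildRef(String jobName, int number) {
            this.jobName = jobName;
            this.number = number;
        }

        @Override
        public String toString() {
            return jobName + " #" + number;
        }
    }

    private final Map<String, List<BuildRef>> byNode = new ConcurrentHashMap<>();

    /** Assumed to be called once when a build starts on a node. */
    void record(String nodeName, String jobName, int number) {
        byNode.computeIfAbsent(nodeName, k -> new CopyOnWriteArrayList<>())
              .add(new BuildRef(jobName, number));
    }

    /** Returns up to {@code limit} builds recorded for the node, most recently recorded first. */
    List<BuildRef> recentBuilds(String nodeName, int limit) {
        List<BuildRef> all = byNode.getOrDefault(nodeName, Collections.emptyList());
        List<BuildRef> snapshot = new ArrayList<>(all); // copy so reversing is safe
        Collections.reverse(snapshot);
        int count = Math.max(0, Math.min(limit, snapshot.size()));
        return snapshot.subList(0, count);
    }
}
```

          The point of the design is that the history page for a node would read only the small `BuildRef` entries, and a full build record would be loaded only when a user opens a specific build.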

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Akbashev Alexander
          Path:
          core/src/main/resources/hudson/model/BuildTimelineWidget/control.jelly
          http://jenkins-ci.org/commit/jenkins/2a0ac4f0989407a20e277444a7737e9c5f7ea78a
          Log:
          [FIX JENKINS-23244] Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master (#2584)

          The commit mainly does two things:
          1) Show only selected (visible) builds
          2) Query builds one by one, not in parallel
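
          The two ideas named in the commit log (load only the visible builds, and query them one at a time) can be illustrated with the Java sketch below. The actual change is in `control.jelly` and is not reproduced here; `SequentialWindowLoader`, `loadVisible` and the `fetch` callback are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration: fetch only the visible window, one build at a time. */
final class SequentialWindowLoader {
    /**
     * Fetches the builds at indexes [fromIndex, toIndex) of buildNumbers.
     * Indexes outside the list are clamped, so an empty or out-of-range
     * window simply yields an empty result.
     */
    static <T> List<T> loadVisible(List<Integer> buildNumbers, int fromIndex, int toIndex,
                                   Function<Integer, T> fetch) {
        int start = Math.max(0, fromIndex);
        int end = Math.min(buildNumbers.size(), toIndex);
        List<T> results = new ArrayList<>();
        for (int i = start; i < end; i++) {
            // Each call returns before the next one starts,
            // so at most one request is in flight at any time.
            results.add(fetch.apply(buildNumbers.get(i)));
        }
        return results;
    }
}
```

          Compared with firing one request per build at once, this keeps at most one server thread busy per open page, at the cost of the page filling in more slowly.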

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Akbashev Alexander
          Path:
          core/src/main/resources/hudson/model/BuildTimelineWidget/control.jelly
          http://jenkins-ci.org/commit/jenkins/4421d1b94d143956475f20a03c63fb1a367321f2
          Log:
          [FIX JENKINS-23244] Slave build history page has no data and spawns a ton of very long-lived blocking threads on the master (#2584)

          The commit mainly does two things:
          1) Show only selected (visible) builds
          2) Query builds one by one, not in parallel
          (cherry picked from commit 2a0ac4f0989407a20e277444a7737e9c5f7ea78a)

            jimilian Alexander A
            abayer Andrew Bayer