
Very slow activity/pipeline screen load (often when logged in)

    • Blue Ocean 1.2-beta1, Blue Ocean 1.2-beta2, Blue Ocean 1.4 - beta 3, Blue Ocean 1.5 - beta 1

      (to investigate)

       

      • Some people find dashboard/pipeline screens slow when logged in compared with not logged in (see the relevant comments below).
        • See the comments and support bundles below, e.g. from bksaville. In some cases it is related to the number of runs.
      • Some users see the activity screen load very slowly when there are a large number of runs (this seems to be the more common and serious case).

      — ORIGINAL TICKET —

      I've noticed that the dashboard loads quickly when I'm not authenticated.

      (classic loads normally)

       

      Deleting the config history as suggested in https://issues.jenkins-ci.org/browse/JENKINS-43208 did not work.

      I have sent a HAR file via email to jamesdumay.

      Jenkins version 2.46.3, BlueOcean 1.1.2

        1. haranalysis.png
          118 kB
        2. plugins.txt
          9 kB
        3. support_2017-11-17_08.03.54.zip
          743 kB
        4. support_2017-11-17_08.03.56.zip
          743 kB
        5. support_2017-11-17_08.03.59.zip
          743 kB
        6. support_2017-11-17_08.04.02.zip
          743 kB
        7. support_2017-11-17_08.04.13.zip
          743 kB
        8. support_2017-11-17_08.04.18.zip
          742 kB
        9. support_2017-11-17_11.26.08.zip
          401 kB
        10. support_2017-11-20_23.21.08.zip
          269 kB

          [JENKINS-44995] Very slow activity/pipeline screen load (often when logged in)

          Michael Neale added a comment -

          OK, some more investigation, bksaville imeredith:

          It looks like the favourites problem has been solved by a release of the autofav plugin, which allows people to opt out (or to opt out system-wide). You still need to remove the existing favourites, but I think with this the logged-in slowness case may be solved.

          However, this does leave the general slowness around activity screens. Some new facts:

          • ci.jenkins.io has this slowness on the core pipeline, but the number of runs doesn't seem too high, so it may not be related to the number of runs (others seem to be reporting similar).
          • ci.jenkins.io has a lot of branches on the core pipeline, so some of the slowness may be to do with that.
          • ci.blueocean.io also has a similar number of runs, and many branches, but doesn't seem slow at all, so there may be some plugin/config of jenkins.io that changes its behavior.
          • You can see this on this URL, which loads pretty fast: https://ci.blueocean.io/blue/rest/organizations/jenkins/pipelines/blueocean/runs/?start=0&limit=26
          • However, this one is really slow: https://ci.jenkins.io/blue/rest/organizations/jenkins/pipelines/Core/jenkins/runs/?start=0&limit=26
          • This seems to be a regression in 1.3 (but that is still not clear).
          • If you reduce the number of rows returned on ci.jenkins.io, it is fast: https://ci.jenkins.io/blue/rest/organizations/jenkins/pipelines/Core/jenkins/runs/?start=0&limit=5
          • So probably not related to pagination? Maybe just N branches * 26 rows of each to be fetched? Why is it so much faster just for that small diff of data?

          kshultz do you happen to have a test set of data that could work with 1.1, 1.2, 1.3 to compare activity screen load times for a multibranch project (large number of branches perhaps?), so we know whether this is a regression or not?
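
          The limit=5 versus limit=26 comparison above can be repeated with a small timing script. This is a minimal sketch, not part of the ticket: the helper names `runs_url`, `time_get` and `compare_limits` are made up for illustration, the URL shape is copied from the links quoted in the comment, and it assumes the endpoints are still reachable without authentication.

```python
import time
import urllib.request


def runs_url(host: str, pipeline_path: str, start: int = 0, limit: int = 26) -> str:
    """Build a runs URL in the same shape as the links quoted in the comment."""
    return (
        f"https://{host}/blue/rest/organizations/jenkins/"
        f"pipelines/{pipeline_path}/runs/?start={start}&limit={limit}"
    )


def time_get(url: str, timeout: float = 60.0) -> float:
    """Return the wall-clock seconds needed to download the full response body."""
    started = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - started


def compare_limits(host: str, pipeline_path: str, limits=(5, 26)) -> dict:
    """Time the same runs endpoint once per limit value (performs network calls)."""
    return {limit: time_get(runs_url(host, pipeline_path, limit=limit)) for limit in limits}
```

          Calling `compare_limits("ci.jenkins.io", "Core/jenkins")` returns a mapping from limit to elapsed seconds. A single sample is noisy, so several repetitions per limit would be needed before drawing conclusions.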

          Michael Neale added a comment -

          batmat is going to get a stack trace while the slowness is happening.
          teilo has suggested it may be caused by there being many unstable runs of a specific pipeline, which force it to (erroneously) load the runs recursively (this may explain why it is bad in some cases but not others).

          Both of them confirm that 1.3 seems to be generally worse.


          Karl Shultz added a comment -

          michaelneale, apologies, I missed your question from a few days ago. The short answer is no, I don't have a particular dataset, but do have some ideas:

          • We could pick from any number of repositories to fork in order to create a test data set. In JENKINS-45372, teilo was using Apache Maven. That one's got 29 branches in it and would give a test instance plenty of "stuff to do," so to speak. But it's not hundreds of branches.
          • We could create a test repo programmatically, and test performance that way, perhaps as an offshoot of BitbucketServerTest. That might make things nicely self-contained, and would implicitly give us some additional Bitbucket coverage, which might be a welcome "side effect."

          I thought I'd created an Improvement ticket for at least starting the ball rolling on performance testing of BO, but I don't see it. I'll do so later today.


          Ivan Meredith added a comment -

          After spending time investigating the activity API, I have hit a roadblock.

          The performance issue occurs while the data is being serialised and sent to the client, as opposed to before serialisation, while we are generating the iterators of BlueRuns. In the past, when we have had performance issues, it has been the pre-serialisation step that was problematic.

          This problem only seems to affect the multibranch activity API, and only for some multibranch pipelines. On the blueocean CI server we have 400ish branches and many builds with no issue, whereas the jenkinsci CI server has issues with <150 branches and probably fewer builds.

          Because multibranch activity is loading the 25 latest runs from all branches, it may be loading the last run from 25 different branches. kzantow pointed out that it might be a slowness related to loading runs from so many branches, which would not be as obvious if for example the last 25 runs only came from 2 branches despite many more branches existing. This could explain the difference between what I see on the CI servers.
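
          To illustrate kzantow's point, here is a toy model rather than Blue Ocean code: each branch is a list of run timestamps sorted newest first, and the 25 newest runs are taken across all branches. The functions `latest_runs` and `branches_touched` are hypothetical names, and the model only counts how many branches contribute to one page; it does not measure the real cost of loading a run.

```python
import heapq
from itertools import islice


def latest_runs(branches: dict, limit: int = 25) -> list:
    """Return the `limit` newest (timestamp, branch) pairs across all branches.

    Each branch maps to a list of timestamps that is already sorted newest first.
    """
    streams = (
        [(timestamp, name) for timestamp in timestamps]
        for name, timestamps in branches.items()
    )
    # heapq.merge lazily merges the already-sorted streams; reverse=True keeps newest first.
    merged = heapq.merge(*streams, key=lambda run: run[0], reverse=True)
    return list(islice(merged, limit))


def branches_touched(runs: list) -> int:
    """Count the distinct branches that contributed to a page of runs."""
    return len({branch for _, branch in runs})
```

          With 30 branches that each have one recent run, a page of 25 touches 25 branches. With two busy branches and many stale ones, the same page touches only 2, which matches the difference described between the two CI servers.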

          I'm not sure what is the best way forward here. I see [at least] 3 alternatives

          • Something can be fixed in core or pipeline to make this faster assuming the issue is there.
          • Add some caching to the activity api. Maybe use an H2 database or just basic in memory cache.
          • Make the branches page the default page instead of activity for multibranch pipelines. This doesn't fix anything but maybe makes it less annoying in the default case.
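
          As a rough idea of what the "basic in memory cache" alternative could look like, here is a minimal time-to-live cache sketch. It is an assumption-laden illustration, not a proposed implementation: `TtlCache` and its methods are invented names, it is not thread-safe, and it ignores the question of when a new run should invalidate a cached activity page.

```python
import time
from typing import Any, Callable, Hashable


class TtlCache:
    """Cache values per key and reload them once they are older than the TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `loader` only when it is missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a cached entry, for example when a new run starts or finishes."""
        self._entries.pop(key, None)
```

          The key would presumably combine the pipeline name with the start and limit parameters, so that each requested page is cached separately.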


          Michael Neale added a comment -

          vivek in light of Ivan's comments above - do you have any more ideas? I still don't have concrete info that it is a regression as of 1.3, but it seems a bit slower. Perhaps it is time to bite the bullet and cache activity screen? (the logged in case should be solved by now BTW). 


          Vivek Pandey added a comment - - edited

          Bug identified and fixed. PR opened https://github.com/jenkinsci/blueocean-plugin/pull/1632.

           

          Details of the cause and the fix:

          Analyzing the HAR file showed that FavoriteStatePreloader was returning a large number of favorite jobs, which accounted for most of the time spent responding when loading the dashboard. The bug was that `FavoriteContainer.iterator()` did not paginate. The fix adds pagination by default.
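
          The idea behind the fix, an iterator that falls back to a default page size when the caller gives no limit, can be sketched as follows. This is not the code from the PR: the function `paged` is a hypothetical stand-in, and the default of 100 is taken from the page size mentioned later in this thread.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")

# Assumed default, based on the page size of 100 mentioned in this thread.
DEFAULT_PAGE_SIZE = 100


def paged(items: Iterable[T], start: int = 0, limit: Optional[int] = None) -> Iterator[T]:
    """Yield at most `limit` items starting at `start`, using a default page size if no limit is given."""
    if limit is None:
        limit = DEFAULT_PAGE_SIZE
    # islice stops consuming the source once the page is full, so later items are never loaded.
    return islice(items, start, start + limit)
```

          With a lazily generated source, a caller that passes no limit only ever pulls the first 100 entries instead of every favorite.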


          Keith Zantow added a comment -

          I don't see a problem with that PR, but doesn't the page display all favorites? So I'm not sure how paging would fix that.


          Vivek Pandey added a comment -

          kzantow Currently yes, it fetches all possible favorites. Container iterators are supposed to return the default page size. If you call the favorite API you get the default page size, so this makes it consistent.

          Favorites are expensive, and displaying hundreds of favorites is rather painful for the user, since a bunch of them are auto-favorited. I think a new ticket should be opened to add UI pagination support ('Show more') for favorites as well. The current page size is 100; maybe it should be 26 like other things shown in the UI?

           

           


          Vivek Pandey added a comment -

          kzantow over to you as discussed. Listing the issues we discussed to fix as part of this improvement:

          • Paginate the favorite list on the dashboard (my fix covers this: the favorite list now defaults to the page size of 100; possibly it should be 26 like other objects)
          • Evaluate whether the top-level pipeline object can include 'favorite' as a boolean value without much impact
          • Minimize the 'item' object properties to include only what the favorite UI needs (name of pipeline, favorited or not, commitId...)
          • Fix the bug in the frontend where it calls the favorites API even though it already has the pre-loaded list of favorites
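
          The last item amounts to "use the pre-loaded list when there is one, and call the API only otherwise". The logic is sketched below in Python purely for illustration, since the real frontend is JavaScript; `load_favorites` and its parameters are hypothetical names, not Blue Ocean functions.

```python
from typing import Callable, List, Optional


def load_favorites(preloaded: Optional[List[dict]], fetch: Callable[[], List[dict]]) -> List[dict]:
    """Return the pre-loaded favorites if present, otherwise fall back to fetching them."""
    # An empty list is a valid pre-loaded result (the user has no favorites),
    # so only None means "nothing was pre-loaded".
    if preloaded is not None:
        return preloaded
    return fetch()
```

          Treating an empty list as "already loaded" matters here: checking truthiness instead of `None` would trigger a redundant API call for every user without favorites.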


          Michael Neale added a comment -

          sophistifunk as mentioned - worth taking a look at (I will take a look at test failures too)


            nicu Nicolae Pascu
            schulzha Hans Schulz
            Votes: 7
            Watchers: 21