Improve the performance when listing many jobs (GSoC 2019, coding phase 2)

    • RoleStrategy-Performance

      This affects a Jenkins installation with 750+ jobs.

      When loading the "Overview" to list all jobs, the loading time is over 60 seconds.

      With the role-strategy plugin disabled, the loading time goes down to 5 seconds.

          [JENKINS-18377] Improve the performance when listing many jobs (GSoC 2019, coding phase 2)

          Croesus Kall added a comment -

          Profiling shows that the following method uses a lot of CPU time:

          com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$2.perform()

          And this also seems to relate to:
          https://issues.jenkins-ci.org/browse/JENKINS-17122

          Walter Kacynski added a comment -

          I am also experiencing this problem, but not with that many jobs. This started occurring after an upgrade from Jenkins 1.510 to 1.521.

          Walter Kacynski added a comment -

          I am increasing the priority of this issue since this prevents upgrades to newer releases of Jenkins.

          Walter Kacynski added a comment -

          I linked JENKINS-17122 to see if it has any bearing on this problem.

          Stanislav Bashkyrtsev added a comment - - edited

          I've raised two issues with detailed analysis: JENKINS-18721 and JENKINS-18723. They might be the cause of the current problem. RBAC is what makes the problem visible, but the problem itself is not in RBAC (at least, that is what my analysis showed) but in ListView.

          Oleg Nenashev added a comment -

          It seems that I'll experience this issue soon.

          Regarding views...
          It is possible to implement caching of the permissions, which will reduce page loading time. I'll try to implement such a cache in the next version of the plugin. Anyway, please feel free to contribute.
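
          To illustrate the kind of permission caching mentioned above, here is a minimal sketch. It is not the plugin's implementation: the class PermissionCache, the SlowCheck interface, and the (user, item, permission) string key are all hypothetical names chosen for the example. It only shows the general idea of memoizing an expensive check and clearing the cache when roles change.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical memoizing wrapper around an expensive permission check. */
final class PermissionCache {

    /** Stand-in for the slow role walk; not a real plugin API. */
    interface SlowCheck {
        boolean test(String user, String item, String permission);
    }

    // Key is the immutable triple (user, item, permission).
    private final Map<List<String>, Boolean> cache = new ConcurrentHashMap<>();
    private final SlowCheck slowCheck;

    PermissionCache(SlowCheck slowCheck) {
        this.slowCheck = slowCheck;
    }

    /** Returns the cached answer, computing it on the first request only. */
    boolean hasPermission(String user, String item, String permission) {
        return cache.computeIfAbsent(
                List.of(user, item, permission),
                key -> slowCheck.test(key.get(0), key.get(1), key.get(2)));
    }

    /** Must be called whenever roles or assignments change, or answers go stale. */
    void invalidate() {
        cache.clear();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        PermissionCache cache = new PermissionCache((user, item, permission) -> {
            calls.incrementAndGet();
            return user.equals("alice");
        });
        cache.hasPermission("alice", "job-1", "Item.READ");
        cache.hasPermission("alice", "job-1", "Item.READ");
        System.out.println("slow checks performed: " + calls.get());
    }
}
```

          The trade-off in a cache like this is staleness: every change to roles or assignments has to invalidate it, and memory use grows with the number of distinct (user, item, permission) combinations.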

          Georg Sash added a comment -

          I used to have the same problem, but I have not encountered it in recent Jenkins versions, so I guess this issue is obsolete.

          Croesus Kall added a comment -

          I still have this problem - so I wouldn't say it was obsolete.

          Oleg Nenashev added a comment -

          Fixes in the core (JENKINS-18721) are available since 1.532.
          It partially resolves the issue, but I confirm that there are many improvements to be done inside the plugin.

          I'm going to keep the issue.
          BTW, the priority could be decreased if there are no critical performance complaints.

          As a workaround, you can use RoleMacros (e.g. from the Ownership plugin). They greatly reduce the calculation time thanks to internal caching.

          Croesus Kall added a comment -

          Definitely faster loading now. Loading time has gone down from over 60 seconds to around 15 seconds.

          But for my setup it takes 5 seconds without role-strategy, so there is still room for improvement.

          Raphaël UNIQUE added a comment - - edited

          We have the same problem.
          Non-administrator users experience this issue; each action takes a long time.
          For administrators, everything is much faster.

          Oleg Nenashev added a comment -

          @Raphael
          It does not make sense to add new comments w/o providing RoleStrategy configurations (corner-cases, etc.) for slow installations.

          Raphaël UNIQUE added a comment - - edited

          Some details about the installation:

          • 1300 jobs
          • 27 views
          • AD contains 14,000 users
          • 4 global roles assigned to
            • 21 AD groups (max 100 users)
            • 14 users
          • 38 project roles assigned to
            • 19 AD groups (max 100 users)
            • 11 users

          For non-administrator users, each navigation between views can take up to 70 seconds.
          For administrators, it takes only 5 seconds.

          Alexander Ost added a comment -

          This issue also heavily affects the search box – with role-based auth active (and approx 2500 jobs defined), the search box is barely usable (slow update, long time to feedback). After switching back to matrix-based authentication, the search box works fine.

          As previously mentioned, Job-listing is affected as well, but also access to other pages is much slower ("All" view loading takes about twice as long, master node display takes four times longer when RBAC is active).

          Running LTS Jenkins 1.609.2.

          Croesus Kall added a comment -

          With the new plugin release, 2.3.0, the performance is worse than ever (using Jenkins 2.8).
          After downgrading to 2.2.0, things were OK.

          Oleg Nenashev added a comment -

          croesus Caching of roles should have helped, but maybe there are some downsides.
          Do you have any metrics from your setup?

          Croesus Kall added a comment -

          oleg_nenashev Jenkins-wise I have this setup:
          Jobs: about 1900
          Global Roles: 4
          Project roles: 12
          Users: 25

          When I updated to 2.3.0, Jenkins became quite slow and the load statistics for the machine were very high.

          Oleg Nenashev added a comment -

          croesus See 2.3.2 changelog

          Andy Bernhagen added a comment -

          I also see slow performance in large environments, even on the 2.3.2 upgrade.

          Wouter Hünd added a comment - - edited

          oleg_nenashev I'm also still seeing poor performance with a large number of jobs mapping against LDAP roles, even with 2.3.2. Enabling the plugin increases response times of listings by 5-10 seconds.

          Oliver Gondža added a comment - Related PR submitted https://github.com/jenkinsci/role-strategy-plugin/pull/28/

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oliver Gondža
          Path:
          src/main/java/com/michelin/cio/hudson/plugins/rolestrategy/Role.java
          http://jenkins-ci.org/commit/role-strategy-plugin/161c0358179a4c845d688b3e42db09f8aa91739d
          Log:
          JENKINS-18377 Cache Role#hashCode to speed up RoleMap#getRolesHavingPermission

          I have experienced a severe slowdown caused by the following stack trace:
          at java.util.AbstractSet.hashCode(AbstractSet.java:126)
          at com.michelin.cio.hudson.plugins.rolestrategy.Role.hashCode(Role.java:149)
          at java.util.HashMap.hash(HashMap.java:338)
          at java.util.HashMap.put(HashMap.java:611)
          at java.util.HashSet.add(HashSet.java:219)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$1.perform(RoleMap.java:310)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$RoleWalker.walk(RoleMap.java:387)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$RoleWalker.<init>(RoleMap.java:376)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$1.<init>(RoleMap.java:307)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap.getRolesHavingPermission(RoleMap.java:307)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap.hasPermission(RoleMap.java:107)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap.access$000(RoleMap.java:75)
          at com.michelin.cio.hudson.plugins.rolestrategy.RoleMap$AclImpl.hasPermission(RoleMap.java:362)
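
          The stack trace shows Role.hashCode delegating to AbstractSet.hashCode, which iterates over the whole set on every call, and HashSet.add invoking it for each role it collects. A minimal sketch of caching the hash code follows. The class CachedHashRole and its fields are hypothetical and are not the plugin's actual Role class; the sketch assumes the role's fields never change after construction, since otherwise the cached value would go stale.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical immutable role whose hash code is computed once. */
final class CachedHashRole {
    private final String name;
    private final Set<String> permissions;
    private final int cachedHash;

    CachedHashRole(String name, Set<String> permissions) {
        this.name = name;
        // Defensive copy so later changes to the caller's set cannot
        // invalidate the cached hash.
        this.permissions = Collections.unmodifiableSet(new HashSet<>(permissions));
        // The set is walked here, once, instead of on every hashCode() call.
        this.cachedHash = 31 * name.hashCode() + this.permissions.hashCode();
    }

    @Override
    public int hashCode() {
        return cachedHash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CachedHashRole)) {
            return false;
        }
        CachedHashRole other = (CachedHashRole) o;
        // Comparing the cached hashes first rejects most unequal roles cheaply.
        return cachedHash == other.cachedHash
                && name.equals(other.name)
                && permissions.equals(other.permissions);
    }

    public static void main(String[] args) {
        Set<String> perms = new HashSet<>(Arrays.asList("Item.READ", "Item.BUILD"));
        Set<CachedHashRole> roles = new HashSet<>();
        roles.add(new CachedHashRole("developer", perms));
        roles.add(new CachedHashRole("developer", perms));
        System.out.println("distinct roles: " + roles.size());
    }
}
```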

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          src/main/java/com/michelin/cio/hudson/plugins/rolestrategy/Role.java
          http://jenkins-ci.org/commit/role-strategy-plugin/165538fc1e85d9ff3bae99ab85035f34a4039440
          Log:
          Merge pull request #28 from olivergondza/optimize-RoleMap-getRolesHavingPermission

          JENKINS-18377 Cache Role#hashCode to speed up RoleMap#getRolesHavingPermission

          Compare: https://github.com/jenkinsci/role-strategy-plugin/compare/8dbec9ea93d1...165538fc1e85

          Oleg Nenashev added a comment -

          Unassigning the issue for now. We have added two Role Strategy plugin project ideas to GSoC 2019: https://jenkins.io/projects/gsoc/2019/project-ideas/. If somebody is interested in co-mentoring the ideas (including these tickets), please let us know

          Oleg Nenashev added a comment -

          I have converted this issue to an EPIC, because there are several low-hanging-fruit tasks left to be done.

          Oleg Nenashev added a comment -

          abhyudaya is currently working on this EPIC as a part of his GSoC project: https://jenkins.io/projects/gsoc/2019/role-strategy-performance/. Some stories have already been addressed in the master branch.

          abhyudaya could you please take a look at the EPIC and update tasks or add missing ones?

          runze xia added a comment -

          I thought about our authentication model. First we need to get the ACL (traversing all the roles), and in a second step we determine whether a role in the ACL has the appropriate permission.
          Could we simplify how the ACL content is acquired? If we construct the roleMap in the second step, we can avoid traversing some of the roles.

          This idea may require redesigning the ACL object. The way the ACL is cached may also change.

          Abhyudaya Sharma added a comment -

          runzexia I have just committed https://github.com/jenkinsci/role-strategy-plugin/pull/89/commits/a02ea97b0bd512da1973f502c6ab3faca69845d8 for the folder based authorization. This changes how global roles work. Is it something like what you're looking for?

          runze xia added a comment -

          abhyudaya Yes, it is very similar to the approach I mentioned. If we do this work in the havePermission func, can we reduce some memory consumption? It might also make it very simple to break out of the loop early.
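
          The idea of breaking out of the loop can be sketched as follows. This is an illustration only, not code from the plugin or from the linked commit: ShortCircuitCheck, SimpleRole, and the use of plain strings for SIDs and permissions are assumptions made for the example. Instead of first collecting every role that grants a permission into a set and then inspecting that set, the check returns as soon as one matching role is found, and it runs the cheap set lookups before the regular-expression match.

```java
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical permission check that stops at the first matching role. */
final class ShortCircuitCheck {

    /** Simplified stand-in for a project role; not the plugin's Role class. */
    static final class SimpleRole {
        final Pattern itemPattern;
        final Set<String> permissions;
        final Set<String> sids;

        SimpleRole(String itemRegex, Set<String> permissions, Set<String> sids) {
            // Compile once per role rather than once per check.
            this.itemPattern = Pattern.compile(itemRegex);
            this.permissions = permissions;
            this.sids = sids;
        }
    }

    static boolean hasPermission(List<SimpleRole> roles, String sid,
                                 String item, String permission) {
        for (SimpleRole role : roles) {
            // Cheap hash lookups first; the regex match runs only if both pass.
            if (role.permissions.contains(permission)
                    && role.sids.contains(sid)
                    && role.itemPattern.matcher(item).matches()) {
                return true; // No intermediate set is built; stop at the first hit.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<SimpleRole> roles = List.of(
                new SimpleRole("team-a-.*", Set.of("Item.READ"), Set.of("alice")),
                new SimpleRole("team-b-.*", Set.of("Item.READ", "Item.BUILD"), Set.of("bob")));
        System.out.println(hasPermission(roles, "alice", "team-a-build", "Item.READ"));
    }
}
```

          Because no intermediate collection is allocated per check, this shape would also address the memory question, at the cost of walking the role list again on every call unless the result is cached.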

          Jesse Glick added a comment -

          When loading the "Overview" to list all jobs, the loading time is over 60 seconds.

          Certainly it is good to optimize particular bottlenecks as we find them, but in tandem I would suggest changing the rendering of the dashboard to be progressive so that server-side bottlenecks do not completely prevent display nor block the user from initiating unrelated actions (like clicking on the sidebar). JENKINS-25075 suggests doing this for individual *ListView/column.jelly cells, though you could just as easily render whole rows lazily, ameliorating a bunch of performance problems at once.

            abhyudaya Abhyudaya Sharma
            croesus Croesus Kall