[JENKINS-15858] Jenkins UI slow due to constant build record loading

      I've noticed Jenkins being blocked for several minutes at a time in recent versions (even before 1.490; it probably started about 10 versions earlier).

      Here's a thread dump showing such deadlocks: https://gist.github.com/4109900

      Note that my home page is using the dashboard and several threads are blocked on:

      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:638)
      	- waiting to lock <0x5e3afde0> (a hudson.model.RunMap)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:621)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getById(AbstractLazyLoadRunMap.java:498)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:472)
      	at hudson.model.AbstractProject.getNearestOldBuild(AbstractProject.java:1025)
      	at hudson.maven.MavenModuleSetBuild.getModuleLastBuilds(MavenModuleSetBuild.java:434)
      	at hudson.maven.MavenModuleSetBuild.getResult(MavenModuleSetBuild.java:189)
      	at hudson.model.Run.getIconColor(Run.java:640)
      	at hudson.plugins.view.dashboard.stats.StatBuilds.getBuildStat(StatBuilds.java:50)
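
For readers who want to spot this pattern without reading a whole dump by hand, here is a minimal sketch that lists threads in the BLOCKED state together with the monitor they are waiting for. It uses the standard java.lang.management API; the class name BlockedThreadReport is hypothetical and not part of Jenkins. It only inspects the JVM it runs in, so for a running Jenkins instance a jstack dump like the ones attached to this issue is still the practical route.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name, not part of Jenkins). */
class BlockedThreadReport {

    /** Returns one line per thread of the current JVM that is BLOCKED on a monitor. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // false, false: skip the optional locked-monitor and synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName()
                        + " waiting to lock " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }
}
```

Many request threads reported as waiting for the same hudson.model.RunMap, as in the trace above, would point at contention on a single job's build records rather than at a general slowdown.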
      


          Jesse Glick added a comment -

          @aheritier field testing would be great! My fix is based on a test only.


          Vincent Massol added a comment -

          FTR I still have a lot of performance issues with 1.519. I stopped using the dashboard plugin because I thought it was causing the issue, but apparently it is not, since I get hangs on almost all UI views. Very frequently a view takes as much as 5 minutes to display, making our Jenkins UI almost unusable (http://ci.xwiki.org).

          I've just taken a thread dump of Jenkins while trying to access http://ci.xwiki.org/job/xwiki-platform%20Quality%20Checks/569/ which took about 3 minutes to display; see the attached jstack-vmassol-20130625.txt

          Vincent Massol added a comment -

          And 2 more thread dumps taken when UI views were slow to load:

          jstack-vmassol-20130625-2.txt
          jstack-vmassol-20130625-3.txt

          Vincent Massol added a comment -

          FTR, note that this morning we had to restart our Jenkins (http://ci.xwiki.org) because of an OOM:

          Exception: org.apache.commons.jelly.JellyTagException: jar:file:/home/maven/.hudson/war/WEB-INF/lib/jenkins-core-1.519.jar!/lib/layout/layout.jelly:85:72: <st:include> java.lang.OutOfMemoryError: PermGen space

          Kohsuke Kawaguchi added a comment -

          Running out of PermGen space should be tracked separately. If you think you've configured a sufficient PermGen size and Jenkins is overusing PermGen space, please follow this guide to obtain a heap dump and send it to us offline.

          Kohsuke Kawaguchi added a comment -

          I looked at the additional thread dumps vmassol-20130625*.txt and I think Jesse's earlier analysis still applies: they show that while the UI thread is stuck, it's busy loading records from the disk.

          I see that the three thread dumps involve two jobs, "xwiki-platform Quality Checks" and "xwiki-platform". They are both Maven projects with lots of modules, so I can imagine that if the cache is cold, this can result in a considerable delay. When I access this instance, the first page load for those two jobs takes a considerable time, yet if I reload the page it renders quickly enough.

          Between these and the PermGen problem, I suspect that the JVM in question simply doesn't have enough heap to keep the cache warm. Again, I'd love to see the heap dump that I requested above to see if there's something wasting the heap.

          Another thought that occurred to me is whether it would help to provide an option to hold build records through strong references instead of weak references. That shifts the memory pressure from build records to other soft references, but for some users it might be a useful trade-off.
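
To make the trade-off in the last paragraph concrete, here is a minimal sketch of a record cache that can hold its values either strongly or through java.lang.ref.SoftReference. Everything in it is hypothetical: BuildRecordCache, its loader, and the strong flag are illustration only, not Jenkins code, and SoftReference merely stands in for whatever non-strong reference type Jenkins actually uses.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical cache of build records keyed by build number; not Jenkins code. */
class BuildRecordCache<V> {
    private final Map<Integer, Object> entries = new HashMap<>();
    private final IntFunction<V> loader; // stands in for reading a record from disk
    private final boolean strong;        // true: the GC can never clear an entry
    private int loads;                   // number of (expensive) loader calls so far

    BuildRecordCache(IntFunction<V> loader, boolean strong) {
        this.loader = loader;
        this.strong = strong;
    }

    @SuppressWarnings("unchecked")
    synchronized V get(int number) {
        Object held = entries.get(number);
        V value = null;
        if (held != null) {
            // A soft reference may have been cleared by the GC; get() then returns null.
            value = strong ? (V) held : ((SoftReference<V>) held).get();
        }
        if (value == null) {
            // Cold or evicted entry: pay the load cost again.
            value = loader.apply(number);
            loads++;
            entries.put(number, strong ? value : new SoftReference<>(value));
        }
        return value;
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

With strong set to true, each record is loaded at most once but stays on the heap for the life of the cache. With strong set to false, the garbage collector may clear entries under memory pressure, so a later get() has to load the record again, which is the cold-cache delay described in this comment.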

          Kohsuke Kawaguchi added a comment -

          Updated the title of the ticket to reflect the status.

          There's no deadlock involved.

          Jesse Glick added a comment -

          @vmassol you can try https://buildhive.cloudbees.com/job/jenkinsci/job/dashboard-view-plugin/18/org.jenkins-ci.plugins$dashboard-view/artifact/org.jenkins-ci.plugins/dashboard-view/2.7-SNAPSHOT/dashboard-view-2.7-SNAPSHOT.hpi if you still want to use the Dashboard View plugin.

          sogabe added a comment -

          fixed in 2.8


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          test/src/main/java/org/jvnet/hudson/test/RunLoadCounter.java
          http://jenkins-ci.org/commit/jenkins-test-harness/9dc74d4132ad4e40874c638dc3f1773c770d87e0
          Log:
          Lazy-loading utility test class designed for fix of JENKINS-15858.
          Originally-Committed-As: 03ca63c62c1f3e20f405b655bbc610ed21931a86

            jglick Jesse Glick
            vmassol Vincent Massol
            Votes: 12
            Watchers: 14