Status: Resolved
1.483+, also LTS updates with SECURITY-44
I'm looking for guidance on debugging slow page load times. Ever since we upgraded from 1.480 to 1.484 (and other versions beyond that) we've had very slow page load times. Pages sometimes take upwards of 120 seconds to load, and this affects nearly everything: any view tab, job configuration, etc.
I've debugged many Java apps and I'm confounded. There are few, if any, full GCs occurring; regular GCs occur somewhat frequently but take 0.02 seconds or less. CPU utilization for the java executable is under 6% and disk utilization is reasonable (we're averaging 90%+ idle time). The initial HTTP ack comes back immediately, but then we wait for the page response, sometimes for up to 2 minutes. Other times it loads reasonably fast (under 4 seconds). When it's hanging, it's hanging for all requests, much as if it were doing a full garbage collection. It feels like some kind of resource contention.
But I cannot find where the contention is; thread dumps look nominal (I've included one here as an example). Something causes long page load times. We rolled back to 1.484 thinking it had something to do with the lazy loading features, but clearly it does not. On 1.480 we did not have these issues. I could use some help figuring out what else to look at to identify why Jenkins is slow. There is little information available on common reasons for slow page load times, and the result is a terrible user experience with an otherwise fine tool.
If anyone encountering this issue is using an Apache proxy, JENKINS-10524 suggests Proxy-nokeepalive, though that is probably unrelated.
I believe I am getting similar behaviour, using Jenkins 1.512.
Pages sometimes load fine, but on other occasions loading is really slow! Sometimes it takes several minutes to load just a few pages, and sometimes every page is affected!
@wael your issue may not have anything to do with this fix. Please use a separate ticket (using JIRA links as needed) unless you can confirm at the code level that this fix did not do what it claimed to do.
Also please try the new command-line tuning parameters introduced by the fix (--help for details), and if a thread dump confirms that many threads are still stuck in parseURI, attach that thread dump since it will now contain more information.
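For anyone who wants to check a thread dump for this before attaching it, here is a minimal sketch that counts how many threads have a given frame on their stack. It assumes a jstack-style text dump where each thread starts with a line beginning with its quoted name. The function names and the sample dump (including the `example.*` frames) are made up for illustration and are not taken from Jenkins.

```python
import re


def split_threads(dump_text):
    """Split a jstack-style dump into (thread_name, stack_text) pairs."""
    threads = []
    name, lines = None, []
    for line in dump_text.splitlines():
        header = re.match(r'^"([^"]*)"', line)
        if header:
            # A new thread header closes the previous thread, if any.
            if name is not None:
                threads.append((name, "\n".join(lines)))
            name, lines = header.group(1), []
        elif name is not None:
            lines.append(line)
    if name is not None:
        threads.append((name, "\n".join(lines)))
    return threads


def threads_in_frame(dump_text, frame="parseURI"):
    """Return the names of threads whose stack text mentions `frame`."""
    return [name for name, stack in split_threads(dump_text) if frame in stack]


# Hypothetical two-thread dump, only to show the expected input shape.
SAMPLE = '''\
"Handling GET /job/a" daemon prio=5
   java.lang.Thread.State: BLOCKED
    at example.Request.parseURI(Request.java:1)
    at example.Request.run(Request.java:2)

"Handling GET /job/b" daemon prio=5
   java.lang.Thread.State: RUNNABLE
    at example.Other.work(Other.java:3)
'''

stuck = threads_in_frame(SAMPLE)
print(f"{len(stuck)} of {len(split_threads(SAMPLE))} threads mention parseURI: {stuck}")
```

To use it on a real dump, read the file into a string and pass that in place of `SAMPLE`. A large share of request-handling threads showing the same frame would be worth mentioning in the ticket.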
We have found that if the Jenkins master contains a large number of jobs and the user is only allowed to access a subset of them, the UI takes a long time to load. If the user is granted read access to everything, it loads very quickly. We suspect (though cannot confirm) that the matrix authorization plugin is not working as efficiently as it could.
We believe the following is happening:
1. plugin is checking every job against every LDAP group the user is a member of
2. plugin is then checking every job against the specific LDAP user
3. it is possible the plugin is making individual calls to an LDAP server for each job
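To illustrate why the pattern suspected above would scale badly, here is a toy model. It is not the plugin's code, and none of these names exist in Jenkins. `CountingDirectory` is a hypothetical stand-in for an LDAP server that counts lookups, and the job and group names are invented. It compares one directory lookup per job against resolving the user's groups once and reusing the result.

```python
class CountingDirectory:
    """Hypothetical stand-in for an LDAP server that counts lookups."""

    def __init__(self, memberships):
        self.memberships = memberships
        self.calls = 0

    def principals_for(self, user):
        # Each call models one round trip to the directory server.
        self.calls += 1
        return {user} | set(self.memberships.get(user, ()))


def visible_jobs_per_job_lookup(jobs, user, directory):
    """Assumed slow path: look the user up again for every job."""
    visible = []
    for name, allowed in jobs.items():
        principals = directory.principals_for(user)
        if any(p in allowed for p in principals):
            visible.append(name)
    return visible


def visible_jobs_single_lookup(jobs, user, directory):
    """Resolve the user's principals once, then check each job in memory."""
    principals = directory.principals_for(user)
    return [name for name, allowed in jobs.items() if principals & allowed]


# 1000 made-up jobs; every tenth one is readable by "team-a".
jobs = {f"job-{i}": ({"team-a"} if i % 10 == 0 else {"team-b"}) for i in range(1000)}
memberships = {"alice": ["team-a", "staff"]}

slow_dir = CountingDirectory(memberships)
fast_dir = CountingDirectory(memberships)
slow = visible_jobs_per_job_lookup(jobs, "alice", slow_dir)
fast = visible_jobs_single_lookup(jobs, "alice", fast_dir)

print(f"per-job lookup: {len(slow)} visible jobs, {slow_dir.calls} directory calls")
print(f"single lookup:  {len(fast)} visible jobs, {fast_dir.calls} directory calls")
```

Both functions return the same set of visible jobs. Only the number of simulated directory calls differs, growing with the job count in the first case. Whether the real plugin behaves like the first function is exactly what remains unconfirmed.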
@marnix_klooster: the latest fix is in 1.506 and therefore also 1.509.1 LTS; --httpKeepAliveTimeout may be passed on the command line (see --help for options).