JENKINS-37489

Jenkins UI slow page load (big TTFB) - possible LDAP issue


      Symptoms:

      Periodically the Jenkins UI becomes very slow and sometimes returns 504. Restarting the Jenkins service helps, but only for a short time. How often the problem occurs depends on the number of users using the UI. There are no unusual log entries; the Jenkins web UI response time simply becomes critical (nginx error: "upstream timed out (110: Connection timed out) while reading response header from upstream").
      Average CPU load is < 10% and memory usage is ~500 MB (heap size ~2.5 GB). Active threads reach critical values (max = 38), and this coincides with the times of the UI problems.
      Graphs (from the Monitoring plugin) are in the attachments.

      The Monitoring plugin also has a "Current requests" view, and there I found the cause of the problem: many pending requests for the same URLs, such as "/job/MY-JOB/changes GET" and "/job/MY-JOB/BUILD-ID/wfapi/changesets?=1471452733792 ajax GET".
      I tried to kill these requests (to free the threads) using the "Kill" button in the Monitoring plugin, and after that I found helpful log records.

      The second record shows that Jenkins calls loadUserByUsername, which is strange: I configured the Jenkins search query to look up users by uid (login), following the official configuration guide, not by "FirstName LastName".

      Then I checked the LDAP server logs and discovered high CPU usage, a high load average, and tons of incorrect search queries from Jenkins in slapd.log:

      Aug 17 17:36:14 ldap-server slapd[1183]: conn=249907 fd=26 ACCEPT from IP=5.2.1.1:43078 (IP=0.0.0.0:636)
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 fd=26 TLS established tls_ssf=256 ssf=256
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" method=128
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" mech=SIMPLE ssf=0
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 RESULT tag=97 err=0 text=
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=2 UNBIND
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 fd=26 closed
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249911 fd=25 ACCEPT from IP=5.2.1.1:43132 (IP=0.0.0.0:636)
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 fd=25 TLS established tls_ssf=256 ssf=256
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" method=128
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" mech=SIMPLE ssf=0
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 RESULT tag=97 err=0 text=
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=2 UNBIND
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 fd=25 closed
      ....
      

      More than 37000 incorrect searches per day, just for my user.
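
A count like this can be reproduced from slapd.log with a short script. The following is a minimal sketch, assuming the log format matches the excerpt above; the function name `count_empty_searches` and the `SAMPLE` lines are illustrative, not part of any Jenkins or OpenLDAP tooling.

```python
import re
from collections import Counter

# SRCH line: captures connection id, operation id and the search filter.
SRCH_RE = re.compile(r'conn=(\d+) op=(\d+) SRCH .*filter="([^"]*)"')
# SEARCH RESULT line: captures connection id, operation id and entry count.
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) SEARCH RESULT .*nentries=(\d+)')


def count_empty_searches(lines):
    """Return a Counter mapping each search filter to its number of empty results."""
    pending = {}  # (conn, op) -> filter still waiting for its result line
    empty = Counter()
    for line in lines:
        m = SRCH_RE.search(line)
        if m:
            pending[(m.group(1), m.group(2))] = m.group(3)
            continue
        m = RESULT_RE.search(line)
        if m:
            flt = pending.pop((m.group(1), m.group(2)), None)
            if flt is not None and int(m.group(3)) == 0:
                empty[flt] += 1
    return empty


# Sample lines copied from the excerpt above (two connections, two empty searches).
SAMPLE = [
    'Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"',
    'Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=',
    'Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"',
    'Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=',
]

if __name__ == "__main__":
    for flt, n in count_empty_searches(SAMPLE).most_common():
        print(n, flt)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `count_empty_searches(open("slapd.log"))`; the path is a placeholder.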

      Problem:

      It seems that Jenkins tries to check user privileges before executing (some?) requests, and when forming the LDAP search query it uses the "Full Name"/username rather than the ID/login/uid -> LDAP can't find anything -> empty result -> Jenkins tries again to verify privileges -> loop -> busy Jenkins threads/workers/executors -> HTTP 504
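
The cost of this loop can be modeled roughly: every lookup that finds nothing reaches the directory again, unless misses are remembered for a while. The sketch below is a toy model only. `CountingDirectory` and `NegativeCachingLookup` are hypothetical names, and whether Jenkins' own LDAP cache stores empty results is not established in this report.

```python
import time


class CountingDirectory:
    """Stand-in for an LDAP server that counts the searches it receives."""

    def __init__(self, entries):
        self.entries = entries  # uid -> entry dict
        self.searches = 0

    def find(self, uid):
        self.searches += 1
        return self.entries.get(uid)


class NegativeCachingLookup:
    """Lookup with a TTL cache that remembers misses (None) as well as hits."""

    def __init__(self, directory, ttl_seconds, clock=time.monotonic):
        self.directory = directory
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}  # uid -> (expires_at, entry_or_None)

    def find(self, uid):
        now = self.clock()
        cached = self._cache.get(uid)
        if cached is not None and cached[0] > now:
            return cached[1]  # served from cache, even if the value is None
        entry = self.directory.find(uid)
        self._cache[uid] = (now + self.ttl, entry)
        return entry


if __name__ == "__main__":
    # Without a cache, 1000 failing lookups mean 1000 directory searches.
    plain = CountingDirectory({})
    for _ in range(1000):
        plain.find("firstname lastname")
    print("uncached searches:", plain.searches)

    # With misses cached, the same 1000 lookups reach the directory once per TTL.
    backend = CountingDirectory({})
    cached = NegativeCachingLookup(backend, ttl_seconds=60)
    for _ in range(1000):
        cached.find("firstname lastname")
    print("cached searches:", backend.searches)
```

In this model the cache only bounds the number of searches; the lookup still fails, which is why it is a mitigation rather than a fix.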

      Workarounds:
      1. Not great, but possible: use a large cache for LDAP (in the Jenkins "Configure Global Security" settings). It didn't fix the problem, but it may reduce its impact (not 100% sure).
      2. Closer to a fix: use a custom LDAP search query (in the Jenkins "Configure Global Security" settings), something like:
        (|(uid={0})(cn={0}))

        (don't forget to add the 'cn' attribute to the LDAP index)
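
To see why the second workaround helps, it may be useful to expand both filters for the same input. This is a minimal sketch, assuming `{0}` is replaced by the name Jenkins looks up; `escape_filter_value`, `expand_filter`, `search`, and the sample entry (including the uid `flastname`) are hypothetical and only mimic the matching, they are not Jenkins or OpenLDAP APIs.

```python
def escape_filter_value(value):
    """Escape special characters in an LDAP filter value (per RFC 4515)."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def expand_filter(template, name):
    """Substitute the escaped name for every {0} placeholder in the template."""
    return template.replace("{0}", escape_filter_value(name))


def search(entries, attributes, value):
    """Toy matcher: entries where any listed attribute equals value.

    Comparison is case-insensitive, a simplification of how uid and cn
    are usually matched.
    """
    wanted = value.lower()
    return [e for e in entries if any(e.get(a, "").lower() == wanted for a in attributes)]


# Hypothetical directory entry: login differs from the full name.
PEOPLE = [{"uid": "flastname", "cn": "Firstname Lastname"}]

if __name__ == "__main__":
    name = "firstname lastname"
    print(expand_filter("(uid={0})", name))
    # (uid=firstname lastname)
    print(expand_filter("(|(uid={0})(cn={0}))", name))
    # (|(uid=firstname lastname)(cn=firstname lastname))
    print(len(search(PEOPLE, ["uid"], name)))        # 0: uid-only filter finds nothing
    print(len(search(PEOPLE, ["uid", "cn"], name)))  # 1: OR filter also matches on cn
```

The OR filter makes the server evaluate `cn` on every lookup, which is the reason for indexing that attribute.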

      Attachments:
        1. activeThreads.png (32 kB)
        2. usedMemory.png (44 kB)
        3. cpu.png (18 kB)

            kohsuke Kohsuke Kawaguchi
            amelnyk Andrii Melnyk