We wanted to use SSO authentication on our Jenkins server, so we started using the SAML plugin 2.0.2. Since then, Jenkins has gradually become slower, to the point where it was unusable. Upon further investigation, we found the following message in our logs (see Saml logs picture). We suspect a concurrency issue, probably a deadlock.
These messages are generated at least on every login (we are not sure whether something else also triggers them). Furthermore, we found that the thread count kept increasing, with the new threads coming from the SAML plugin (see picture). From /threadDump, this is the information that we could obtain for one of the threads:
When Jenkins reached 9k open threads, we had to restart it because of the poor performance. A restart was needed every 5-7 days.
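To track this kind of thread growth between restarts, live threads can be counted grouped by name. The following is only a minimal sketch, not something we ran on the affected server: the class `ThreadGroupCounter` and its methods are hypothetical names, and it assumes that leaked threads share a name that differs only in a trailing number. It counts the threads of the JVM it runs in, so for Jenkins it would have to be adapted to run inside the Jenkins process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups live JVM threads by name so a growing group stands out.
class ThreadGroupCounter {

    // Assumption: related threads differ only by a trailing number,
    // e.g. "pool-3-thread-12" and "pool-3-thread-7" both become "pool-3-thread".
    public static String normalize(String name) {
        return name.replaceAll("[-#\\s]*\\d+$", "");
    }

    // Counts how many of the given thread names fall into each normalized group.
    public static Map<String, Integer> countByName(Iterable<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(normalize(name), 1, Integer::sum);
        }
        return counts;
    }

    // Collects the names of all live threads in the current JVM and groups them.
    public static Map<String, Integer> countLiveThreads() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByName(names);
    }

    public static void main(String[] args) {
        // Print one line per group: count, then the normalized thread name.
        countLiveThreads().forEach((name, count) -> System.out.println(count + "\t" + name));
    }
}
```

Comparing the output of two runs taken some hours apart should show which group of threads keeps growing.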
We tested this with multiple SAML plugin versions. The problem only occurs with versions 2.x.x. We therefore rolled back to version 1.1.7, which seems to run without causing any issues.
We tried to investigate and compare the differences between the two major versions:
- SAML Jenkins plugin 2.0.3 uses pac4j 3.9.0, which uses opensaml-saml-impl 3.4.3.
- SAML Jenkins plugin 1.1.7 uses pac4j 1.9.9, which uses opensaml-saml-impl 3.2.0.
We think the problem lies in the different pac4j version, since the changes in the plugin itself between those versions do not seem that major.
Part of the Jenkins logs (SAML Log.txt) has been attached to the ticket, along with more detailed debug logs (saml_debug_log).