  1. Jenkins
  2. JENKINS-65513

Saml plugin 2.x.x causes deadlock impacting Jenkins performance

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Component/s: saml-plugin
    • Labels:
    • Environment:
      Jenkins LTS 2.273.2
      SAML 2.0.3 (2.x.x as well)
      Running on CentOS-7 inside a docker container
      openjdk version "1.8.0_282"



    • Similar Issues:
    • Released As:
      saml-2.0.5

      Description

      We wanted to use SSO authentication on our Jenkins server, so we started using the Saml plugin 2.0.2. Since then, Jenkins has gradually run slower, to the point where it became unusable. Upon further investigation, we found the following message in our logs (see the Saml logs picture). We might have a concurrency issue (probably a deadlock).

      These messages are generated at least on every login (we are not sure whether something else triggers them too). Furthermore, we discovered that the thread count kept growing, and that the new threads came from the Saml plugin (see picture). From /threadDump, this is the information we could obtain for one of those threads:

       "Timer for org.opensaml.saml.metadata.resolver.impl.FilesystemMetadataResolver@11a2ac5b" Id=10090 Group=main TIMED_WAITING on java.util.TaskQueue@7aa956ec
           at java.lang.Object.wait(Native Method)
           -  waiting on java.util.TaskQueue@7aa956ec
           at java.util.TimerThread.mainLoop(Timer.java:552)
           at java.util.TimerThread.run(Timer.java:505)

      When Jenkins reached about 9k open threads, we had to restart it because of the bad performance. A restart was needed every 5-7 days.
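To put numbers on this kind of thread growth, one option is to group live thread names into "families" and count each family. The following is a minimal sketch, not something taken from the plugin: the class `ThreadFamilies` and its methods are hypothetical names, and it assumes that thread names end in an `@<hex hash>` suffix, as in the dump above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: groups thread names so that leaked thread families stand out. */
class ThreadFamilies {

    /** Strips a trailing "@<hex hash>" so that all instances of one class share a key. */
    static String family(String threadName) {
        return threadName.replaceAll("@[0-9a-fA-F]+$", "");
    }

    /** Counts how many of the given thread names fall into each family. */
    static Map<String, Integer> countByFamily(Collection<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            counts.merge(family(name), 1, Integer::sum);
        }
        return counts;
    }

    /** Applies countByFamily to the threads that are currently alive in this JVM. */
    static Map<String, Integer> liveThreadFamilies() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByFamily(names);
    }
}
```

A family whose count only ever increases between two snapshots, such as the `Timer for ...FilesystemMetadataResolver` threads here, is a likely leak candidate.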

      We tested this with multiple Saml plugin versions. The problem only occurs with versions 2.x.x. Therefore we rolled back to version 1.1.7 which seems to run without causing any issues.

      We tried to investigate and compare the differences between the two major versions:

      • Saml Jenkins plugin 2.0.3 uses pac4j 3.9.0, which uses opensaml-saml-impl 3.4.3.
      • Saml Jenkins plugin 1.1.7 uses pac4j 1.9.9, which uses opensaml-saml-impl 3.2.0.

      We think the problem is in the different pac4j version, as the changes in the plugin itself between those versions do not seem that major.

      Part of the Jenkins logs (SAML Log.txt) has been attached to the ticket.

      And also more descriptive logs (saml_debug_log).

        Attachments

        1. Jenkins_threads.PNG
          271 kB
        2. saml_debug_log.log
          26 kB
        3. SAML Log.txt
          26 kB
        4. Saml logs.png
          101 kB
        5. Thread dump.txt
          67 kB

          Issue Links

            Activity

            georg020 Georgi added a comment -

            A whole thread dump will be obtained soon.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I will need the full thread dump to see exactly which thread is blocking what.
            I suspect that it is a combination of https://issues.jenkins.io/browse/JENKINS-61747 and your NFS Jenkins Home, which for some reason is slower in the new version, probably because of the change to Spring 5. I will prepare a Docker Compose environment to test a regular deployment without NFS and then run some load tests. I am confident that I will not be able to replicate the issue there.
            In any case, I have to take a look at the other issue, whose fix should resolve this too.

            georg020 Georgi added a comment -

            A whole thread dump was attached to the ticket

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            Looking at the thread dump, the threads in TIMED_WAITING are waiting in a native method. Checking the class FilesystemMetadataResolver, I suspect they are waiting for an IO operation at https://github.com/korteke/java-opensaml/blob/master/opensaml-saml-impl/src/main/java/org/opensaml/saml/metadata/resolver/impl/FilesystemMetadataResolver.java#L121. My guess is that more than one file descriptor is open for the JENKINS_HOME/saml-sp-metadata.xml or JENKINS_HOME/saml-idp-metadata.xml files, and that this causes some kind of deadlock.

            Could you check which open file descriptors belonging to the user that runs Jenkins are related to SAML?

            lsof -u jenkins | grep -i saml 
            
            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I have finally started to test this. It is a legitimate issue and easy to replicate in a Docker environment: https://github.com/kuisathaverat/jenkins-issues/tree/master/JENKINS-65513. Now I have to find the cause and a solution.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I am double-checking, but it seems to happen on the jenkins/jenkins:2.277.4-lts-centos7 Docker image and not on jenkins/jenkins:2.277.4-lts-jdk11.

            georg020 Georgi added a comment -

            lsof is not found when run from inside the container.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I can replicate it on jenkins/jenkins:2.277.4-lts-jdk11 as well, so it is not the JDK or the OS. I have checked the file descriptors, and none of them point to the metadata files.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I have fixed https://issues.jenkins.io/browse/JENKINS-61747 and it is not related; the threads are still stuck.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I found the issue and have resolved it. Every time the plugin creates a SAML2Client, a SAML2ServiceProviderMetadataResolver is created with it. When the plugin finishes using the SAML2Client, this SAML2ServiceProviderMetadataResolver is not completely destroyed, and that is the cause of the zombie threads. The solution is simple: I have added an explicit call to the destroy() method after using the SAML2Client.
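The leak pattern described above can be illustrated with plain `java.util.Timer`, which is the class behind the `TimerThread` frames in the reported thread dump. This is only a sketch under stated assumptions and not the plugin's or pac4j's actual code: `DemoResolver`, `TimerLeakDemo`, and `simulateLogins` are hypothetical names, and the Timer is created as a daemon purely so that the demo JVM can exit.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a metadata resolver that owns a refresh Timer. */
class DemoResolver {
    private final Timer timer;

    DemoResolver(String threadName) {
        // The Timer constructor starts its worker thread immediately.
        // Daemon mode is a demo-only choice so leaked threads do not keep this JVM alive.
        timer = new Timer(threadName, true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // Placeholder for a periodic metadata reload.
            }
        }, 60_000L, 60_000L);
    }

    /** Cancels the Timer so that its worker thread can exit. */
    void destroy() {
        timer.cancel();
    }
}

class TimerLeakDemo {
    private static final AtomicInteger RUN = new AtomicInteger();

    /** Counts live threads whose name starts with the given prefix. */
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    /** Creates one resolver per simulated login and returns how many Timer threads remain. */
    static long simulateLogins(int logins, boolean destroyAfterUse) {
        String prefix = "Timer for DemoResolver-run" + RUN.incrementAndGet() + "@";
        for (int i = 0; i < logins; i++) {
            DemoResolver resolver = new DemoResolver(prefix + i);
            try {
                // A real caller would use the resolver here.
            } finally {
                if (destroyAfterUse) {
                    resolver.destroy();
                }
            }
        }
        // Cancelled Timer threads exit asynchronously, so poll briefly before reporting.
        long deadline = System.nanoTime() + 5_000_000_000L;
        long remaining = countThreads(prefix);
        while (destroyAfterUse && remaining > 0 && System.nanoTime() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            remaining = countThreads(prefix);
        }
        return remaining;
    }
}
```

Calling `TimerLeakDemo.simulateLogins(n, false)` leaves one idle Timer thread per simulated login, which mirrors the growth seen in the report. Passing `true` releases each thread through the `finally` block, even if the code using the resolver throws.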

            georg020 Georgi added a comment -

            Thank you very much for your time and effort Ivan. We will deploy and keep an eye on the plugin just in case.

            jhansche Joe Hansche added a comment -

            Ivan Fernandez Calvo has the fix been published yet? I see the v2.0.4 tag in github, but I don't see it available to update in the plugin update center. My update center still says that v2.0.3 is the latest version.

            ifernandezcalvo Ivan Fernandez Calvo added a comment - - edited

            I checked the update center and it is not there: https://updates.jenkins.io/download/plugins/saml/. Looking at the commits in the repo, there is something odd about the Maven release commits, so I will release again right now.

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            I have just released 2.0.5 and it seems to have succeeded. The binary is at https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/ so it should appear in the update center https://updates.jenkins.io/download/plugins/saml/ in a few hours.

                [INFO] --- maven-deploy-plugin:2.8.2:deploy (default-deploy) @ saml ---
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.hpi
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.hpi (9.0 MB at 3.4 MB/s)
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.pom
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.pom (8.6 kB at 11 kB/s)
                Downloading from maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/maven-metadata.xml
                Downloaded from maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/maven-metadata.xml (1.3 kB at 4.2 kB/s)
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/maven-metadata.xml
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/maven-metadata.xml (1.2 kB at 1.7 kB/s)
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.jar
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5.jar (87 kB at 83 kB/s)
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5-sources.jar
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5-sources.jar (55 kB at 69 kB/s)
                Uploading to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5-javadoc.jar
                Uploaded to maven.jenkins-ci.org: https://repo.jenkins-ci.org/releases/org/jenkins-ci/plugins/saml/2.0.5/saml-2.0.5-javadoc.jar (610 kB at 514 kB/s)
                [INFO] ------------------------------------------------------------------------
                [INFO] BUILD SUCCESS
                [INFO] ------------------------------------------------------------------------
                [INFO] Total time:  44.282 s
                [INFO] Finished at: 2021-05-18T16:30:21+02:00
                [INFO] ------------------------------------------------------------------------
            [INFO] Cleaning up after release...
            [INFO] ------------------------------------------------------------------------
            [INFO] BUILD SUCCESS
            [INFO] ------------------------------------------------------------------------
            [INFO] Total time:  02:18 min
            [INFO] Finished at: 2021-05-18T16:30:21+02:00
            [INFO] ------------------------------------------------------------------------
            
            jhansche Joe Hansche added a comment - - edited

            Yes, I see that there is nothing for v2.0.4 at repo.jenkins, but I see v2.0.5 now. Thank you for re-releasing. I'll keep an eye out for the update center refresh.

            When I use the "Check now" button in update center, I now see the v2.0.5 update. Thank you!


              People

              Assignee:
              ifernandezcalvo Ivan Fernandez Calvo
              Reporter:
              georg020 Georgi
              Votes:
              0
              Watchers:
              3
