• Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Major Major
    • core

      We are seeing repeatable very heavy lock congestion with FingerPrint.save.

      Jenkins apparently slides down to a state where a lot of threads have locked / are competing for a lock on FingerPrint instance and are competing for a lock on AnnotationMapper (a singleton). Everything grings to a halt.

          [JENKINS-13154] Heavy thread congestion with FingerPrint.save

          Teppo Kurki created issue -
          Teppo Kurki made changes -
          Attachment New: Thread Dump Analyzer screenshot.jpg [ 21651 ]

          Teppo Kurki added a comment -

          Clippings from a thread dump with evidence of the problem.

          Teppo Kurki added a comment - Clippings from a thread dump with evidence of the problem.
          Teppo Kurki made changes -
          Attachment New: threaddump_samples.txt [ 21652 ]

          Teppo Kurki added a comment -

          Suggestions for a fix

          • speed up serialization?
          • deferred write of the FingerPrint dependencies information
            • periodic flushing of dirty FingerPrints
            • create a copy of FingerPrint and queue that for writing, in effect making FingerPrint saving single threaded (which it sort of is already)
          • shard the XStream in FingerPrint: create an configurable-size array of XStream instances and use them in round robin/random order, spreading the locks over the instances

          Teppo Kurki added a comment - Suggestions for a fix speed up serialization? deferred write of the FingerPrint dependencies information periodic flushing of dirty FingerPrints create a copy of FingerPrint and queue that for writing, in effect making FingerPrint saving single threaded (which it sort of is already) shard the XStream in FingerPrint: create an configurable-size array of XStream instances and use them in round robin/random order, spreading the locks over the instances

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          xstream/src/java/com/thoughtworks/xstream/mapper/AnnotationMapper.java
          http://jenkins-ci.org/commit/xstream/bc2bd6df986cc54a90a013d014d4b1fa36a6e023
          Log:
          JENKINS-13154 removed hot lock contention

          annotatedTypes is used to keep track of classes whose annotations we've already processed. So on a sufficiently warm system, annotatedTypes.contains(type) should almost always return true. Thus we just need to move it out of the synchronization block to be able to eliminate the contention. For this, we use ConcurrentWeakHashMap instead of regular WeakHashMap.

          The processTypes method still needs to run in a synchronized block to prevent concurrent calls of processAnnotation with the same class to serialize their executions. It is safe to enter this synchronized block after the initial annotatedTypes.contains(type) check, as the processTypes method itself guards against repeated processing of the same type.

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: xstream/src/java/com/thoughtworks/xstream/mapper/AnnotationMapper.java http://jenkins-ci.org/commit/xstream/bc2bd6df986cc54a90a013d014d4b1fa36a6e023 Log: JENKINS-13154 removed hot lock contention annotatedTypes is used to keep track of classes whose annotations we've already processed. So on a sufficiently warm system, annotatedTypes.contains(type) should almost always return true. Thus we just need to move it out of the synchronization block to be able to eliminate the contention. For this, we use ConcurrentWeakHashMap instead of regular WeakHashMap. The processTypes method still needs to run in a synchronized block to prevent concurrent calls of processAnnotation with the same class to serialize their executions. It is safe to enter this synchronized block after the initial annotatedTypes.contains(type) check, as the processTypes method itself guards against repeated processing of the same type.

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          core/pom.xml
          http://jenkins-ci.org/commit/jenkins/2fbcc7300faf285e363e37c1b9c98074a1d910db
          Log:
          [FIXED JENKINS-13154] integrated the fix to XStream that removes the lock contention.

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: changelog.html core/pom.xml http://jenkins-ci.org/commit/jenkins/2fbcc7300faf285e363e37c1b9c98074a1d910db Log: [FIXED JENKINS-13154] integrated the fix to XStream that removes the lock contention.
          SCM/JIRA link daemon made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          xstream/src/java/com/thoughtworks/xstream/mapper/AnnotationMapper.java
          http://jenkins-ci.org/commit/xstream/0903e75b981bc3ee91995890de91b800aebe11e6
          Log:
          JENKINS-13154 No, the previous fix was incomplete.

          The type gets added to annotatedTypes before its annotations are actually processed, so it is possible for one thread to visit those annotations while another thread comes in and finds it already, then carry on even though the first thread hasn't finished visiting annotations.

          So I added a second WeakHashSet to guard against this problem.

          The serializedClass field also needs to be concurrent-safe, as we don't seem to lock access to it

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: xstream/src/java/com/thoughtworks/xstream/mapper/AnnotationMapper.java http://jenkins-ci.org/commit/xstream/0903e75b981bc3ee91995890de91b800aebe11e6 Log: JENKINS-13154 No, the previous fix was incomplete. The type gets added to annotatedTypes before its annotations are actually processed, so it is possible for one thread to visit those annotations while another thread comes in and finds it already, then carry on even though the first thread hasn't finished visiting annotations. So I added a second WeakHashSet to guard against this problem. The serializedClass field also needs to be concurrent-safe, as we don't seem to lock access to it

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          core/pom.xml
          http://jenkins-ci.org/commit/jenkins/8dbdf8887dfcc8f478e5b6e9a34acdffd8e4d19b
          Log:
          [FIXED JENKINS-13154] bumping up further to -11.

          ... to fix a possible race condition.

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: core/pom.xml http://jenkins-ci.org/commit/jenkins/8dbdf8887dfcc8f478e5b6e9a34acdffd8e4d19b Log: [FIXED JENKINS-13154] bumping up further to -11. ... to fix a possible race condition.

            jglick Jesse Glick
            t_kurki Teppo Kurki
            Votes:
            4 Vote for this issue
            Watchers:
            12 Start watching this issue

              Created:
              Updated:
              Resolved: