Jenkins / JENKINS-54999

Performance issue due to the bundle anonymization feature of the support-core plugin

      The CPU usage is peaking when filtering is enabled.

      <com.cloudbees.jenkins.support.filter.ContentFilters plugin="support-core@2.50">
        <enabled>true</enabled>
      </com.cloudbees.jenkins.support.filter.ContentFilters>
      

      The logs show stack traces related to this anonymization:

          at java.util.regex.Pattern$Start.match(Pattern.java:3463)
          at java.util.regex.Matcher.search(Matcher.java:1248)
          at java.util.regex.Matcher.find(Matcher.java:637)
          at java.util.regex.Matcher.replaceAll(Matcher.java:951)
          at com.cloudbees.jenkins.support.filter.ContentMapping.filter(ContentMapping.java:96)
          at com.cloudbees.jenkins.support.filter.SensitiveContentFilter.filter(SensitiveContentFilter.java:56)
          at com.cloudbees.jenkins.support.filter.AllContentFilters.filter(AllContentFilters.java:43)
          at com.cloudbees.jenkins.support.filter.FilteredOutputStream.filterFlushLines(FilteredOutputStream.java:185)
          at com.cloudbees.jenkins.support.filter.FilteredOutputStream.write(FilteredOutputStream.java:125)
      

      Workaround
      Disable bundle anonymization in the global settings.


          Francisco Fernández created issue -
          Francisco Fernández made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          p young made changes -
          Priority Original: Critical [ 2 ] New: Major [ 3 ]
          Francisco Fernández made changes -
          Link New: This issue is related to JENKINS-21670 [ JENKINS-21670 ]

          Francisco Fernández added a comment -

          After analysing the issue, I've seen that CPU usage is high even in support-core-2.47, the last version before anonymization was introduced, because it is moving huge files. Anonymization means processing huge files such as logs. The difference is the processing time needed to anonymize those files. The process runs line by line and applies several filters to each line of each log and file, so anonymization takes a long time. Such a heavy process running for so long makes the instances unresponsive. Where can we make changes to improve performance?

          1. InetAddressContentFilter searches for any String that could be an IP address and creates a new mapping object. Even when the mapping already exists, the ContentMappings class saves the mapping file every time, not only when the mapping is new. That is a lot of unnecessary disk writes.
          2. Both InetAddressContentFilter and SensitiveContentFilter use the method ContentMapping#filter, which executes Pattern#matcher and therefore compiles the pattern every time the filter is executed. This is a heavy operation, and because the filter is executed line by line, the process takes a long time. Since the mapping already holds the replacement String, we can replace the Pattern objects with StringUtils invocations.
          3. As the logs show, the filtering takes place inside the OutputStream objects (FilteredOutputStream and FilteredWriter) that write the content of the files into the final zip file. It is less expensive (in terms of processing time) to filter the content before writing it to the zip file. The more streams are involved, the worse the performance.
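
Points 1 and 2 of the comment above can be illustrated with a small, self-contained sketch. It is not the plugin's code: `MappingStoreSketch`, `getOrCreate`, `replaceLiteral`, the `saveCount` counter and the `ip_N` alias scheme are hypothetical names made up for this example. It assumes that a mapping only needs to be persisted when it is first created, and that a plain substring replacement is an acceptable substitute for the regex-based one. A hand-written `indexOf` loop stands in for the StringUtils call mentioned in the comment so that the example needs no external library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a mapping store; NOT the real ContentMappings API. */
class MappingStoreSketch {
    // Insertion-ordered so that filtering is deterministic.
    private final Map<String, String> mappings = new LinkedHashMap<>();
    // Counts simulated writes of the mapping file.
    private int saveCount = 0;

    /** Returns the alias for a value, persisting only when a new mapping is created. */
    String getOrCreate(String original) {
        String existing = mappings.get(original);
        if (existing != null) {
            return existing; // already known: no disk write
        }
        String alias = "ip_" + (mappings.size() + 1); // made-up alias scheme
        mappings.put(original, alias);
        save();
        return alias;
    }

    /** Placeholder for writing the mapping file; here it only counts calls. */
    private void save() {
        saveCount++;
    }

    int getSaveCount() {
        return saveCount;
    }

    /** Applies every known mapping to one line using literal replacement only. */
    String filter(String line) {
        String result = line;
        for (Map.Entry<String, String> entry : mappings.entrySet()) {
            result = replaceLiteral(result, entry.getKey(), entry.getValue());
        }
        return result;
    }

    /** Replaces all occurrences of target in text without using java.util.regex. */
    static String replaceLiteral(String text, String target, String replacement) {
        if (target.isEmpty()) {
            return text;
        }
        int idx = text.indexOf(target);
        if (idx < 0) {
            return text; // fast path: nothing to replace, no allocation
        }
        StringBuilder sb = new StringBuilder(text.length());
        int start = 0;
        while (idx >= 0) {
            sb.append(text, start, idx).append(replacement);
            start = idx + target.length();
            idx = text.indexOf(target, start);
        }
        sb.append(text, start, text.length());
        return sb.toString();
    }
}
```

One caveat with this approach: a literal replacement has no notion of word boundaries, so a mapping for `10.0.0.1` would also rewrite the prefix of `10.0.0.10`. A real change would need to handle overlapping values, for example by applying longer keys first.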
          Francisco Fernández made changes -
          Status Original: In Progress [ 3 ] New: In Review [ 10005 ]
          Francisco Fernández made changes -
          Remote Link New: This issue links to "PR#158 (Web Link)" [ 22077 ]

            fcojfernandez Francisco Fernández