
VERY poor performance with large test history when viewing results.

      It takes about 20 seconds to display any part of the test report when the JSON file is big enough and there are many previous builds.

          [JENKINS-24457] VERY poor performance with large test history when viewing results.

          James Nord added a comment -

          Can you be specific on exactly what is slow?

          Parsing the results in the build?
          Displaying a feature in the ui?
          Something else?

          If it is the display in the UI, then there is a 90% chance your JVM heap is too small. Try increasing it and report back.

          If not, please attach a JSON file that is slow.

          We observe instant display on a correctly configured system with what I would consider large features.

          Anatoly Bubenkov added a comment -

          Displaying in the UI is what is super slow; parsing looks OK.
          I've attached the file. The JVM heap size cannot easily be increased because it's already quite optimal...

          James Nord added a comment - - edited

          That file is small.
          I really urge you to look at jvmstat when trying to display, as I bet you will see lots of GC activity.

          The plugin loads previous results, but Jenkins stores these in a weak reference, so they are the first things to go. When memory is low, loading them causes other results to be GCed, so Jenkins ends up fighting the GC.

          I will run them through my system, but I am expecting instant display.

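The weak-reference behavior described above can be illustrated with a minimal Java sketch. This is not the plugin's or Jenkins's actual code: the class `WeakResultCache`, its `loader`, and `simulateGcClear()` are hypothetical names used only to show why a cleared reference forces a reload.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

/** Hypothetical stand-in for a result holder; not the plugin's actual class. */
class WeakResultCache<T> {
    private final Supplier<T> loader;   // assumed to read and parse the stored result
    private WeakReference<T> ref = new WeakReference<>(null);
    private int loadCount = 0;

    WeakResultCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached result, reloading it if the reference has been cleared. */
    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            value = loader.get();       // the expensive step: disk read plus parse
            loadCount++;
            ref = new WeakReference<>(value);
        }
        return value;
    }

    /** Simulates what the collector may do to a weakly reachable result. */
    synchronized void simulateGcClear() {
        ref.clear();
    }

    synchronized int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        WeakResultCache<String> cache =
                new WeakResultCache<>(() -> new String("parsed result"));
        String held = cache.get();      // first access loads the result
        cache.get();                    // 'held' keeps it strongly reachable: no reload
        cache.simulateGcClear();        // under memory pressure the referent is dropped
        cache.get();                    // so the next access has to load again
        System.out.println("loads: " + cache.loadCount() + ", held: " + held);
    }
}
```

In a real JVM the clearing is done by the collector rather than by an explicit call, so with a tight heap every access to an older result may turn into a fresh load.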

          James Nord added a comment -

          If I set a small enough heap I can make a machine spin forever trying to display the results.
          The same is true to some extent for junit results - but to a lesser degree as the files it loads are smaller.

          Anatoly Bubenkov added a comment -

          Ah, OK!
          Then which size would you recommend in our situation?

          http://pastebin.com/pxANWJBE

          Anatoly Bubenkov added a comment - - edited

          And memory: http://pastebin.com/1cDeRysb

          James Nord added a comment -

          This file displays instantly on my system.

          I suggest you buy more RAM or upgrade the VM: 4 GB is on the small side for a server, and you only have 200 free.
          I would only suggest tuning for more young space at the expense of the old if you cannot add more RAM.
          99% usage of the eden space is not so good.

          As an aside, the results are missing timing information. You may want to file a defect against the cucumber-xxx version you are using.

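For readers who want to see the kind of numbers discussed here (eden usage, GC activity), the following is a small sketch using the standard `java.lang.management` API. It reports on the JVM it runs in, so it only illustrates the calls; for a running Jenkins process an external tool such as `jstat -gcutil <pid>` gives a similar view. The class name `GcSnapshot` and the helper `percentUsed` are illustrative names, not part of any plugin.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class GcSnapshot {
    /** Percentage of a pool in use, or -1 when the pool reports no maximum. */
    static long percentUsed(MemoryUsage usage) {
        long max = usage.getMax();
        return max > 0 ? (usage.getUsed() * 100) / max : -1;
    }

    public static void main(String[] args) {
        // One line per memory pool (eden, survivor, old generation, ...).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-30s used=%d KB, %d%% of max%n",
                    pool.getName(), usage.getUsed() / 1024, percentUsed(usage));
        }
        // Cumulative collection counts and times since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-30s collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

A collection count that climbs quickly while a report page is loading would be consistent with the GC churn described in this thread.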

          James Nord added a comment -

          As per above: not enough free memory, causing GC churn on the system.

          Anatoly Bubenkov added a comment -

          Thanks for all your help!
          The results are missing timing because producing them is still a work in progress. We are generating them from pytest-bdd:
          https://github.com/olegpidsadnyi/pytest-bdd#reporting

          James Nord added a comment -

          ahh cool.

          Daniel Beck added a comment -

          Doesn't the free output show that there's plenty of free RAM? The OS would give up caches to applications immediately if needed...

          Anatoly Bubenkov added a comment -

          Well, I tried increasing the heap size and also the young/old ratio; still the same issue.

          -Djava.awt.headless=true -Xmx1512m -XX:MaxPermSize=1024m -XX:NewRatio=1 -XX:SurvivorRatio=6

          is what I have now.
          Any suggestions on how to tune it better?

          Daniel Beck added a comment -

          Is this a 32 bit JRE?

          Anatoly Bubenkov added a comment -

          anatoly@ci-server:~$ java -version
          java version "1.7.0_67"
          Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
          Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

          Anatoly Bubenkov added a comment -

          @teilo, are you sure there's no issue in the UI code?
          It looks really strange, since we had no issues with other reports...

          Anatoly Bubenkov added a comment -

          Root report view:

          wget 0.00s user 0.00s system 0% cpu 2:40.45 total

          It's insanely slow...

          Anatoly Bubenkov added a comment -

          I was initially wrong about 20 sec; it's much slower.

          James Nord added a comment -

          On my server, using the supplied JSON:

          time curl http://myserver/job/tests/job/JENKINS-24457/1/testReport/

          real 0m0.290s
          user 0m0.015s
          sys 0m0.046s

          So yes, I think the code is OK.

          Try setting initial heap == max heap (or something higher).

          The VM doesn't always expand the heap when a GC frees memory.

          I see exactly this issue in the unit tests if the VM doesn't have enough heap. It is caused by the GC removing the test result that the plugin needs in order to render, so each and every scenario requires the test result to be loaded again and again and again and again...

          @danielbeck - yes, there is free server RAM if you throw away your disk cache, but for something heavy on disk like Jenkins that's not a good idea in my experience.

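As a rough way to check the "initial heap == max heap" advice, the sketch below reads the JVM's own startup arguments through the standard `RuntimeMXBean` and compares the `-Xms` and `-Xmx` values. The class `HeapFlags` and its helpers are hypothetical names, and the comparison is purely textual, so `-Xms1g` and `-Xmx1024m` would be reported as different even though they are the same size.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class HeapFlags {
    /** Returns the value of the last argument starting with the prefix, or null. */
    static String flagValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    /** True only when both -Xms and -Xmx are present with the same textual value. */
    static boolean heapIsPinned(List<String> jvmArgs) {
        String xms = flagValue(jvmArgs, "-Xms");
        String xmx = flagValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excludes main-method args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xms: " + flagValue(jvmArgs, "-Xms"));
        System.out.println("-Xmx: " + flagValue(jvmArgs, "-Xmx"));
        System.out.println("initial heap pinned to max: " + heapIsPinned(jvmArgs));
    }
}
```

Applied to the flags quoted earlier in this thread, which set `-Xmx1512m` without any `-Xms`, `heapIsPinned` returns false; adding `-Xms1512m` would make it return true.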

          Anatoly Bubenkov added a comment -

          OK, now it gives 15.012 total for the root report
          and 5.042 total for the scenario report.
          Still not so good, but thanks!

          James Nord added a comment -

          Will attempt to reproduce.

          Anatoly Bubenkov added a comment -

          Thanks!
          And sorry for being annoying.

          James Nord added a comment -

          Comment from reporter:

          "This only happens if there is a large (100+) build history"

          This may indicate a recursive load issue; re-opened to investigate.

          Vitaliy Skrypnyk added a comment -

          I can share the result of our investigation.

          The poor performance of the Cucumber Test Result Plugin is caused by the logic for displaying test results.
          Each time a user visits the Cucumber Test Result page, the plugin loads previous build results (to calculate the age of failed tests).
          If the current build succeeded, the plugin loads only one previous result, but if the build failed, the plugin loads previous results recursively to calculate each scenario's fail age.

          This behavior has several issues:
          1. The getResult() method is synchronized (this means that only one user at a time can load the report for a job).
          2. The plugin tries to cache the test result in a WeakReference class field.
          3. The fail age calculation is not optimized. The plugin calculates the fail age separately for each scenario. This means that for 10 failed scenarios in one build, the plugin will try to open (or read from a cached field) the XML results 10 times.

          You can see a screenshot from VisualVM here. When I tried to open a report with 10 failed scenarios in a row, the Cucumber plugin used ~200 MB and ~20% of CPU for GC. (In total we have 868 scenarios.)
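To make point 3 concrete, here is a hedged sketch of one possible mitigation: keeping each previous build's result strongly referenced for the duration of a single page request, so that several failed scenarios share one load per build. This is not the plugin's code. `FailAgeCalculator`, its `loader`, and the simplified model (a build's result is just the set of failed scenario names, and builds are numbered from 1) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.IntFunction;

class FailAgeCalculator {
    // Hypothetical loader: failed scenario names for a build, or null if no result exists.
    private final IntFunction<Set<String>> loader;
    // Results already loaded during this request, held strongly so they are not reloaded.
    private final Map<Integer, Set<String>> loaded = new HashMap<>();
    private int loadCount = 0;

    FailAgeCalculator(IntFunction<Set<String>> loader) {
        this.loader = loader;
    }

    private Set<String> failedIn(int build) {
        if (!loaded.containsKey(build)) {
            loaded.put(build, loader.apply(build)); // the expensive load happens once per build
            loadCount++;
        }
        return loaded.get(build);
    }

    /** Number of consecutive builds, ending at 'build', in which the scenario failed. */
    int failAge(String scenario, int build) {
        int age = 0;
        for (int b = build; b >= 1; b--) {
            Set<String> failed = failedIn(b);
            if (failed == null || !failed.contains(scenario)) {
                break;
            }
            age++;
        }
        return age;
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> history = new HashMap<>();
        history.put(1, new HashSet<>(Arrays.asList("A")));
        history.put(2, new HashSet<>(Arrays.asList("A", "B")));
        history.put(3, new HashSet<>(Arrays.asList("A", "B")));

        FailAgeCalculator calc = new FailAgeCalculator(b -> history.get(b));
        System.out.println("age of A: " + calc.failAge("A", 3));
        System.out.println("age of B: " + calc.failAge("B", 3));
        System.out.println("loads: " + calc.loadCount());
    }
}
```

The memoization only pays off when one page shows several failed scenarios. A view of a single scenario still has to walk back through its own history, so this does not remove the need for per-scenario calculation.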

          James Nord added a comment -

          vitaliy_skrypnyk, to be clear: you are 100% correct, and this is by design. There is no guarantee that previous builds have finished when a later build completes. This is the same in the JUnit plugin.

          The slow behavior is due to memory pressure. If you had more memory available, the prior results would be in memory and would not need to be loaded and parsed, although there is potential for some things to be improved in this area.

          > Cucumber plugin calculates fail age separately for each scenario

          This is still required, as you do not need to view all scenarios; they can be viewed individually, so calculating this for all scenarios when you are viewing a single scenario would be equally sub-optimal.


            Assignee: Unassigned
            Reporter: Anatoly Bubenkov (bubenkoff)
            Votes: 0
            Watchers: 5