• Evergreen - Milestone 1

      See here and here

      GitPluginTest has been failing in the ATH since build 72

          [JENKINS-50790] GitPluginTest are failing in the ATH

          Raul Arabaolaza added a comment - - edited

          I have seen other environments where this has worked (with the exception of GitPluginTest#create_tag_for_build, which has been failing for a long time), so I am running this locally to see whether this is an infra problem or related to the tests.


          Raul Arabaolaza added a comment -

          It runs perfectly locally without using the ATH Docker container; trying with that.

          Raul Arabaolaza added a comment -

          It fails for me locally when using the ATH Docker image to run the tests.

          Raul Arabaolaza added a comment - - edited

          It seems the problem is a timeout when loading the available plugins page. I am going to debug locally with the Docker image to get an idea of what is happening, but it seems related not only to the git plugin but also to other plugins in the last ATH runs.

          Confirmed that it is failing when trying to install plugins via the @WithPlugins rule: after adding the MockUpdateCenter and clicking "Check now", it tries to visit the available plugins page, and that is where the timeout happens. The page seems to take too long to load. I still need to confirm that properly, but it looks like a performance issue.

          Raul Arabaolaza added a comment - - edited

          Confirmed the performance issue: by increasing the load timeout the tests pass. I have also checked that loading the list of available plugins takes a long time, which is the reason for the timeout. I am now checking whether I can reduce that load time, as I do not like increasing timeouts. The fact that it works perfectly fine outside the local container may indicate that extra resources are needed rather than a change in the code.

          This is probably impacting other tests as well.

          cc olivergondza

          Oliver Gondža added a comment -

          Right, a lot of the tests started to fail once moved from jenkins.ci.cloudbees.com to ci.jenkins.io (the ATH is run in a container now): https://ci.jenkins.io/job/Core/job/jenkins/job/stable-2.107/lastCompletedBuild/testReport/

          If you do have the resources to investigate and fix this, that would be great. If not, I am OK with increasing the timeout to get a reliable build.

          Raul Arabaolaza added a comment -

          I will try to investigate and fix this if possible, and will increase the timeout only as a last resort. Thanks!

          Raul Arabaolaza added a comment -

          Trying to use `ElasticTime.factor` to increase the timeout without touching the Java code. If this works, I am going to create a PR to the ath-container script to make sure the tests in the container are run with the appropriate factor.
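The idea behind this knob can be sketched as follows. This is an illustrative sketch, not the ATH's actual `ElasticTime` class: it assumes the factor is read from the `ElasticTime.factor` JVM system property (as the comment above suggests) and that every timeout in the harness is routed through one scaling method, so a single property stretches all waits uniformly on slow environments.

```java
import java.time.Duration;

public class ElasticTimeSketch {
    final long factor;

    ElasticTimeSketch(long factor) {
        this.factor = factor;
    }

    // Property name assumed from the comment above; 1 (no stretching) is the default.
    static ElasticTimeSketch fromSystemProperty() {
        return new ElasticTimeSketch(Long.getLong("ElasticTime.factor", 1L));
    }

    // Every timeout would be routed through this, so one property scales all waits.
    Duration stretch(Duration base) {
        return base.multipliedBy(factor);
    }

    public static void main(String[] args) {
        System.setProperty("ElasticTime.factor", "3");
        // A nominal 10-second wait becomes 30 seconds with factor 3.
        System.out.println(ElasticTimeSketch.fromSystemProperty().stretch(Duration.ofSeconds(10)).getSeconds());
    }
}
```

The trade-off Oliver raises below follows directly from this design: the factor is global, so it stretches every wait, not just the one slow page.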

          Oliver Gondža added a comment -

          Do we have an idea whether everything is slowed down, or just the plugin page? That would suggest whether your approach is the correct one or not...

          Raul Arabaolaza added a comment -

          ElasticTime.factor seems to fix things for me locally. Also, GitPluginTest#create_tag_for_build is failing simply because there is no git username and email defined inside the container.

          I believe I can do a PR to solve this once I get rid of some urgent things, so probably nothing until tomorrow or the day after.

          Raul Arabaolaza added a comment -

          olivergondza You are right. For the moment the only slowness I have found is in the plugin management page, in the Available section, so increasing all timeouts via ElasticTime.factor is probably not the best idea. I am going to just increase the page load timeout temporarily on that part and see if this still works; if it doesn't, I will go back to the ElasticTime.factor approach.

          Raul Arabaolaza added a comment -

          It seems the PR went fine and fixed not only GitPluginTest but also other tests affected by the same problem. Waiting for the PR to be merged before closing this.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Raul Arabaolaza
          Path:
          src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
          src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
          http://jenkins-ci.org/commit/acceptance-test-harness/012364d1a5f6386fab49198c16d00e0cd309a7d9
          Log:
          JENKINS-50790 Temporarily override pageLoadTimeout for available plugins

          This should fix most of GitPluginTest and some others that are failing due to the
          available plugins taking too much time to load on ci.jenkins.io
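The pattern the commit describes, raising the page-load timeout only around the slow "available plugins" navigation and restoring it afterwards, can be sketched in plain Java. The class and method names below are hypothetical stand-ins, not the ATH's actual `PluginManager`/`FallbackConfig` API, and the 30-second default is illustrative:

```java
import java.time.Duration;
import java.util.function.Supplier;

public class ScopedTimeout {
    private Duration pageLoadTimeout = Duration.ofSeconds(30); // illustrative default

    Duration current() {
        return pageLoadTimeout;
    }

    // Run one navigation with a raised timeout, always restoring the previous
    // value afterwards, even if the navigation throws.
    <T> T withPageLoadTimeout(Duration temporary, Supplier<T> navigation) {
        Duration previous = pageLoadTimeout;
        pageLoadTimeout = temporary;
        try {
            return navigation.get();
        } finally {
            pageLoadTimeout = previous;
        }
    }

    public static void main(String[] args) {
        ScopedTimeout po = new ScopedTimeout();
        String page = po.withPageLoadTimeout(Duration.ofMinutes(4),
                () -> "available plugins loaded");
        System.out.println(page);
        System.out.println(po.current().getSeconds()); // back to the 30s default
    }
}
```

The try/finally restore is the important part: only the one known-slow navigation gets the generous timeout, so genuine hangs elsewhere still fail fast.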

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Raul Arabaolaza
          Path:
          src/test/java/plugins/GitPluginTest.java
          http://jenkins-ci.org/commit/acceptance-test-harness/81589c3da0b5788917dfc2faaaec78b89c4edf27
          Log:
          JENKINS-50790 Fix GitPluginTest#create_tag_for_build

          The Docker container where the ATH is run does not provide a global git
          user or email, so I added the Custom Name and Email option to the test so
          it can properly tag things

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Raul Arabaolaza
          Path:
          src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
          src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
          http://jenkins-ci.org/commit/acceptance-test-harness/7ba7a3983ac1d09da5ded9ec134f205e7383782c
          Log:
          JENKINS-50790 Address feedback

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oliver Gondža
          Path:
          src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
          src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
          src/test/java/plugins/GitPluginTest.java
          http://jenkins-ci.org/commit/acceptance-test-harness/dad333092159cb368efc2f9869572f0a05d255ac
          Log:
          Merge pull request #428 from raul-arabaolaza/JENKINS-50790

          JENKINS-50790 Temporarily override pageLoadTimeout for available plugins

          Compare: https://github.com/jenkinsci/acceptance-test-harness/compare/ab1fb4738cb0...dad333092159

          Isa Vilacides added a comment -

          Months ago I tried to fix the same problem by increasing the timeout, and it didn't solve all the timeout issues; see https://github.com/jenkinsci/acceptance-test-harness/pull/408/files. I also see on your PR that, after increasing the timeout, there are still tests failing with the same error, so I think the root cause of the long page load has not been determined and we haven't really tackled it. While investigating, I realised that there is a JS error on the "Available" tab that might be preventing the page from loading and causing all these issues.

          It could also be that the click on the Available tab happens before the Advanced tab has finished loading; it seems webdriver.get returns as soon as the page fires the load event, but the page hasn't actually finished loading.

          Raul Arabaolaza added a comment -

          See here: it seems that the ATH is still flaky, I suspect due to load.

          Raul Arabaolaza added a comment - - edited

          jglick As you are the one who did the original implementation of MockUpdateCenter, can you think of a reason why it could suffer a performance degradation under core >= 2.112 with heavy load?

          Raul Arabaolaza added a comment -

          The delay in rendering the available plugins page is obvious even when running locally (not in the ATH, just starting the war file) with no plugins installed. That rendering time (which gets worse under heavy load) is the root cause of this issue.

          Raul Arabaolaza added a comment -

          I have run a small experiment on load time for the available plugins. When running the ATH with the MockUpdateCenter, the call to get all available plugins oscillates between 25 and 56 seconds locally; without the MockUpdateCenter it takes less than 10 seconds.

          I am starting to believe that the only way to solve this (as I cannot keep increasing the page load timeout forever) is to use Groovy scripts to install plugins instead of doing it directly through the UI.

          Raul Arabaolaza added a comment -

          I am experimenting with skipping the available plugins page and just requesting the installation via a REST call; initial experiments look promising.
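For illustration, one way such a REST-triggered install can work: Jenkins exposes a POST endpoint, `pluginManager/installNecessaryPlugins`, that accepts an XML body naming the plugin(s) to install and lets the update center resolve dependencies. The sketch below only builds that payload; no live Jenkins is assumed, and this is not necessarily the exact call used here.

```java
public class PluginInstallRequest {
    // Builds the XML body accepted by Jenkins' pluginManager/installNecessaryPlugins
    // endpoint; "latest" resolves against the configured update center.
    static String payload(String shortName, String version) {
        return "<jenkins><install plugin=\"" + shortName + "@" + version + "\"/></jenkins>";
    }

    public static void main(String[] args) {
        // POST this body to <jenkins-url>/pluginManager/installNecessaryPlugins
        // with Content-Type: text/xml (plus a CSRF crumb if security is enabled).
        System.out.println(payload("git", "latest"));
    }
}
```

This sidesteps rendering the Available page entirely, which is why it avoids the timeout, at the cost of not exercising the UI path the tests were originally written against.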

          Jesse Glick added a comment -

          There have been past attempts to load plugins via REST rather than the GUI, and that code is still there; you just need a switch to activate it. It does not work all that well, as the plugin manager has some subtle logic that is not easily replicated. I would rather track down the cause of the delays and fix them. Run a profiler or whatever. Is the problem in the JavaScript or in the mock UC service itself?

          Raul Arabaolaza added a comment -

          What I can see is that the HTTP call to pluginManager/available, performed by the UI when clicking the Available tab, is the one that takes too long. My understanding is that the endpoint calls the mock UC under the covers; I have checked the same call on an instance run via java -jar (outside the ATH) and it takes much less time, ~10 seconds.

          Raul Arabaolaza added a comment -

          Maybe you are referring to this code path? I am not going that way: I am making a request to trigger the installation via the UC instead of using the buttons in the Available Plugins page, letting it handle dependencies and all that. I am not sure about versioning constraints at this point, however.

          Raul Arabaolaza added a comment -

          After a conversation with jglick it seems the problem is in core itself since 2.112: the addition of detached plugins causes an exponential explosion in a recursive call. He is working on a fix; in the meantime a temporary increase of the timeout should be enough to make everything stable again.

          Raul Arabaolaza added a comment -

          Not sure the fix in JENKINS-51205 is enough: I am still receiving email alerts about instability on git-plugin. However, at this moment ci.jenkins.io is down, so I cannot check the status.

          Raul Arabaolaza added a comment -

          It seems it worked. I am going to keep this open for a couple of days and close it if nothing new arises.

          Jesse Glick added a comment -

          Makes sense.


          Raul Arabaolaza added a comment -

          The last builds of git-plugin have no failures, so I am going to close this.

          Raul Arabaolaza added a comment -

          The fix for JENKINS-51205 solved this problem.
