Jenkins / JENKINS-50790

GitPluginTest are failing in the ATH

    Details

    • Sprint:
      Evergreen - Milestone 1

      Description

      See here and  here

      GitPluginTest has been failing in the ATH since build number 72.

            Activity

            rarabaolaza Raul Arabaolaza created issue -
            rarabaolaza Raul Arabaolaza made changes -
            Field Original Value New Value
            Epic Link JENKINS-50534 [ 189601 ]
            rarabaolaza Raul Arabaolaza added a comment - - edited

            I have seen other environments where this works (with the exception of GitPluginTest#create_tag_for_build, which has been failing for a long time), so I am running this locally to see whether this is an infra problem or related to the tests.

            rtyler R. Tyler Croy made changes -
            Assignee R. Tyler Croy [ rtyler ] Raul Arabaolaza [ rarabaolaza ]
            rarabaolaza Raul Arabaolaza made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            rarabaolaza Raul Arabaolaza added a comment -

            It runs perfectly locally without the ATH Docker container; trying with the container next.

            rarabaolaza Raul Arabaolaza added a comment -

            It fails for me locally when using the ATH Docker image to run the tests.

            rarabaolaza Raul Arabaolaza made changes -
            Summary GitPluginTest is failing in the ATH GitPluginTest are failing in the ATH
            rarabaolaza Raul Arabaolaza added a comment - - edited

            The problem seems to be a timeout while loading the available plugins page. I am going to debug locally with the Docker image to get an idea of what is happening, but judging by the last ATH runs it affects not only the git plugin but other plugins as well.

            Confirmed: it fails when trying to install plugins via the @WithPlugins rule. After adding the MockUpdateCenter and clicking "Check now", the test tries to visit the available plugins page, and that is where the timeout happens. The page seems to take too long to load; I still need to confirm that properly, but it looks like a performance issue.

            rarabaolaza Raul Arabaolaza added a comment - - edited

            Confirmed the performance issue: with an increased page load timeout the tests pass. I have also checked that loading the list of available plugins takes a long time, which is the reason for the timeout. I am now checking whether I can reduce that load time, as I do not like increasing timeouts. The fact that it works perfectly fine locally without the container may indicate that extra resources are needed rather than a change in the code.

            This is probably impacting other tests as well.

            cc Oliver Gondža

            olivergondza Oliver Gondža added a comment -

            Right, a lot of the tests started to fail once the job moved from jenkins.ci.cloudbees.com to ci.jenkins.io (the ATH is run in a container now): https://ci.jenkins.io/job/Core/job/jenkins/job/stable-2.107/lastCompletedBuild/testReport/.

            If you do have the resources to investigate and fix it, that would be great. If not, I am OK with increasing the timeout to get a reliable build.

            rarabaolaza Raul Arabaolaza added a comment -

            I will try to investigate and fix it if possible, and will increase the timeout only as a last resort. Thanks!

            rarabaolaza Raul Arabaolaza added a comment -

            I am trying to use `ElasticTime.factor` to increase the timeout without touching the Java code. If this works, I will create a PR against the ath-container script so that the tests in the container run with an appropriate factor.
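
For readers unfamiliar with the idea, scaling every timeout by one externally supplied factor can be sketched roughly as below. This is a simplified illustration, not the harness's own class: the property name `ElasticTime.factor` is taken from the comment above, while the class name `ElasticTimeSketch`, the `scale` method and the default of 1.0 are assumptions made for the example.

```java
import java.time.Duration;

// Simplified sketch; NOT the acceptance-test-harness implementation.
final class ElasticTimeSketch {

    // Reads the factor from a system property. The property name comes from the
    // comment above; falling back to 1.0 when it is unset is an assumption.
    static double factor() {
        return Double.parseDouble(System.getProperty("ElasticTime.factor", "1.0"));
    }

    // Scales a nominal timeout by the configured factor.
    static Duration scale(Duration nominal) {
        return Duration.ofMillis(Math.round(nominal.toMillis() * factor()));
    }

    public static void main(String[] args) {
        // With -DElasticTime.factor=2 this prints a duration twice as long.
        System.out.println(scale(Duration.ofSeconds(30)));
    }
}
```

Because the factor applies to every timeout that goes through `scale`, it is a blunt instrument: it helps when the whole environment is slow, and is wasteful when only a single page is.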

            olivergondza Oliver Gondža added a comment -

            Do we have an idea if everything is slowed down or just the plugin page? That would suggest if your approach is the correct one or not...

            rarabaolaza Raul Arabaolaza added a comment -

            Setting ElasticTime.factor seems to fix things for me locally. Also, GitPluginTest#create_tag_for_build is failing simply because no git user name and email are defined inside the container.

            I believe I can open a PR to solve this once I have dealt with some urgent work, so probably nothing until tomorrow or the day after.

            rarabaolaza Raul Arabaolaza added a comment -

            Oliver Gondža, you are right: so far the only slowness I have found is in the Available section of the plugin management page, so increasing all timeouts via ElasticTime.factor is probably not the best idea. I am going to temporarily increase the page load timeout for just that part and see whether that still works; if it doesn't, I will stay with the ElasticTime.factor approach.

            rarabaolaza Raul Arabaolaza made changes -
            Remote Link This issue links to "PR-428 (Web Link)" [ 20447 ]
            rarabaolaza Raul Arabaolaza made changes -
            Status In Progress [ 3 ] In Review [ 10005 ]
            rarabaolaza Raul Arabaolaza added a comment -

            It seems the PR went fine and fixed not only GitPluginTest but also other tests affected by the same problem. I am waiting for the PR to be merged before closing this.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Raul Arabaolaza
            Path:
            src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
            src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
            http://jenkins-ci.org/commit/acceptance-test-harness/012364d1a5f6386fab49198c16d00e0cd309a7d9
            Log:
            JENKINS-50790 Temporarily override pageLoadTimeout for available plugins

            This should fix most of GitPluginTest and some others that are failing due to the
            available plugins taking too much time to load on ci.jenkins.io
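
As an illustration of the approach named in this commit message (not the code from the commit itself), a temporary override usually follows a set, run, restore pattern. In the sketch below, `PageLoadTimeouts`, `InMemoryTimeouts` and `TemporaryTimeout` are hypothetical names standing in for whatever the browser driver exposes; none of them is the Selenium or harness API.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical stand-in for a browser driver's page load timeout setting.
interface PageLoadTimeouts {
    Duration get();
    void set(Duration timeout);
}

// Trivial in-memory implementation so the sketch runs without a browser.
final class InMemoryTimeouts implements PageLoadTimeouts {
    private Duration current;
    InMemoryTimeouts(Duration initial) { this.current = initial; }
    public Duration get() { return current; }
    public void set(Duration timeout) { this.current = timeout; }
}

final class TemporaryTimeout {

    // Runs the action with a raised page load timeout and restores the previous
    // value afterwards, even if the action throws.
    static <T> T withPageLoadTimeout(PageLoadTimeouts timeouts, Duration temporary, Supplier<T> action) {
        Duration previous = timeouts.get();
        timeouts.set(temporary);
        try {
            return action.get();
        } finally {
            timeouts.set(previous);
        }
    }

    public static void main(String[] args) {
        InMemoryTimeouts timeouts = new InMemoryTimeouts(Duration.ofSeconds(30));
        // Stand-in for "visit the available plugins page".
        Duration during = withPageLoadTimeout(timeouts, Duration.ofSeconds(240), timeouts::get);
        System.out.println("during: " + during + ", after: " + timeouts.get());
    }
}
```

Restoring in `finally` keeps the longer timeout from leaking into the rest of the test run, which is what makes the override local to the one slow page.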

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Raul Arabaolaza
            Path:
            src/test/java/plugins/GitPluginTest.java
            http://jenkins-ci.org/commit/acceptance-test-harness/81589c3da0b5788917dfc2faaaec78b89c4edf27
            Log:
            JENKINS-50790 Fix GitPluginTest#create_tag_for_build

            The docker container where the ATH is run does not provide a global git
            user or mail, so I added the Custom Name and Mail option to the test so
            it can properly tag things

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Raul Arabaolaza
            Path:
            src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
            src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
            http://jenkins-ci.org/commit/acceptance-test-harness/7ba7a3983ac1d09da5ded9ec134f205e7383782c
            Log:
            JENKINS-50790 Address feedback

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oliver Gondža
            Path:
            src/main/java/org/jenkinsci/test/acceptance/FallbackConfig.java
            src/main/java/org/jenkinsci/test/acceptance/po/PluginManager.java
            src/test/java/plugins/GitPluginTest.java
            http://jenkins-ci.org/commit/acceptance-test-harness/dad333092159cb368efc2f9869572f0a05d255ac
            Log:
            Merge pull request #428 from raul-arabaolaza/JENKINS-50790

            JENKINS-50790 Temporarily override pageLoadTimeout for available plugins

            Compare: https://github.com/jenkinsci/acceptance-test-harness/compare/ab1fb4738cb0...dad333092159

            rarabaolaza Raul Arabaolaza made changes -
            Resolution Fixed [ 1 ]
            Status In Review [ 10005 ] Resolved [ 5 ]
            vilacides Isa Vilacides made changes -
            vilacides Isa Vilacides added a comment -

            Months ago I tried to fix the same problem by increasing the timeout and it did not solve all the timeout issues; see the PR here: https://github.com/jenkinsci/acceptance-test-harness/pull/408/files. I also see on your PR that, after increasing the timeout, there are still tests failing with the same error, so I think the root cause of the slow page load has not been determined and we have not really tackled it. On investigation I realised that there is a JS error on the "Available" tab that might be preventing the page from loading and causing all these issues.

            It could also be that the click on the Available tab happens before the Advanced tab has finished loading: it seems that webdriver.get returns as soon as the page fires the load event, even though the page has not actually finished loading.
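
If the second hypothesis holds, the usual remedy is to wait for an explicit readiness condition rather than rely on the load event. The following is a minimal, driver-independent sketch of such a polling wait; `PollingWait` and `waitUntil` are names invented for this example, and the condition passed in `main` is only a placeholder for something like "the plugin table is present on the page".

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class PollingWait {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Placeholder condition: becomes true roughly 200 ms after start.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start > 200,
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("ready = " + ready);
    }
}
```

Unlike a larger page load timeout, a wait of this kind returns as soon as the condition is met, so it costs nothing on fast runs.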

            rarabaolaza Raul Arabaolaza added a comment -

            See here: it seems that the ATH is still flaky, I suspect due to load.

            rarabaolaza Raul Arabaolaza made changes -
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            rarabaolaza Raul Arabaolaza made changes -
            Sprint Essentials - Milestone 1 [ 511 ]
            rarabaolaza Raul Arabaolaza made changes -
            Link This issue blocks JENKINS-51008 [ JENKINS-51008 ]
            rarabaolaza Raul Arabaolaza added a comment - - edited

            Jesse Glick, as you did the original implementation of MockUpdateCenter, can you think of a reason why its performance would degrade under core >= 2.112 with heavy load?

            rarabaolaza Raul Arabaolaza added a comment -

            The delay in rendering the available plugins page is obvious even when running locally (not in the ATH, just starting the war file) with no plugins installed. That rendering time, which gets worse under heavy load, is the root cause of this issue.

            rarabaolaza Raul Arabaolaza added a comment -

            I have experimented a little with the load time for the available plugins. When running the ATH locally with the MockUpdateCenter, the call to get all available plugins takes between 25 and 56 seconds; without the MockUpdateCenter it takes less than 10 seconds.

            I am starting to believe that the only way to solve this (as I cannot keep increasing the page load timeout forever) is to use Groovy scripts to install plugins instead of going through the UI.

            rarabaolaza Raul Arabaolaza added a comment -

            I am experimenting with skipping the available plugins page and requesting the installation via a REST call instead; initial experiments seem promising.

            rarabaolaza Raul Arabaolaza made changes -
            Status Reopened [ 4 ] In Progress [ 3 ]
            jglick Jesse Glick added a comment -

            There have been past attempts to load plugins via REST rather than GUI, and that code is still there; you just need a switch to activate it. It does not work all that well as the plugin manager has some subtle logic which is not easily replicated. I would rather just track down the cause of delays and fix them. Run a profiler or whatever. Is the problem in JavaScript or the mock UC service itself?

            rarabaolaza Raul Arabaolaza added a comment -

            What I can see is that the HTTP call to pluginManager/available, performed by the UI when clicking on the Available tab, is the one that takes too long. My understanding is that the endpoint calls the mock UC under the covers, as I have checked the same call against an instance run via java -jar (outside the ATH) and it takes much less time, about 10 seconds.

            rarabaolaza Raul Arabaolaza added a comment -

            Maybe you are referring to this code path? I am not going that way. I am making a request that triggers the installation via the UC instead of using the buttons on the `Available Plugins Page`, and letting it handle dependencies and all that. I am not sure about versioning constraints at this point, however.

            rarabaolaza Raul Arabaolaza added a comment -

            So, after a conversation with Jesse Glick, it seems the problem has been in core itself since 2.112: the addition of detached plugins causes an exponential explosion in a recursive call. He is working on a fix; in the meantime, a temporary increase in the timeout should be enough to make everything stable again.
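
To show how a recursive dependency walk can blow up exponentially, here is a generic sketch. It is not the Jenkins core code and says nothing about how JENKINS-51205 was actually fixed; the graph shape, the plugin names `p0`, `p1`, ... and the class `DependencyWalk` are all made up for the example, and the graph is assumed to be acyclic. The naive walk revisits shared dependencies once per path that reaches them, while the memoized walk computes each plugin's transitive set once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DependencyWalk {
    static long naiveCalls = 0;   // number of recursive invocations of naive()
    static long memoCalls = 0;    // number of cache misses in memoized()

    // Naive transitive closure: re-walks a shared dependency for every path to it.
    static Set<String> naive(String plugin, Map<String, List<String>> deps) {
        naiveCalls++;
        Set<String> result = new TreeSet<>();
        for (String d : deps.getOrDefault(plugin, Collections.<String>emptyList())) {
            result.add(d);
            result.addAll(naive(d, deps));
        }
        return result;
    }

    // Memoized variant: each plugin's transitive set is computed at most once.
    static Set<String> memoized(String plugin, Map<String, List<String>> deps,
                                Map<String, Set<String>> cache) {
        Set<String> cached = cache.get(plugin);
        if (cached != null) {
            return cached;
        }
        memoCalls++;
        Set<String> result = new TreeSet<>();
        for (String d : deps.getOrDefault(plugin, Collections.<String>emptyList())) {
            result.add(d);
            result.addAll(memoized(d, deps, cache));
        }
        cache.put(plugin, result);
        return result;
    }

    // Builds an acyclic graph in which p(i) depends on p(i+1) and p(i+2).
    static Map<String, List<String>> chain(int n) {
        Map<String, List<String>> deps = new HashMap<>();
        for (int i = 0; i < n; i++) {
            List<String> out = new ArrayList<>();
            if (i + 1 < n) out.add("p" + (i + 1));
            if (i + 2 < n) out.add("p" + (i + 2));
            deps.put("p" + i, out);
        }
        return deps;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = chain(20);
        naive("p0", graph);
        memoized("p0", graph, new HashMap<String, Set<String>>());
        System.out.println("naive calls: " + naiveCalls + ", memoized computations: " + memoCalls);
    }
}
```

Both walks return the same set; only the amount of repeated work differs, and in this graph shape the naive count grows Fibonacci-style with the number of plugins.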

            jglick Jesse Glick made changes -
            Link This issue depends on JENKINS-51205 [ JENKINS-51205 ]
            rarabaolaza Raul Arabaolaza added a comment -

            I am not sure whether the fix in JENKINS-51205 is enough: I am still receiving email alerts about instability on git-plugin. However, ci.jenkins.io is down at the moment, so I cannot check the status.

            rarabaolaza Raul Arabaolaza added a comment -

            It seems it worked. I am going to keep this open for a couple of days and close it if nothing new arises.

            jglick Jesse Glick added a comment -

            Makes sense.

            rarabaolaza Raul Arabaolaza made changes -
            Status In Progress [ 3 ] In Review [ 10005 ]
            rarabaolaza Raul Arabaolaza added a comment -

            Last builds of git-plugin have no failures, so I am going to close this.

            rarabaolaza Raul Arabaolaza added a comment -

            The fix for JENKINS-51205 solved this problem

            rarabaolaza Raul Arabaolaza made changes -
            Link This issue is related to JENKINS-51205 [ JENKINS-51205 ]
            rarabaolaza Raul Arabaolaza made changes -
            Resolution Fixed [ 1 ]
            Status In Review [ 10005 ] Resolved [ 5 ]
            olivergondza Oliver Gondža made changes -
            Component/s acceptance-test-harness [ 18623 ]

              People

              Assignee:
              rarabaolaza Raul Arabaolaza
              Reporter:
              rarabaolaza Raul Arabaolaza
              Votes:
              0
              Watchers:
              5
