
        [JENKINS-49263] Determine minimal useful test case for running ATH for Jenkins Core

        R. Tyler Croy created issue -
        Raul Arabaolaza made changes -
        Status Original: Open [ 1 ] New: In Progress [ 3 ]

        Raul Arabaolaza added a comment - - edited

        There is a SmokeTest category in the ATH that looks promising. It covers:

        • FreeStyle job creation
        • Periodic builds of FreeStyle jobs
        • Parameterized builds of FreeStyle jobs
        • Build history correctness
        • User log in and log out capabilities
        • Executing a system script
        • Slave creation and tying jobs to specific labels
        • Remote build trigger
        • View creation and regular expressions
        • Job renaming

        The entire category run (including checkout, running the Docker images, and all the needed setup) takes between 10 and 12 minutes on my test instance. According to its Javadoc, the category is meant to contain no more than 10 tests, and it has been there for quite some time. DISCLAIMER: on my instance I am not running the tests through the ATH Docker image; I will do that once I have some data from plain runs, so I can compare the results.

        Below is a list of the concrete tests executed, with some stability measures:

        • FreestyleJobTest#archiveArtifacts (No failures recorded in history)
        • FreestyleJobTest#buildParametrized (No failures recorded in history)
        • FreestyleJobTest#buildPeriodically (No failures recorded in history)
        • FreestyleJobTest#doNotDiscardSuccessfulBuilds (No failures recorded in history)
        • JenkinsDatabaseSecurityRealmTest#login_and_logout (No failures recorded in history)
        • ScriptTest#execute_system_script (Not very stable according to history, and currently failing in ci.jenkins.io even though it works perfectly fine in my test instance)
        • SlaveTest#tie_job_to_specified_label (Not very stable according to history)
        • TriggerRemoteBuildsTest#triggerBuildRemotely (No failures recorded in history)
        • ViewTest#findJobThroughRegexp (No failures recorded in history)
        • ViewTest#renameJob (No failures recorded in history)

        I am currently running those tests on my test instance every half hour to gather more data about failures and timings. If no problems are found, I would recommend creating a new category in the ATH, based mostly on the SmokeTest one but without the tests that have shown any failure.

        I will update this with data after 24 hours of runs. Any comments so far, vilacides rtyler olivergondza?
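
The selection rule proposed above (start from the SmokeTest list and drop anything with recorded failures) can be sketched in a few lines of plain Java. This is only an illustration: the class and method names are hypothetical and not part of the ATH, and the stability flags are the ones reported in the list above at the time of writing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not ATH code: picks the SmokeTest entries with no recorded failures. */
final class SmokeCandidates {

    /** Test name -> true if the history shows failures (flags taken from the list above). */
    static Map<String, Boolean> smokeTests() {
        Map<String, Boolean> tests = new LinkedHashMap<>();
        tests.put("FreestyleJobTest#archiveArtifacts", false);
        tests.put("FreestyleJobTest#buildParametrized", false);
        tests.put("FreestyleJobTest#buildPeriodically", false);
        tests.put("FreestyleJobTest#doNotDiscardSuccessfulBuilds", false);
        tests.put("JenkinsDatabaseSecurityRealmTest#login_and_logout", false);
        tests.put("ScriptTest#execute_system_script", true);
        tests.put("SlaveTest#tie_job_to_specified_label", true);
        tests.put("TriggerRemoteBuildsTest#triggerBuildRemotely", false);
        tests.put("ViewTest#findJobThroughRegexp", false);
        tests.put("ViewTest#renameJob", false);
        return tests;
    }

    /** Returns, in the original order, the names of tests without recorded failures. */
    static List<String> stableOnly(Map<String, Boolean> tests) {
        List<String> stable = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : tests.entrySet()) {
            if (!entry.getValue()) {
                stable.add(entry.getKey());
            }
        }
        return stable;
    }

    public static void main(String[] args) {
        stableOnly(smokeTests()).forEach(System.out::println);
    }
}
```

The flags would need to be refreshed as the half-hourly runs produce more data; a test that looks stable in the history can still turn out to be flaky.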


        Oliver Gondža added a comment -

        Thanks for looking into this. I have 2 comments:

        • Currently, the ATH container is built every time. If we can get it pushed to the registry, it can speed up the execution significantly.
        • Once you have purged the flaky tests from it, feel free to update the SmokeTest category instead of creating a new one.

        Raul Arabaolaza added a comment -

        olivergondza My main concern about using a released version of the ATH container instead of building it from source is making sure we use the latest fixes (the same reason we use master of the ATH itself instead of a released version). In your experience, is this concern valid, or is the ATH container stable enough to use a released version without worries?

        Note that in the first stages I am not using the ATH container (which means it is not really needed); I just start up the Selenium Firefox one and use it. I want to change that, though, so that if, for example, the ATH updates its Selenium version, the container is updated too and no changes to the job itself are needed.

        Raul Arabaolaza added a comment - - edited

        JenkinsDatabaseSecurityRealmTest#login_and_logout has shown some instability, failing 6 times in the last 37 runs. The reason is always the same, and I suspect it is a race condition after login. I am going to try to fix the test, as I believe it is important to make sure users can log in and out.

         

        expected:<jenkins-acceptance-tests-user (Full Name)> but was:<null (null)>


        Raul Arabaolaza added a comment - - edited

        I am now testing my attempt to remove the flakiness; the code is here


        Raul Arabaolaza added a comment -

        So, my initial set of changes did not work, but it has confirmed the issue: for some reason, the login sometimes takes more than ten seconds...

        Timed out after 10 seconds: Element matching By.cssSelector: a[href='/user/jenkins-acceptance-tests-user'] is present
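
The timeout above is what an explicit wait reports when its condition does not become true within the limit. As a rough illustration of that mechanism, here is a minimal polling wait in plain Java. It is not the Selenium API and not the ATH's own wait helper; `PollingWait`, `waitFor`, and the stand-in condition are hypothetical names used only for this sketch.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of an explicit wait: poll a condition until it holds or a timeout elapses. */
final class PollingWait {

    /**
     * Checks the condition immediately, then every {@code interval}, until it
     * returns true or {@code timeout} has elapsed. Returns whether it became true.
     */
    static boolean waitFor(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition ever holding
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the user link is present": becomes true after about 300 ms.
        boolean found = waitFor(
                () -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println("found=" + found);
    }
}
```

A longer timeout only hides a slow login; if the page sometimes never reaches the expected state, the wait will still fail, which is why the cause of the delay matters more than the limit itself.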

        Raul Arabaolaza added a comment - - edited

        The changes I made did not entirely solve the issue, but they have greatly reduced its frequency: two failures in 47 runs... I am going to continue with the same approach. I now suspect the problem is a race condition between the click on the login button and the loading of the next page.
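
To put the two measurements from this thread side by side (6 failures in 37 runs before the changes, 2 failures in 47 runs after), the failure rates work out to roughly 16.2% and 4.3%. The small helper below, with hypothetical names, just does that arithmetic; with samples this small the numbers are indicative rather than conclusive.

```java
import java.util.Locale;

/** Hypothetical helper: failure rate of a test as a percentage of its runs. */
final class FailureRate {

    static double percent(int failures, int runs) {
        if (runs <= 0 || failures < 0 || failures > runs) {
            throw new IllegalArgumentException("need 0 <= failures <= runs and runs > 0");
        }
        return 100.0 * failures / runs;
    }

    public static void main(String[] args) {
        // Figures reported in the comments above.
        System.out.println(String.format(Locale.ROOT, "before: %.1f%%", percent(6, 37)));
        System.out.println(String.format(Locale.ROOT, "after:  %.1f%%", percent(2, 47)));
    }
}
```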


        Raul Arabaolaza added a comment -

        Gathering data with my latest changes...

          rarabaolaza Raul Arabaolaza
          rtyler R. Tyler Croy