[JENKINS-49263] Determine minimal useful test case for running ATH for Jenkins Core

Type: Task
Resolution: Done
Priority: Minor
Component/s: acceptance-test-harness, core
Labels:
- evergreen

is related to

JENKINS-49524 JenkinsDatabaseSecurityRealmTest#login_and_logout is flaky

Closed

R. Tyler Croy created issue - 2018-01-30 16:34

Raul Arabaolaza made changes - 2018-02-06 11:04

Status

Original: Open [ 1 ]

New: In Progress [ 3 ]

Raul Arabaolaza added a comment - 2018-02-06 11:51 - edited

The ATH has a SmokeTest category that looks like a good starting point. It covers:

FreeStyle Job Creation
Periodical builds of FreeStyle Jobs
Parameterized builds of FreeStyle Jobs
Build history correctness
User log in and log out capabilities
Execute system script
Slave creation and tying jobs to specific labels
Remote build trigger
View creation and regular-expression job filters
Job renaming

The entire category run (including checkout, Docker image runs and all the needed setup) takes between 10 and 12 minutes on my test instance. According to its Javadoc, the category is meant to contain no more than 10 tests, and it has been there for quite some time. DISCLAIMER: on my local instance I am not running with the ATH Docker image; I will do that once I have some data from plain runs, just to compare results.

Below is a list of the concrete tests executed and some stability measures:

FreestyleJobTest#archiveArtifacts (No failures recorded in history)
FreestyleJobTest#buildParametrized (No failures recorded in history)
FreestyleJobTest#buildPeriodically (No failures recorded in history)
FreestyleJobTest#doNotDiscardSuccessfulBuilds (No failures recorded in history)
JenkinsDatabaseSecurityRealmTest#login_and_logout (No failures recorded in history)
ScriptTest#execute_system_script (Not very stable according to history, and currently failing in ci.jenkins.io even though it works perfectly fine on my test instance)
SlaveTest#tie_job_to_specified_label (Not very stable according to history)
TriggerRemoteBuildsTest#triggerBuildRemotely (No failures recorded in history)
ViewTest#findJobThroughRegexp (No failures recorded in history)
ViewTest#renameJob (No failures recorded in history)

I am currently running those tests on my test instance every half hour to gather more data about failures and timings. If no problems are found, I would recommend creating a new category in the ATH, mostly based on the SmokeTest one but without the tests that have shown any failure.
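As a rough illustration of that selection rule, the sketch below keeps only the tests with no failure reported in the list above. The class and method names are hypothetical and are not part of the ATH; the stability flags are copied from this comment as of 2018-02-06 and do not reflect later runs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not ATH code: picks candidate tests for a reduced smoke category.
class SmokeTestSelection {

    // Value is true when the comment above reports the test as unstable.
    static Map<String, Boolean> reportedUnstable() {
        Map<String, Boolean> tests = new LinkedHashMap<>();
        tests.put("FreestyleJobTest#archiveArtifacts", false);
        tests.put("FreestyleJobTest#buildParametrized", false);
        tests.put("FreestyleJobTest#buildPeriodically", false);
        tests.put("FreestyleJobTest#doNotDiscardSuccessfulBuilds", false);
        tests.put("JenkinsDatabaseSecurityRealmTest#login_and_logout", false);
        tests.put("ScriptTest#execute_system_script", true);
        tests.put("SlaveTest#tie_job_to_specified_label", true);
        tests.put("TriggerRemoteBuildsTest#triggerBuildRemotely", false);
        tests.put("ViewTest#findJobThroughRegexp", false);
        tests.put("ViewTest#renameJob", false);
        return tests;
    }

    // Keeps the tests that have not shown any failure, preserving the original order.
    static List<String> stableOnly(Map<String, Boolean> tests) {
        return tests.entrySet().stream()
                .filter(entry -> !entry.getValue())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        stableOnly(reportedUnstable()).forEach(System.out::println);
    }
}
```

With the flags above, eight of the ten tests would remain; the flags would need updating as the half-hourly runs produce more data.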

I will update this with data after 24 hours of runs. Any comments so far, vilacides rtyler olivergondza?

Oliver Gondža added a comment - 2018-02-06 11:58

Thanks for looking into this. I have 2 comments:

Currently, the ATH container is built every time. If we can get it pushed to the registry, it could speed up execution significantly.
Once you have purged the flaky tests from it, feel free to update the SmokeTest category instead of creating a new one.

Raul Arabaolaza added a comment - 2018-02-06 12:11

olivergondza My main concern about using a released version of the ATH container instead of building it from source is making sure we use the latest fixes (the same reason we use master of the ATH itself instead of a released version). In your experience, is this concern valid, or is the ATH container stable enough to use a released version without worries?

Note that in these first stages I am not using the ATH container (which means it is not strictly needed); I am just starting up the Selenium Firefox one and using it. I want to change that, though, so that if, for example, the ATH updates its Selenium version, the container is updated too and no changes to the job itself are needed.

Raul Arabaolaza added a comment - 2018-02-07 11:26 - edited

JenkinsDatabaseSecurityRealmTest#login_and_logout has shown some instability, failing 6 times in the last 37 runs. The reason is always the same, and I suspect it is a race condition after login. I am going to try to fix the test, as I believe it is important to make sure users can log in and out.

expected:<jenkins-acceptance-tests-user (Full Name)> but was:<null (null)>

Raul Arabaolaza added a comment - 2018-02-07 13:59 - edited

I am now testing my attempt to remove the flakiness; the code is here

Raul Arabaolaza added a comment - 2018-02-07 16:59

So, my initial set of changes did not work, but it has confirmed the issue: for some reason the login sometimes takes more than ten seconds...

Timed out after 10 seconds: Element matching By.cssSelector: a[href='/user/jenkins-acceptance-tests-user'] is present
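For readers unfamiliar with this kind of failure, the message above comes from a wait that polls for a condition until a deadline passes. The sketch below shows that polling pattern in plain Java. It is not the ATH or Selenium implementation and not the fix attempted in this issue; `PollingWait`, `waitUntil`, and the timings in `main` are hypothetical names and values chosen for illustration.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating a poll-until-timeout wait; not ATH or Selenium code.
class PollingWait {

    // Re-checks the condition every `interval` until it holds or `timeout` elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the user link is present": becomes true on the third poll.
        int[] polls = {0};
        boolean appeared = waitUntil(() -> ++polls[0] >= 3,
                Duration.ofSeconds(30), Duration.ofMillis(100));
        System.out.println("condition met: " + appeared + " after " + polls[0] + " polls");
    }
}
```

A longer timeout only hides a slow login. If the underlying cause is a race between an action and the next page load, the condition being polled also has to be one that cannot be satisfied by the previous page.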

Raul Arabaolaza added a comment - 2018-02-08 08:54 - edited

The changes I made did not entirely solve the issue, but they have greatly reduced its frequency: two failures in 47 runs... I am going to continue with the same approach. I now suspect the problem is a race condition between the click on the login button and the load of the next page.
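To put the two measurements side by side, 6 failures in 37 runs is roughly 16.2%, while 2 failures in 47 runs is roughly 4.3%. The small sketch below just reproduces that arithmetic; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: converts failure counts from this thread into percentages.
class FlakinessRate {

    // Failure rate as a percentage of total runs.
    static double failureRatePercent(int failures, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        return 100.0 * failures / runs;
    }

    public static void main(String[] args) {
        // Counts taken from the comments dated 2018-02-07 and 2018-02-08.
        System.out.println(String.format(Locale.ROOT, "before: %.1f%%", failureRatePercent(6, 37)));
        System.out.println(String.format(Locale.ROOT, "after:  %.1f%%", failureRatePercent(2, 47)));
    }
}
```

With samples this small the percentages are only indicative, which is consistent with continuing to gather data from further runs.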

Raul Arabaolaza added a comment - 2018-02-08 12:16

Gathering data with my latest changes...

Assignee: Raul Arabaolaza
Reporter: R. Tyler Croy
Votes: 0
Watchers: 4
Created: 2018-01-30 16:34
Updated: 2018-09-28 10:00
Resolved: 2018-02-19 10:02
