Bug
Resolution: Fixed
Major
None
Platform: All, OS: All
We're building a project with Hudson using Maven2. We're also using the Maven
Cobertura plugin to generate coverage reports. This means the tests will be
executed twice, once with instrumentation and once without.
Hudson is reporting the results from both runs although there is only a single
set of results in target/surefire-reports/*.xml. Hudson should only report a
single set of results, preferably those within target/surefire-reports.
Is duplicated by:
- JENKINS-2068: JUnit test showing twice (Reopened)
- JENKINS-2158: If Maven runs the test suite multiple times the Test Result Trend graph and Stats are increased with each run. (Resolved)
[JENKINS-1557] Duplicate test results with Maven2 projects
The test case is very well described by the initial reporter.
I.e. when building a Maven2 project in Hudson and also utilizing Cobertura
(goals e.g. "-U clean deploy findbugs:findbugs checkstyle:checkstyle pmd:pmd
pmd:cpd cobertura:cobertura javadoc:javadoc"), the tests are executed twice (once
due to deploy/install and once due to cobertura) and Hudson reports the
results from both runs, although there is only a single
set of results in target/surefire-reports/*.xml.
Interestingly, when using a Hudson FreeStyle project the tests are reported
only once.
Hints/speculation:
- When processing the Maven2 Hudson project, it seems that the test results are
somehow stored after the first test run and then again after the second
(instrumenting) test run.
- With the FreeStyle Hudson project the test results are probably processed at
the end, i.e. after cobertura rewrites the tests in target/surefire-reports?
This is the preferred behaviour.
I run the cobertura:cobertura, cobertura:check and package goals, so I actually
get three results reported.
I wonder if there is a chance that somebody will look into this. Clearly, this
is not a show-stopper, but after one year of waiting it would be nice at least to
know if anything is going to happen.
Same issue when using the emma-maven-plugin and running 'package site'. According
to the FAQ (http://mojo.codehaus.org/emma-maven-plugin/faq.html), the tests should be
run twice.
I suppose that adding the ability to restrict the phases/goals in which test results
are recorded would be a solution, albeit an inelegant one.
There I was, going to write a big blog entry, and somebody quotes the FAQ
entry I wrote back to me (saving me the bother of writing the blog entry).
In any case... this is a CI server; run the damn tests twice, it's best for you...
and since you are running them twice, you should get both sets of results.
Also FYI, the lifecycle is forked to run the instrumented tests, so the
result is that the tests are still run in the same phase... filtering by phase will
not help you.
SERIOUSLY, I have seen too many tests that pass with coverage and hide a
real bug that only shows when run without coverage...
Unless ALL your developers are much smarter than me (and I'm not saying I'm
smart), this is what you should do... if you have at least one developer as smart
as me or less smart than me, you must run the tests twice.
Read "Java Concurrency in Practice" (the one with the three high-speed trains,
TGV or Japanese bullet trains, I cannot remember); there is an example on page
33. If all your developers can correctly identify what that simple program is
allowed to do under the JVM spec, then and only then will I accept that you
might be OK running the tests only with coverage and not without.
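For readers without the book to hand, the kind of program being alluded to can be sketched as follows. This is modeled on the well-known visibility example from "Java Concurrency in Practice", not the book's exact listing; the daemon flag and bounded join are added here purely so the demonstration always terminates.

```java
public class NoVisibility {
    private static boolean ready;   // deliberately NOT volatile
    private static int number;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {            // the JVM may cache this read: the loop can spin forever
                Thread.yield();
            }
            System.out.println(number); // may legally print 0 due to reordering
        });
        reader.setDaemon(true);         // a stuck reader must not hang the JVM
        reader.start();
        number = 42;
        ready = true;
        reader.join(2000);              // bounded wait, for demonstration only
    }
}
```

Under the JVM spec this program may print 42, print 0, or (without the bounded join) never terminate, which is exactly the class of bug that coverage instrumentation can accidentally hide.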
Ha! I didn't notice that you are the emma-maven-plugin author, Stephen.
From my point of view, the second (instrumented) pass should give me code
coverage info. I assume that the number of succeeded/failed tests should be the same
as in the first pass. If not, that is an Emma/Cobertura/Clover bug, not mine. Yep, warn
me if they are different, but doubling the test count is misleading.
Since I'm not into M2 internals I do not know whether fixing this bug is possible/
feasible or not. Not a big deal anyway, I'm just a little pedantic.
I did not write the emma-maven-plugin, I just ran the release process for alpha-1, which
involved tidying up and writing docs.
The point is that if a test fails when run with coverage, 90% of the time you need to sit
up and take notice, because it is most likely your fault too.
The modifications made by instrumentation are all legal under the JVM spec. If your code
fails in the presence of such modifications, that is a bug in your code, just as much as
if it won't pass without them.
I think the real issue here is that Hudson is reporting both test executions as
the cumulative unit test result of the build. Ideally, Hudson would be able to
distinguish between executions instead of adding them up.
Then, Hudson would be able to display two (or more) plots instead of just one
which would be much more informative: one could clearly see a test failing in
one execution and not another.
The only reason the emma-maven-plugin is involved in this issue is that, right
now, the only way to avoid cumulative reporting is to have a single (instrumented)
test execution, which is not currently possible with the plugin.
I don't think anyone is arguing against the best practices, we're just trying to
find a temporary fix until the issue reported here is resolved.
Well, judging from some of the earlier comments, it's actually that people don't
want to run their (long-running) tests twice...
All I have been pointing out is that there is a MAJOR fallacy in thinking that
if my tests pass while I'm measuring code coverage then everything works.
If you only run the tests once, and that once is with instrumented code, then
you may have bugs that you don't know about.
The most significant source of such bugs is unintended synchronization. The JVM can
cache any non-volatile values between synchronization points. Instrumentation
records into a coverage results map each time a code path is covered, which
results in increased synchronization: each time it passes a branch point
(either for an NPE, or an explicit branch in your code) it has to update the
coverage map, which involves synchronized access to that map (because there may
be multiple threads). The net result is that variable changes from other
threads are more visible when your code is instrumented.
Now, as the JVM spec only says that the JVM is allowed to cache (for performance
improvement), the excess synchronization is not incorrect... in fact your code
should function correctly irrespective of whether the JVM has cached the value
or not... 9 times out of 10, though, people assume that the JVM is not caching
things on them, and that is why you need to run the tests without coverage.
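To illustrate the mechanism described above, here is a hypothetical sketch (not actual Emma/Cobertura instrumentation): every pass through a branch also updates a shared synchronized "coverage map", and in practice on common JVMs that extra synchronization tends to flush the stale read that makes the uninstrumented version spin. Note this is a practical effect, not a guarantee the JMM gives for synchronizing on an unrelated monitor.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class InstrumentedVisibility {
    private static boolean ready;   // still not volatile
    private static final Map<String, Integer> coverage =
            Collections.synchronizedMap(new HashMap<>());

    private static void hit(String branch) {
        coverage.merge(branch, 1, Integer::sum);  // synchronized map access
    }

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                hit("loop");   // an instrumentation-style probe on each pass
            }
        });
        reader.setDaemon(true);
        reader.start();
        ready = true;
        reader.join(2000);
        System.out.println("reader alive after join: " + reader.isAlive());
    }
}
```

This is why a test suite can pass reliably under coverage while the same code deadlocks or misbehaves in production.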
Another thing you should do is run the tests with a different JVM (ideally a
different vendor: JRockit or IBM's JVM) as these have different optimization and
caching strategies (all legal under the JVM spec)
In my experience, there is a diminishing return here though, and 80% of these
kinds of bugs will be found if you run on just one JVM with and without coverage
(i.e. 2 runs through your test suite).
With 6 runs through your test suite (Sun, IBM, JRockit, Sun + emma, Sun +
cobertura, and any JVM on a different architecture) you're pretty safe... but
the effort involved in setting up such a testing suite can sometimes be such
that the 80:20 rule comes into play and we don't bother.
Yes, I agree that Hudson should report the source of the test results... perhaps
it should count reruns of the same test as partial scores... so if
TestMyClass.smokeTest()
has run 3 times, we count that as one test, and each passing run counts as
1/3 towards the passing count, each failing run counts as 1/3 towards the failing
count, each skipped run counts as 1/3 towards the skipped count, etc.
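The partial-scores idea above can be sketched as follows: if the same test ran N times in one build, each run contributes 1/N to the passed/failed/skipped totals, so the test still counts as one test overall. Class, enum, and method names here are illustrative, not Jenkins APIs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartialScores {
    public enum Outcome { PASSED, FAILED, SKIPPED }

    // runsByTest maps a test id (e.g. "TestMyClass.smokeTest") to the
    // outcomes of all of that test's runs in this build.
    public static Map<Outcome, Double> tally(Map<String, List<Outcome>> runsByTest) {
        Map<Outcome, Double> totals = new HashMap<>();
        for (Outcome o : Outcome.values()) {
            totals.put(o, 0.0);
        }
        for (List<Outcome> runs : runsByTest.values()) {
            double weight = 1.0 / runs.size();   // each of N runs counts 1/N
            for (Outcome o : runs) {
                totals.merge(o, weight, Double::sum);
            }
        }
        return totals;
    }
}
```

For example, a test run 3 times with two passes and one failure contributes 2/3 to the passing count and 1/3 to the failing count, and the total across all outcomes stays at one per distinct test.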
In my view, there is nothing to be gained by splitting the reporting of these
multiple runs, as you need to know whether the test passed with Sun and no coverage
before you wonder whether a failure with Sun and cobertura is something that should
be looked into (for synchronization/threading issues).
In short, if any test is failing you need to find out why and fix it (either by
having the test not run if code coverage is active, or by fixing the bug in your
code or fixing the bug in your test)
(The only reason Emma got dragged in is that somebody quoted the FAQ entry I
wrote while pushing the emma-maven-plugin out the door... so that I could write
the m2 support for Emma into the coverage plugin that I've been working on for
some time now.)
Great discussion. Thanks!
From my perspective, the following points are important (quoting):
- "Yep, warn me, if they are different, but doubling tests count is misleading." (marcingalazka)
- "I don't think anyone is arguing against the best practices, we're just trying to find a temporary fix until the issue reported here is resolved." (plaflamme)
- "(...) perhaps it should count reruns on the same test as partial scores (...) has run 3 times, we count that as one test (...)" (stephenconnolly)
The way forward is to allow for multiple test executions, but Hudson should
not report the MISLEADING cumulative count.
Ok, fair points, Stephen.
I think the above post from javanet_ac summed it up nicely anyway. I'm not
opposed to running tests twice (or more), and I fully agree that a coverage pass
can be valuable (just like testing under a different JVM or, tbh, just under a
different application server).
The partial scores solution is interesting. Do you think that your base
coverage plugin could be the proper place to put some support code into?
Hi,
I'm still waiting for any update on this issue.
I moved the subcomponent from JUnit to maven2 because I have this problem only
with maven2 jobs and not with freestyle jobs (with batch windows or maven
goals).
In the configuration of a freestyle job, we can ask for JUnit test results and
specify where the surefire reports are. This cannot be done in the configuration
of a maven2 job.
My problem is that what's reported in the health report and in the test reports
is the number of test runs and not the number of existing tests (in a maven2 job).
Anyway, when reading these reports from the XML API, I have no clue what the
number of tests corresponds to. Running the tests 2, 4 or more times isn't
my problem; I need relevant information in the reports.
Having either...
- the real number of tests, or
- the number of tests run plus the number of times the tests were run
...doesn't really matter.
Thank you in advance,
Alexis.
Created an attachment (id=1001)
Very simple maven project that reproduces issue
I have attached a zipped maven project that you can run in Hudson to reproduce
what I think is the issue. It just runs two unit tests and two integration
tests (all four of which are just no-ops that pass). Unlike what some of the
others have complained about, it is not running extra tests. However, the total
shows as 6 tests on Hudson because it counts the unit tests twice. If you add
another unit test, Hudson will show 8 total tests.
Thank you for the attachment; I tried it and it's the same issue I was talking
about.
I saw abayer on the hudson chat a few days ago saying:
"So I'm taking a look at the oft-mentioned "when I run a Maven build with
Cobertura, tests get reported twice" thing. And I can't decide what, if
anything, should be done about it. It'd be possible to tweak the
SurefireArchiver to not record existing suites a second time, but is that really
the right approach?"
It's clearer now because we can see this isn't related to cobertura. I hope it
will help.
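The SurefireArchiver tweak quoted above could look something like this sketch: remember which surefire report files have already been recorded (keyed by path and last-modified time) and skip them when a later execution, such as cobertura's forked lifecycle, reruns without producing new reports. Names here are illustrative, not the actual Jenkins implementation.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class ReportDeduplicator {
    // path -> last-modified timestamp of the report already recorded
    private final Map<String, Long> recorded = new HashMap<>();

    /** Returns true only when this (path, mtime) pair has not been seen before. */
    public boolean shouldRecord(File report) {
        Long previous = recorded.put(report.getPath(), report.lastModified());
        return previous == null || previous.longValue() != report.lastModified();
    }
}
```

A rewritten report (new mtime) would still be recorded again, which matters if instrumentation genuinely changes the results.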
Here are suggestions for how it could look when the tests run more than once. Let's call each of those runs a "batch" of runs. One batch for plain old Surefire. One batch for Cobertura or Emma. Or, in one example in a comment above, three batches just for the coverage targets. Note that this involves collecting the records every time the tests run and keeping track of which batch each came from.
The problem is that people see the graph on the project page and think it shows the number of tests present in the code. We don't really know that number, what with exclusions, searching for tests by class name, and such. We know the number of tests that ran, whether we count them all as they run or only count the number in each batch of the tests. Currently, it shows the sum of all the batches and confuses some people.
If all the test batches in any particular build have the same number of tests and the same failure counts, I suggest we just show those numbers. Then the graph will likely show the number of tests present in the project, which would equal the number of tests run in any given batch. The failures would show how many failed in each of the batches (but not the sum).
Now the problem remains... what do we do if the batches do NOT have the same test or failure counts?
Idea 1:
On the project page, where it shows the graph of tests (blue) and failures (red) show the number of tests as the maximum of any of the test batches and the failures as the max failures of any of the test batches. For example, if tests run 3 times (3 batches) and they are 1000/0, 1000/20, 998/20 then the graph shows 1000/20 (tests/failures) in red and blue.
We also draw a black/yellow/white (whatever) mark on the graph at that build. It could be a vertical line or two little circles or squares. Below the graph (or above it, whatever) there is a message "Some variable test results" with a link on it to another page that shows multiple graphs or a little table, one graph (or row in the table) for each actual test run showing the true numbers from that test run. That other page would be a "build" page tied to a particular build number.
We could also use the table format to have it in a pop up message if you hover over that black/yellow/white marks on the graph. (Or maybe just text like, "1000/0" shown on multiple lines, one per test batch in the popup.)
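Idea 1 above can be sketched as follows: the graph point for a build is the maximum test count and maximum failure count across its batches, plus a flag marking the build when the batches disagree. The record and method names are illustrative, not Jenkins code.

```java
import java.util.List;

public class BatchGraphPoint {
    public record Batch(int total, int failed) {}
    public record Point(int total, int failed, boolean variable) {}

    public static Point summarize(List<Batch> batches) {
        int maxTotal = 0, maxFailed = 0;
        boolean variable = false;
        Batch first = batches.get(0);
        for (Batch b : batches) {
            maxTotal = Math.max(maxTotal, b.total());
            maxFailed = Math.max(maxFailed, b.failed());
            // flag the build when any batch differs from the first
            variable |= b.total() != first.total() || b.failed() != first.failed();
        }
        return new Point(maxTotal, maxFailed, variable);
    }
}
```

With the example from the text, batches of 1000/0, 1000/20 and 998/20 yield a graph point of 1000/20 with the "variable results" mark set.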
Idea 2:
Alternatively, when the test counts or failure counts differ between batches, show that in the graph as a vertical range. Thinking of the vertical for one build on the graph: red from the bottom up to the smallest number of failures in any batch, then pink from there up to the largest number of failures, then blue up to the smallest number of tests run in any batch, and finally light blue up to the largest number of tests in a batch. (This would look odd if one batch was really flaky and fewer tests ran in it than the number of failures in some other batch; there might be no blue part or no pink part, just as there is no red part when there are no failures.) The nice thing about this one is that it easily shows something is going on, and it would visibly show if the same thing keeps happening build after build: there would be, for example, a pink swath running across the graph.
The popup showing the counts by batch and/or the link to the other page with more details would work here too.
The "Test Results" page (the URL ends in /99/testReport):
This is much easier. Just show one horizontal line for each test batch at the top of the page. The red part on the left might vary on each one and the numbers shown might differ. Then, only if there are multiple batches, add a message above the table holding the errors, telling the reader something like "Tests ran 2 times. Each test may appear more than once in the table." When that table is expanded to show actual test class names, each one will have a number in parentheses after the class name to indicate whether it was the 1st or 2nd run of that test. It would be best if the reader could tell which failures were with Cobertura or Emma and which were without; that would help identify the situations Stephen mentioned where adding the instrumentation causes a test to fail (or pass).
Code changed in jenkins
User: Martijn Baay
Path:
maven-plugin/src/main/java/hudson/maven/reporters/SurefireArchiver.java
http://jenkins-ci.org/commit/jenkins/0fff2e929015d50aa0811ae323503b8c6423ad0f
Log:
Fixed JENKINS-1557 duplicate test results. However it does not take into account the fact that the same test suite might be ran twice
Code changed in jenkins
User: Olivier Lamy
Path:
maven-plugin/src/main/java/hudson/maven/reporters/SurefireArchiver.java
test/src/test/java/hudson/maven/Maven3BuildTest.java
test/src/test/resources/hudson/maven/JENKINS-1557.zip
http://jenkins-ci.org/commit/jenkins/9fc376ab1548fb3e8650a1a28435761ec5b97e27
Log:
[FIXED JENKINS-1557] Duplicate test results with Maven2 projects
merged with some modifications from https://github.com/kaydouble-u/jenkins/commit/0fff2e929015d50aa0811ae323503b8c6423ad0f
it test added
Compare: https://github.com/jenkinsci/jenkins/compare/4d6df29...9fc376a
Integrated in jenkins_main_trunk #669
Fixed JENKINS-1557 duplicate test results. However it does not take into account the fact that the same test suite might be ran twice
[FIXED JENKINS-1557] Duplicate test results with Maven2 projects
Martijn Baay : 0fff2e929015d50aa0811ae323503b8c6423ad0f
Files :
- maven-plugin/src/main/java/hudson/maven/reporters/SurefireArchiver.java
Olivier Lamy : 9fc376ab1548fb3e8650a1a28435761ec5b97e27
Files :
- test/src/test/resources/hudson/maven/JENKINS-1557.zip
- maven-plugin/src/main/java/hudson/maven/reporters/SurefireArchiver.java
- test/src/test/java/hudson/maven/Maven3BuildTest.java
Code changed in jenkins
User: Vojtech Juranek
Path:
maven-plugin/src/main/java/hudson/maven/reporters/SurefireArchiver.java
http://jenkins-ci.org/commit/jenkins/b0a39275cc85b2a3d94ecfba77b1349f86780d09
Log:
Revert 4440e0f420242364975265f6db5c27bee13de4e3
It breaks some already fixed issues like JENKINS-1557
Code changed in jenkins
User: Martijn Baay
Path:
src/main/java/hudson/maven/reporters/SurefireArchiver.java
http://jenkins-ci.org/commit/maven-plugin/144a92f35d5d49871684f9190588ac40bf9b7a96
Log:
Fixed JENKINS-1557 duplicate test results. However it does not take into account the fact that the same test suite might be ran twice
Originally-Committed-As: 0fff2e929015d50aa0811ae323503b8c6423ad0f
Code changed in jenkins
User: Olivier Lamy
Path:
src/main/java/hudson/maven/reporters/SurefireArchiver.java
http://jenkins-ci.org/commit/maven-plugin/dc8887d2a5d6b4bab87cd1b14007c97a4fdeb01e
Log:
[FIXED JENKINS-1557] Duplicate test results with Maven2 projects
merged with some modifications from https://github.com/kaydouble-u/jenkins/commit/0fff2e929015d50aa0811ae323503b8c6423ad0f
it test added
Originally-Committed-As: 9fc376ab1548fb3e8650a1a28435761ec5b97e27
Code changed in jenkins
User: Vojtech Juranek
Path:
src/main/java/hudson/maven/reporters/SurefireArchiver.java
http://jenkins-ci.org/commit/maven-plugin/25ee31812fc007b87f49d87f862a8eff5d1751bf
Log:
Revert 4440e0f420242364975265f6db5c27bee13de4e3
It breaks some already fixed issues like JENKINS-1557
Originally-Committed-As: b0a39275cc85b2a3d94ecfba77b1349f86780d09
I'm having the same issue. I'm wondering if you can tweak the pom to execute the
tests only once, which should be done anyway in order to reduce the build time.
Currently I'm using the maven goals "install findbugs:findbugs
cobertura:cobertura" in my hudson project, but I would like to reduce that to
"install" only. If this were documented somewhere, then this bug would also be
obsolete, imho.