Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-5393

Jenkins falsely fails builds due to aggressive up-to-date check on junit artifacts

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Major Major
    • junit-plugin
    • None
    • RedHat Linux VMware

      I'm getting this error at the end of a build in Hudson that should have been successful. I don't know why this age check is failing but I would be satisfied if I could simply disable the check.

      Archiving artifacts
      Recording test results
      Test reports were found but none of them are new. Did tests run?
      For example, /opt/hudson-workspace/workspace/project/target/test/reports/unittest/Ant suite/Ant test.xml is 3 min 13 sec old

      These files are, without question, up to date and generated only by the current built.

      Here are my questions:
      1: The comparison is between the date of the file and... what other thing? What do "new" and "old" mean in this context?
      2: How can I disable or deceive this comparison? It's completely unwanted and unnecessary.

      I know there was a bug for this in earlier versions of Hudson, but apparently the error message was the only thing changed.

      The problem still occurs in 1.341. As for the environment, these are RedHat Linux virtual machines, and the dates on the system are synchronized to within 1 second. I cannot understand why Hudson thinks these files are too old.

          [JENKINS-5393] Jenkins falsely fails builds due to aggressive up-to-date check on junit artifacts

          carltonb added a comment -

          Just a sample of the time differentials:
          Build time: 4 minutes 22 seconds Ant test.xml is 2 min 31 sec old
          Build time: 3 minutes 38 seconds Ant test.xml is 3 min 13 sec old
          Build time: 5 minutes 38 seconds Ant test.xml is 1 min 26 sec old

          I could see maybe 10-20 seconds being too close, but this master/slave has never been 4 minutes apart.

          carltonb added a comment - Just a sample of the time differentials: Build time: 4 minutes 22 seconds Ant test.xml is 2 min 31 sec old Build time: 3 minutes 38 seconds Ant test.xml is 3 min 13 sec old Build time: 5 minutes 38 seconds Ant test.xml is 1 min 26 sec old I could see maybe 10-20 seconds being too close, but this master/slave has never been 4 minutes apart.

          carltonb added a comment -

          After digging around in Fisheye I see where this is happening.

          http://fisheye.jenkins-ci.org/browse/Hudson/trunk/hudson/main/core/src/main/java/hudson/tasks/junit/TestResult.java?r=HEAD

          It looks like some refactoring has taken place, and there is a TODO to make a certain property 'checkTimestamps' configurable as a system property.

          I presume that the person who made that comment delayed this change for a reason and will finish that task when the time is right.

          carltonb added a comment - After digging around in Fisheye I see where this is happening. http://fisheye.jenkins-ci.org/browse/Hudson/trunk/hudson/main/core/src/main/java/hudson/tasks/junit/TestResult.java?r=HEAD It looks like some refactoring has taken place, and there is a TODO to make a certain property 'checkTimestamps' configurable as a system property. I presume that the person who made that comment delayed this change for a reason and will finish that task when the time is right.

          carltonb added a comment -

          I finally discovered the reason for the apparent clock difference. It was not that the slave and master were out of sync. Rather, the slave and master both store all their data on an NFS mount, and the NFS server was 6 minutes behind. Fortunately this is an NFS server that my team controls, so we were able to correct the clock and pass builds.

          IMO this is just another illustration of why we need the option to disable this aggressive config check. There are just too many ways it can falsely fail for reasons that the Hudson admin may not be able to mitigate.

          carltonb added a comment - I finally discovered the reason for the apparent clock difference. It was not that the slave and master were out of sync. Rather, the slave and master both store all their data on an NFS mount, and the NFS server was 6 minutes behind. Fortunately this is an NFS server that my team controls, so we were able to correct the clock and pass builds. IMO this is just another illustration of why we need the option to disable this aggressive config check. There are just too many ways it can falsely fail for reasons that the Hudson admin may not be able to mitigate.

          Note that if this check is not in place, tests that are removed at one point will continue to be tabulated by Hudson, until you do a clean build.

          Maybe touching a file in the beginning of the build and reading its timestamp back to use as the basis might be the way to go.

          Kohsuke Kawaguchi added a comment - Note that if this check is not in place, tests that are removed at one point will continue to be tabulated by Hudson, until you do a clean build. Maybe touching a file in the beginning of the build and reading its timestamp back to use as the basis might be the way to go.

          carltonb added a comment -

          That sounds like a good plan, as long as the reference file is created in the same way the unit test files are created (on the same slave, in the same workspace) to avoid unnecessary differences.

          carltonb added a comment - That sounds like a good plan, as long as the reference file is created in the same way the unit test files are created (on the same slave, in the same workspace) to avoid unnecessary differences.

          ebann added a comment -

          Related to 6268.

          Aggressive time check should be an option.
          In my case Junit xml files are retrieved in other jobs artifacts, i just want to regroup and parse all tests results in a single job.

          ebann added a comment - Related to 6268. Aggressive time check should be an option. In my case Junit xml files are retrieved in other jobs artifacts, i just want to regroup and parse all tests results in a single job.

          kutzi added a comment -

          Regarding Kohsuke's comment from 15/Jun/10: I'd say that not doing a clean build and relying on some 'magic' timestamp comparison to calculate the test results, is asking for errors.

          I agree with others that this feature should be configurable - and ideally off by default, but that's probably not possible for backwards compatibility.

          kutzi added a comment - Regarding Kohsuke's comment from 15/Jun/10: I'd say that not doing a clean build and relying on some 'magic' timestamp comparison to calculate the test results, is asking for errors. I agree with others that this feature should be configurable - and ideally off by default, but that's probably not possible for backwards compatibility.

          This is a stopper for some cases of build execution time optimization. I even can't imagine the reasons why this feature exists.

          Stanislav Tyurikov added a comment - This is a stopper for some cases of build execution time optimization. I even can't imagine the reasons why this feature exists.

          Kent Bolton added a comment -

          Looks like the magic number is 1000 miliseconds. Code below in a patch here: http://git.eclipse.org/c/hudson/org.eclipse.hudson.core.git/commit?id=97587dde35e862fabbd8b5b4b0136a786cfcff37 .

          This is an issue when using VMs where time can drift quite quickly. Would be good to disable this test, or to be able to change the magic number of seconds in configuration.

          @@ -151,21 +144,22 @@ public final class TestResult extends MetaTabulatedResult {
          }
          }

          • if(!parsed) {
            + if (!parsed) {
            long localTime = System.currentTimeMillis();
          • if(localTime < buildTime-1000) /margin/
          • // build time is in the the future. clock on this slave must be running behind
            + if (localTime < buildTime - 1000) /margin/ // build time is in the the future. clock on this slave must be running behind
            + { throw new AbortException( - "Clock on this slave is out of sync with the master, and therefore \n" + - "I can't figure out what test results are new and what are old.\n" + - "Please keep the slave clock in sync with the master."); + "Clock on this slave is out of sync with the master, and therefore \n" + + "I can't figure out what test results are new and what are old.\n" + + "Please keep the slave clock in sync with the master."); + }

          Kent Bolton added a comment - Looks like the magic number is 1000 miliseconds. Code below in a patch here: http://git.eclipse.org/c/hudson/org.eclipse.hudson.core.git/commit?id=97587dde35e862fabbd8b5b4b0136a786cfcff37 . This is an issue when using VMs where time can drift quite quickly. Would be good to disable this test, or to be able to change the magic number of seconds in configuration. @@ -151,21 +144,22 @@ public final class TestResult extends MetaTabulatedResult { } } if(!parsed) { + if (!parsed) { long localTime = System.currentTimeMillis(); if(localTime < buildTime-1000) / margin / // build time is in the the future. clock on this slave must be running behind + if (localTime < buildTime - 1000) / margin / // build time is in the the future. clock on this slave must be running behind + { throw new AbortException( - "Clock on this slave is out of sync with the master, and therefore \n" + - "I can't figure out what test results are new and what are old.\n" + - "Please keep the slave clock in sync with the master."); + "Clock on this slave is out of sync with the master, and therefore \n" + + "I can't figure out what test results are new and what are old.\n" + + "Please keep the slave clock in sync with the master."); + }

          Carlos Garcia added a comment -

          I have a similar problem. Even though my clocks are in sync (agents and master), Jenkins complains about them being out of sync.

          I agree with you, making this "magic number" of 1000 configurable, or even disabling this unnecessary check, is necessary. Given that this is a 9-year-old bug I don't expect this to be fixed; I'll try to modify this behaviour myself and build Jenkins.

          Carlos Garcia added a comment - I have a similar problem. Even though my clocks are in sync (agents and master), Jenkins complains about them being out of sync. I agree with you, making this "magic number" of 1000 configurable, or even disabling this unnecessary check, is necessary. Given that this is a 9-year-old bug I don't expect this to be fixed; I'll try to modify this behaviour myself and build Jenkins.

            Unassigned Unassigned
            carltonb carltonb
            Votes:
            14 Vote for this issue
            Watchers:
            15 Start watching this issue

              Created:
              Updated: