Jenkins / JENKINS-46349

Naginator should not retry when queued or successful builds already exist

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: naginator-plugin
    • Labels:
      None
    • Environment:
      Jenkins 2.71 (directly started)
      Naginator 1.17.2
      Running on Linux using JDK 1.8.0_141
      Description

      We have 'build' and 'test' jobs. The build job builds several variants of the software and then triggers several downstream jobs to test all the variants in parallel. The test jobs have a naginator step attached, which checks the console output for specific regular expressions that indicate an environment problem (rather than an actual test failure). If such an expression matches, a retry is triggered, up to three times. If the build and all triggered tests pass, the build gets promoted.
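
      The retry rule described above can be sketched as follows. This is only an illustration of the decision, not naginator's implementation: the patterns are hypothetical placeholders (the actual expressions are not part of this report), and `should_retry` is a name made up for the example.

```python
import re

# Hypothetical patterns standing in for the real ones, which the report does not list.
ENVIRONMENT_PATTERNS = [
    re.compile(r"Connection refused"),
    re.compile(r"No space left on device"),
]
MAX_RETRIES = 3  # "up to three times", per the description


def should_retry(console_output: str, retries_so_far: int) -> bool:
    """Retry only for environment problems, and only while under the retry limit."""
    if retries_so_far >= MAX_RETRIES:
        return False
    return any(p.search(console_output) for p in ENVIRONMENT_PATTERNS)
```

      Note that this decision looks only at the failed run itself, which is what makes the sequence below possible.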

      Now I found this sequence of test jobs:

      1. Test #4793 (Aug 22, 2017 3:22:20 PM), triggered by build #6133,
        runs and finally fails
      2. Test #4794 (Aug 22, 2017 3:31:50 PM), triggered by build #6134
        (which was triggered by a code change),
        PASSES
      3. Test #4795, "Started by Naginator after the failure of build #4793",
        finally fails too

      Huh? It seems to me that naginator triggers a retry of a test even though a separately triggered, more recent test run has already succeeded.

      If my interpretation is correct, this means:

      1. At the very least, an unneeded test run
      2. In the worst case, the old build (#6133) might now trigger a promotion that overwrites the one from the newer build (#6134)
        This didn't happen here because the retry failed too, so I can't show it.

      It seems to me that naginator should NOT trigger a retry if any upstream job is running or queued at the time of the failure.
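
      A minimal sketch of the requested check, under stated assumptions: `JobRun`, its fields, and `retry_is_superseded` are hypothetical names for this illustration and are not part of naginator or the Jenkins API. It also assumes that a higher upstream build number means a newer build, and it combines the condition above (running or queued) with the one in the issue title (already successful).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class JobRun:
    number: int          # test job run number, e.g. 4793
    upstream_build: int  # number of the triggering 'build' job run, e.g. 6133
    state: str           # "QUEUED", "RUNNING", "SUCCESS" or "FAILURE"


# States of a newer run that would make a retry of the old one pointless.
SUPERSEDING_STATES = {"QUEUED", "RUNNING", "SUCCESS"}


def retry_is_superseded(failed: JobRun, others: Iterable[JobRun]) -> bool:
    """True if a run for a newer upstream build is queued, running, or successful."""
    return any(
        run.upstream_build > failed.upstream_build
        and run.state in SUPERSEDING_STATES
        for run in others
    )


# The sequence from this report: #4793 (build #6133) failed, #4794 (build #6134) passed.
failed = JobRun(number=4793, upstream_build=6133, state="FAILURE")
newer = JobRun(number=4794, upstream_build=6134, state="SUCCESS")
skip_retry = retry_is_superseded(failed, [newer])
```

      With such a check in place, the retry that became test #4795 would have been skipped.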

          Activity

          martinjost Martin Jost added a comment -

          Update to the report:

          Unfortunately, today we saw the issue I speculated about above. Naginator's behavior led to a "downdating" of the promoted version to an older one. The newer version had already been promoted by a successful new build/test pair; after that, a retest triggered by naginator made an older build qualify for promotion too, which overwrote the more recent promoted version.

          For the moment we are trying to work around this by modifying the script that checks whether a build with all tests passing is valid and needed for promotion. (We have other reasons not to promote something that qualified, e.g. changes that "we" (== our System Component) want to test but that are not visible at system level.)
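
          One possible shape for such a guard, as a sketch only: the actual script is not shown in this report, `should_promote` is a hypothetical name, and the comparison assumes that build numbers increase with the age of the code.

```python
from typing import Optional


def should_promote(candidate_build: int, promoted_build: Optional[int]) -> bool:
    """Promote only if nothing is promoted yet or the candidate is strictly newer."""
    return promoted_build is None or candidate_build > promoted_build
```

          In the case described above, `should_promote(6133, 6134)` would reject the late-qualifying older build and keep the newer promotion.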

          Still, I would like to see naginator address this, even though I am not sure whether this behavior would need to be configurable.

          ikedam ikedam added a comment -

          This is not a bug but the expected behavior of naginator for now.
          Naginator considers only the failed build and ignores the build history of the project.
          Changed to New Feature.


            People

            Assignee:
            Unassigned
            Reporter:
            martinjost Martin Jost
            Votes:
            0
            Watchers:
            4
