• Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: gerrit-trigger-plugin
    • Labels: None
    • Environment: Jenkins 1.554.1
      GerritTrigger 2.11.1

      When viewing jobs, newer builds disappear from the build history list. Trying to go to the URL of a newer build results in a 404. The missing builds are located on disk and restarting Jenkins will make them reappear.

      Sometimes, the nextBuildNumber is set to a lower number and duplicate builds are created; two date-directories are created in builds/ but only the latest build has a build_number symlink.

      I wrote a tool to help detect these problems on disk and even repair them (for some value of repair): https://github.com/docwhat/jenkins-job-checker
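The on-disk symptom described above (a dated build directory that no build_number symlink points to) can be checked mechanically. What follows is only a minimal sketch and is not taken from jenkins-job-checker. It assumes that builds/ holds one real directory per build plus numeric symlinks whose targets are those directories. The class name `OrphanBuildDirs` and the method `findOrphans` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: lists build directories that no numeric symlink points to. */
final class OrphanBuildDirs {

    /**
     * Assumption: buildsDir contains real directories (one per build) and
     * symlinks named after build numbers whose targets are those directories.
     */
    static List<String> findOrphans(Path buildsDir) throws IOException {
        Set<String> linkedTargets = new HashSet<>();
        List<String> buildDirs = new ArrayList<>();

        try (DirectoryStream<Path> entries = Files.newDirectoryStream(buildsDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isSymbolicLink(entry)) {
                    // Only numeric symlinks count as build_number links;
                    // other symlinks (for example "last build" aliases) are ignored.
                    if (name.matches("\\d+")) {
                        Path targetName = Files.readSymbolicLink(entry).getFileName();
                        if (targetName != null) {
                            linkedTargets.add(targetName.toString());
                        }
                    }
                } else if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    buildDirs.add(name);
                }
            }
        }

        // Whatever remains is a build directory without a build_number symlink.
        buildDirs.removeAll(linkedTargets);
        Collections.sort(buildDirs);
        return buildDirs;
    }
}
```

Calling `OrphanBuildDirs.findOrphans(Paths.get("jobs/some-job/builds"))` (the path is a placeholder) returns the names of the unlinked directories. A non-empty result only suggests that a duplicate build may have been written; it does not repair anything.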

      A lot of history is on JENKINS-15156; I'll repeat some of it below.

          [JENKINS-23152] builds getting lost due to GerritTrigger

          Christian Höltje created issue -

          Christian Höltje added a comment -

          Summary of the IRC chat:

          On IRC, schristou said that he tracked this down to GerritTrigger...

          Specifically, his steps to reproduce are to "Reload Configuration from Disk" and then kick off a Gerrit build.

          The nextBuildNumber problem was new to GerritTrigger author rsandell: "It could have something to do with cancel previous patchsets, but I'm just guessing".

          I'm also fairly certain it's happening to us even without "Reload Configuration from Disk" because we normally don't use that (we've been using Jenkins since the days when build info would be lost and are afraid of it).

          rsandell mentioned that core does not send a start/stop signal to the triggers when a reload from disk is performed, which makes it very hard for him to make GerritTrigger behave better.
          Christian Höltje made changes -
          Remote Link New: This issue links to "May 21st IRC chat log (Web Link)" [ 10901 ]

          Christian Höltje added a comment -

          My current theory is that either:

          1) The list-of-builds is being modified by something
          2) A new list-of-builds is being created at certain times (e.g. "Reload Configuration from Disk") and some things are still using the old list-of-builds. This causes the weird mismatch and possibly explains how nextBuildNumber gets "reset" to an older value. Possibly the object that is recreated is the Project itself (which holds the nextBuildNumber).

          I think #2 is more likely.
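The second theory can be illustrated with a toy model. This is a sketch of the hypothesis only, not Jenkins code: `ToyProject`, `ToyRegistry` and `StaleReferenceDemo` are hypothetical names, and the "reload" simply replaces the project object with a new one built from the counter value that was on disk.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a job that owns its own build counter (not the Jenkins Project class). */
final class ToyProject {
    final String name;
    private int nextBuildNumber;

    ToyProject(String name, int nextBuildNumber) {
        this.name = name;
        this.nextBuildNumber = nextBuildNumber;
    }

    /** Hands out the current number and advances the counter. */
    int assignBuildNumber() {
        return nextBuildNumber++;
    }
}

/** Toy registry: every load replaces the object registered under that name. */
final class ToyRegistry {
    private final Map<String, ToyProject> projects = new HashMap<>();

    void load(String name, int nextBuildNumberOnDisk) {
        projects.put(name, new ToyProject(name, nextBuildNumberOnDisk));
    }

    ToyProject get(String name) {
        return projects.get(name);
    }
}

final class StaleReferenceDemo {
    /** Returns the build numbers handed out, in order. */
    static int[] run() {
        ToyRegistry registry = new ToyRegistry();
        registry.load("job", 10);

        // A long-lived component keeps a direct reference to the project object.
        ToyProject held = registry.get("job");
        int beforeReload = registry.get("job").assignBuildNumber();

        // "Reload from disk": the registry now holds a different object.
        registry.load("job", 11);
        int freshA = registry.get("job").assignBuildNumber();
        int freshB = registry.get("job").assignBuildNumber();

        // The stale object still has its own, older counter.
        int viaStale = held.assignBuildNumber();

        return new int[] {beforeReload, freshA, freshB, viaStale};
    }
}
```

In this model the stale object hands out a number that the fresh object has already used, which matches the duplicate builds and the apparently "reset" nextBuildNumber reported in the description.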
          Christian Höltje made changes -
          Link New: This issue is related to JENKINS-15156 [ JENKINS-15156 ]
          Christian Höltje made changes -
          Link New: This issue is related to JENKINS-23130 [ JENKINS-23130 ]

          Jesse Glick added a comment -

          I am willing to help solve this; what I am stuck on is just reproducing the problem, specifically getting Gerrit configured to the point where it triggers something (since I am not a Gerrit user and its configuration seems rather complex). Is there perhaps some method in the Gerrit Trigger plugin I could call from the script console which would simulate the trigger from a Gerrit server without actually running anything besides Jenkins? That would be the starting point for a regression test anyway. But I will ask @schristou about this too.

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-10709 [ JENKINS-10709 ]

          Christian Höltje added a comment -

          jglick:

          I'm willing to pair with you to reproduce (and even fix) the bug.

          It appears that the "Query and Trigger Gerrit" thingy actually produces a slightly different bug.

          I think what we can do is point it at a Gerrit repository and have the job trigger on any Gerrit change... but we have to find a Gerrit repository you have permission to "listen to the event streams" on.

          Jesse Glick added a comment -

          Reproduced. From code review, both BuildMemory and GerritTrigger appear to be storing (indirect) references to Job instances in global static state, which is not permissible. Looking into a fix; probably cannot test any fix very realistically other than to confirm that it solves the observed problem and does not regress tests, but that is what a PR is for.
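To make the diagnosis above concrete, here is a minimal sketch that contrasts static state pinning a job instance with static state holding only a name that is resolved when needed. It is not the plugin's code and not necessarily the fix that was adopted. `ToyJob`, `ToyJobs` and `TriggerMemory` are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy job object; a new instance is created on every reload. */
final class ToyJob {
    final String fullName;
    final int generation;

    ToyJob(String fullName, int generation) {
        this.fullName = fullName;
        this.generation = generation;
    }
}

/** Toy stand-in for the component that owns the live job objects. */
final class ToyJobs {
    private final Map<String, ToyJob> byName = new HashMap<>();
    private int generation;

    /** Discards all job objects and recreates them, as a reload would. */
    void reloadFromDisk(String... names) {
        generation++;
        byName.clear();
        for (String name : names) {
            byName.put(name, new ToyJob(name, generation));
        }
    }

    ToyJob byFullName(String name) {
        return byName.get(name);
    }
}

/** Hypothetical trigger-side memory held in static fields. */
final class TriggerMemory {
    // Problematic pattern: the static field keeps a job instance alive across reloads.
    static ToyJob pinnedJob;

    // Alternative: remember only the name and look the job up on each use.
    static String rememberedName;

    static ToyJob resolve(ToyJobs jobs) {
        return jobs.byFullName(rememberedName);
    }
}
```

After a second `reloadFromDisk`, `TriggerMemory.pinnedJob` still refers to the discarded object, while `TriggerMemory.resolve(jobs)` returns the live one.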


            Assignee: rsandell
            Reporter: docwhat (Christian Höltje)
            Votes: 7
            Watchers: 23
