• Type: Bug
• Resolution: Unresolved
• Priority: Critical
• Component: gerrit-trigger-plugin

      We are experiencing delays in Gerrit-triggered jobs in Jenkins.

      Attached is the stack trace of the blocking threads.

      Jul 18, 2018 5:07:57 PM com.sonymobile.tools.gerrit.gerritevents.GerritHandler checkQueueSize
      WARNING: The Gerrit incoming events queue contains 28095 items! Something might be stuck, or your system can't process the commands fast enough. Try to increase the number of receiving worker threads. Current thread-pool size: 30

      Jul 18, 2018 6:54:37 PM com.sonymobile.tools.gerrit.gerritevents.GerritJsonEventFactory getEvent
      FINE: Constructor with JSONObject as parameter missing, trying default constructor.
      java.lang.NoSuchMethodException: com.sonymobile.tools.gerrit.gerritevents.dto.events.RefUpdated.<init>(net.sf.json.JSONObject)
      at java.lang.Class.getConstructor0(Class.java:3082)
      at java.lang.Class.getConstructor(Class.java:1825)
      at com.sonymobile.tools.gerrit.gerritevents.GerritJsonEventFactory.getEvent(GerritJsonEventFactory.java:69)
      at com.sonymobile.tools.gerrit.gerritevents.workers.AbstractJsonObjectWork.perform(AbstractJsonObjectWork.java:69)
      at com.sonymobile.tools.gerrit.gerritevents.workers.StreamEventsStringWork.perform(StreamEventsStringWork.java:67)
      at com.sonymobile.tools.gerrit.gerritevents.workers.EventThread.run(EventThread.java:66)
      at com.sonyericsson.hudson.plugins.gerrit.trigger.SystemEventThread.run(SystemEventThread.java:66)

          [JENKINS-52636] Gerrit triggered jobs getting delayed

          Darrien Glasser added a comment -

          We're having the same issue. Running on a 32-core machine, we have a large backlog of queued events, and the trigger is barely able to keep up with them. Jenkins sometimes appears to drop builds because of this, and some of the requests are not honored.

          Our updated thread pool size doesn't appear to be taking effect (although that belongs in a separate issue).

          Darrien Glasser added a comment - edited

          Update: we've been debugging this issue, and it seems the Replication Plugin may be (somehow) interfering with the missed events playback plugin/event trigger.

          We're not entirely certain, but it looks as though the event trigger (which sends builds to Jenkins) is receiving replication events and getting stuck trying to work out what to do with them.

          We will continue investigating.

          Disabling the replication plugin reduces our backlog queue to 0, and events are streamed to Jenkins without delay.

          Darrien Glasser added a comment - edited

          Our team has confirmed that the Gerrit trigger listens to all Gerrit events, including ones it cannot act upon. It takes an inordinate amount of time to determine whether it can act on them, and in the end only throws a `NoSuchMethodException` while trying to reflect into an object to see whether the event has what it is looking for.

          This wasn't really an issue before, as all data was stored in SQL somewhere. NoteDB stores all data (events included) in refs/changes/*, and every update there generates events that Jenkins listens for.

          We started using the replication plugin and automatically adding reviewers when a review was posted, and our Jenkins server was inundated almost immediately.

          As it stands, this makes the gerrit-trigger plugin unusable for anybody using the replication plugin. As Gerrit generates more metadata over time, it will make the plugin unusable in the general case unless the plugin starts listening only to relevant events.

          We've found that this Gerrit trigger, implemented with pipelines, does not have the issue: https://github.com/jenkinsci/gerrit-code-review-plugin

          We will likely be switching to it in the future.
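
The lookup-then-fallback pattern visible in the stack trace (a constructor taking the JSON payload is looked up, fails with `NoSuchMethodException`, and the default constructor is tried instead) can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `NoArgEvent`, `PayloadEvent`, `newEvent`, and the use of a plain `Map` in place of `net.sf.json.JSONObject` are all assumptions made for the example. It only shows that every such event pays for a failed lookup; whether that lookup is what actually costs the time reported above is not established here.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: hypothetical stand-ins, not the plugin's real classes. */
class ReflectionFallbackSketch {

    /** Counts how often the payload-constructor lookup failed. */
    static int fallbacks = 0;

    /** Like RefUpdated in the trace: offers only a no-argument constructor. */
    static class NoArgEvent {
        Map<String, Object> payload;
        public NoArgEvent() { }
    }

    /** An event type that accepts the payload directly in its constructor. */
    static class PayloadEvent {
        final Map<String, Object> payload;
        public PayloadEvent(Map<String, Object> payload) { this.payload = payload; }
    }

    /** Tries the payload constructor first, then falls back to the default one. */
    static Object newEvent(Class<?> type, Map<String, Object> json) {
        try {
            try {
                Constructor<?> withPayload = type.getConstructor(Map.class);
                return withPayload.newInstance(json);
            } catch (NoSuchMethodException missing) {
                // Same situation as the FINE log entry: no matching constructor.
                fallbacks++;
                Object event = type.getConstructor().newInstance();
                if (event instanceof NoArgEvent) {
                    ((NoArgEvent) event).payload = json;
                }
                return event;
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot construct " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> json = new HashMap<>();
        json.put("type", "ref-updated");
        newEvent(PayloadEvent.class, json); // direct path, no exception
        newEvent(NoArgEvent.class, json);   // fallback path, one exception
        System.out.println("fallbacks: " + fallbacks); // prints "fallbacks: 1"
    }
}
```

In this sketch the exception is created and caught once per event of a type without a payload constructor, so the overhead scales with the volume of such events.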

          Christoffer Cortes Sjöwall added a comment - edited

          I have two PRs related to this issue:

          https://github.com/jenkinsci/gerrit-trigger-plugin/pull/397

          https://github.com/jenkinsci/gerrit-trigger-plugin/pull/398

          The first adds an event filter to the Gerrit event stream; the second delegates disk writing to a separate thread so that the workers are not held up.
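
The idea behind the second PR, moving disk writes off the event workers, can be illustrated with a small sketch. It is not taken from the PR itself: the class name, the assumption that the persisted value is a last-seen event timestamp, and the coalescing strategy are all hypothetical. Workers only update an in-memory value and, at most, enqueue a flush; a single background thread does the file I/O.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: workers record a timestamp, one thread writes it to disk. */
class AsyncTimestampWriterSketch implements AutoCloseable {
    private final Path file;
    private final AtomicLong latest = new AtomicLong();
    private final AtomicBoolean flushPending = new AtomicBoolean(false);
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    AsyncTimestampWriterSketch(Path file) {
        this.file = file;
    }

    /** Called by event workers; never touches the disk itself. */
    void record(long timestampMillis) {
        latest.accumulateAndGet(timestampMillis, Math::max);
        // Schedule a flush only if none is already waiting, so bursts coalesce.
        if (flushPending.compareAndSet(false, true)) {
            writer.execute(this::flush);
        }
    }

    /** Runs on the writer thread only. */
    private void flush() {
        // Clear the flag before reading, so a later record() schedules a new flush.
        flushPending.set(false);
        byte[] bytes = Long.toString(latest.get()).getBytes(StandardCharsets.UTF_8);
        try {
            Files.write(file, bytes);
        } catch (IOException e) {
            // A real implementation would log this; the sketch just drops the write.
        }
    }

    /** Stops accepting work and waits for queued flushes to finish. */
    @Override
    public void close() {
        writer.shutdown();
        try {
            writer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Records 1000 timestamps and returns what ended up on disk. */
    static String demo() {
        try {
            Path file = Files.createTempFile("last-event", ".txt");
            try (AsyncTimestampWriterSketch w = new AsyncTimestampWriterSketch(file)) {
                for (long ts = 1; ts <= 1000; ts++) {
                    w.record(ts);
                }
            }
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.deleteIfExists(file);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "1000"
    }
}
```

Because pending flushes are coalesced, a burst of events results in far fewer file writes than events, while the last recorded value is still written before `close()` returns.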

          Ganesh Saraf added a comment -

          Hi Christoffer,

          That is good news. When can we expect it to be released?

          Dustin Oprea added a comment -

          Any status on this? There's a chance that my team just encountered this.

          Darrien Glasser added a comment -

          It doesn't look like there's been any action on the PR yet, although it mostly looks good to go from here.

          In the meantime, I have a patched version that we're using at my company. You're free to use it while we wait: https://github.com/DarrienG/gerrit-trigger-plugin/releases/tag/2.31.0-uninterested

          It's basically HEAD from the official repo, plus it ignores irrelevant events. We run probably 3000+ builds a day and haven't seen any issues with it. Jenkins was unusable for us otherwise.

          Dustin Oprea added a comment - edited

          We tried bumping to 2.29.0 during a system upgrade, and the queue started accumulating wildly without any builds actually starting. We reverted the plugin to 2.27.5, and things appear to be rolling again.

          Jia Jia added a comment -

          Any status on this? There's also a chance that my team just encountered this recently.

          Christoffer Cortes Sjöwall added a comment -

          Version 2.30.0 was recently released and includes two changes meant to reduce queue load. One reduces disk writing when playback is enabled; the other lets you filter out unnecessary Gerrit messages from the main settings panel, under Advanced. Some users may still experience delays and queue build-ups, though.
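
As a rough illustration of what filtering the event stream achieves, the sketch below drops event types nobody is interested in before they would be queued for processing. It is an assumption-laden example rather than the plugin's implementation: `EventTypeFilterSketch` and its methods are hypothetical, and the event type names are used only as sample values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: keep only the event types that can trigger work. */
class EventTypeFilterSketch {
    private final Set<String> interestingTypes;

    EventTypeFilterSketch(Collection<String> interestingTypes) {
        this.interestingTypes = new HashSet<>(interestingTypes);
    }

    /** A cheap set lookup decides whether an event is worth queueing. */
    boolean accepts(String eventType) {
        return interestingTypes.contains(eventType);
    }

    /** Returns the accepted event types in their original order. */
    List<String> keepInteresting(List<String> incomingTypes) {
        List<String> kept = new ArrayList<>();
        for (String type : incomingTypes) {
            if (accepts(type)) {
                kept.add(type);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        EventTypeFilterSketch filter = new EventTypeFilterSketch(
                Arrays.asList("patchset-created", "comment-added"));
        List<String> incoming = Arrays.asList(
                "patchset-created", "ref-replicated", "ref-updated",
                "comment-added", "ref-replication-done");
        // prints "[patchset-created, comment-added]"
        System.out.println(filter.keepInteresting(incoming));
    }
}
```

With a filter like this in front of the queue, replication and ref-update traffic never reaches the worker threads, which is the effect the comments above describe.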

            scoheb Scott Hebert
            eattsma Amit Sharma
            Votes: 12
            Watchers: 25