
Gerrit Triggers stop triggering builds and the queue builds up

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: gerrit-trigger-plugin
    • Labels: None
    • Environment: CloudBees CI 2.222.2.1, Gerrit Trigger Plugin 2.30.5

      1. Summary
        1. Gerrit triggers will at some point stop triggering builds in Jenkins. The Gerrit event queue then grows without bound, which is visible inside Jenkins. The only current workaround is to restart the master to get the queue flowing again. This issue has occurred on the affected master twice in one month.
      2. Steps to reproduce
        1. The exact cause is not known yet.
        2. The issue appeared after the Gerrit Trigger Plugin was upgraded from version 2.27.1 to version 2.30.5.
        3. It also followed an upgrade of the Jenkins server from a 1.x version to a 2.x version.
      3. Expected behavior
        1. The Gerrit Trigger queue does not back up and builds continue to execute.
      4. Actual behavior
        1. The Gerrit queue begins to build up.
        2. No builds are started from Gerrit Triggers.

      The log messages do indicate that the queue has stopped processing:

      2020-05-07 21:35:57.166+0000 [id=1253] WARNING c.s.t.g.g.GerritHandler#checkQueueSize: The Gerrit incoming events queue contains 247 items! Something might be stuck, or your system can't process the commands fast enough. Try to increase the number of receiving worker threads. Current thread-pool size: 4
      

      Increasing the number of worker threads has no impact on this issue. As observed, the queue size reported will continue to grow until the master is restarted.
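One way to see why a larger pool would not help: if every worker parks on the same latch and nothing ever opens it, each added thread simply becomes one more parked thread. The following is a minimal, self-contained sketch of that situation, not the plugin's code. The class and method names are hypothetical, and the latch stands in for whatever the real workers are waiting on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only models the symptom described in this issue.
class BlockedPoolDemo {

    /**
     * Submits {@code events} tasks to a fixed pool of {@code workers} threads.
     * Every task waits on a latch that is never opened, so each worker parks
     * on its first task and the remaining tasks stay in the queue.
     * Returns the number of tasks still queued.
     */
    static int backlogWithBlockedWorkers(int workers, int events) throws InterruptedException {
        CountDownLatch neverOpened = new CountDownLatch(1);
        CountDownLatch started = new CountDownLatch(workers);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            for (int i = 0; i < events; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        neverOpened.await(); // parks, like the WAITING workers in a thread dump
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Give every worker up to five seconds to pick up its first task.
            started.await(5, TimeUnit.SECONDS);
            return pool.getQueue().size();
        } finally {
            pool.shutdownNow(); // interrupts the parked workers so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("4 workers:  " + backlogWithBlockedWorkers(4, 247) + " queued");
        System.out.println("16 workers: " + backlogWithBlockedWorkers(16, 247) + " queued");
    }
}
```

In this model the backlog shrinks only by the number of extra threads, and it would keep growing as new events arrive, which matches the observation that raising the thread count does not resolve the problem.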

      A thread dump was captured when this issue was happening. There were four Gerrit threads sitting in a WAITING state while the queue was growing:

      "Gerrit Worker EventThread_28" id=691286 (0xa8c56) state=WAITING cpu=95%
          - waiting on <0x66dc3d20> (a java.util.concurrent.CountDownLatch$Sync)
          - locked <0x66dc3d20> (a java.util.concurrent.CountDownLatch$Sync)
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
          at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger.waitForProjectListToBeReady(GerritTrigger.java:1876)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.EventListener.gerritEvent(EventListener.java:188)
          at sun.reflect.GeneratedMethodAccessor607.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at com.sonymobile.tools.gerrit.gerritevents.GerritHandler.notifyListener(GerritHandler.java:496)
          at com.sonymobile.tools.gerrit.gerritevents.GerritHandler.notifyListeners(GerritHandler.java:476)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.JenkinsAwareGerritHandler.notifyListeners(JenkinsAwareGerritHandler.java:80)
          at com.sonymobile.tools.gerrit.gerritevents.workers.AbstractGerritEventWork.perform(AbstractGerritEventWork.java:46)
          at com.sonymobile.tools.gerrit.gerritevents.workers.AbstractJsonObjectWork.perform(AbstractJsonObjectWork.java:77)
          at com.sonymobile.tools.gerrit.gerritevents.workers.StreamEventsStringWork.perform(StreamEventsStringWork.java:67)
          at com.sonymobile.tools.gerrit.gerritevents.GerritHandler$EventWorker.run(GerritHandler.java:302)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      

      We also see these Gerrit threads:

      "com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater for review-tbs Thread" id=37 (0x25) state=TIMED_WAITING cpu=63%
          - waiting on <0x0b482201> (a com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater)
          - locked <0x0b482201> (a com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater)
          at java.lang.Object.wait(Native Method)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater.waitFor(GerritProjectListUpdater.java:212)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater.run(GerritProjectListUpdater.java:169)
      
      "com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater for review-tbs-dev Thread" id=38 (0x26) state=TIMED_WAITING cpu=63%
          - waiting on <0x2e1418f7> (a com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater)
          - locked <0x2e1418f7> (a com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater)
          at java.lang.Object.wait(Native Method)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater.waitFor(GerritProjectListUpdater.java:212)
          at com.sonyericsson.hudson.plugins.gerrit.trigger.GerritProjectListUpdater.run(GerritProjectListUpdater.java:169)
      

      The full thread dump is attached as thread-dump.txt.
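For anyone who wants to check for this state without reading a full dump, the sketch below lists threads that are waiting inside a given method, using only `Thread.getAllStackTraces()` from the JDK. It is an illustration under assumptions: the class name `ParkedThreadFinder` is hypothetical, and the code only sees threads of the JVM it runs in, so it would have to be executed inside the affected Jenkins process to find the Gerrit workers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Jenkins or the Gerrit Trigger plugin.
class ParkedThreadFinder {

    /**
     * Returns the names of live threads that are in WAITING or TIMED_WAITING
     * state and have a stack frame for className.methodName.
     */
    static List<String> waitingIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread.State state = entry.getKey().getState();
            if (state != Thread.State.WAITING && state != Thread.State.TIMED_WAITING) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(entry.getKey().getName());
                    break;
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Frame name taken from the worker stack trace in this issue.
        // In a standalone JVM no thread has this frame, so the list is empty.
        System.out.println(waitingIn(
                "com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger",
                "waitForProjectListToBeReady"));
    }
}
```

The state and the stack trace are sampled at slightly different moments, so the result is a snapshot rather than an exact picture.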

      1. Workaround
        1. Restart the affected master.
      2. Business impact
        1. No Gerrit builds are triggered and the master must be restarted to resolve the problem.

        1. configDiff.png (41 kB)
        2. CustomTracesAwait.png (8 kB)
        3. thread-dump.txt (373 kB)

          [JENKINS-63000] Gerrit Triggers stop triggering builds and the queue builds up

          Mitch McLaughlin created issue -
          Ryan Campbell made changes -
          Description: reworded the thread dump observation from "There wasn't much that stood out. But there were Gerrit threads sitting in a WAITING state" to "There were four Gerrit threads sitting in a WAITING state", and replaced the business impact statement "On an affected Master, at any moment, Gerrit jobs could stop triggering and it requires downtime to get them running again" with "No Gerrit builds are triggered and the master must be restarted to resolve the problem."
          Ryan Campbell made changes -
          Description: replaced the single-line worker stack trace with the formatted thread dump entry for "Gerrit Worker EventThread_28" (state=WAITING), which no longer includes the line "Locked synchronizers: count = 1 - java.util.concurrent.ThreadPoolExecutor$Worker@6d79c90c", and added the two GerritProjectListUpdater thread traces for review-tbs and review-tbs-dev (both TIMED_WAITING).

          Ryan Campbell added a comment - - edited

          I also see log messages such as these:

          ...
          2020-08-12 21:35:41.656+0000 [id=347882]        INFO    c.s.t.g.g.w.StreamWatchdog#run: Last lively connection with Gerrit was 300 seconds ago; reconnecting.
          2020-08-12 21:35:41.661+0000 [id=1096]  INFO    c.s.h.p.g.t.p.GerritMissedEventsPlaybackManager#connectionDown: connectionDown for server: xxxx-stg
          2020-08-12 21:35:41.755+0000 [id=1096]  WARNING c.s.h.p.g.t.p.GerritMissedEventsPlaybackManager#connectionEstablished: Playback of missed events not supported for server xxxx-stg!
          2020-08-12 21:35:41.755+0000 [id=1096]  INFO    c.s.t.g.g.GerritConnection#run: Ready to receive data from Gerrit: xxxx-stg
          ...
          2020-08-12 21:35:47.356+0000 [id=1095]  INFO    c.s.h.p.g.t.p.GerritMissedEventsPlaybackManager#connectionDown: connectionDown for server: xxxxx-dev
          2020-08-12 21:35:47.476+0000 [id=1095]  WARNING c.s.h.p.g.t.p.GerritMissedEventsPlaybackManager#connectionEstablished: Playback of missed events not supported for server xxxxx-dev!
          2020-08-12 21:35:47.476+0000 [id=1095]  INFO    c.s.t.g.g.GerritConnection#run: Ready to receive data from Gerrit: xxxxx-dev
          ...
          2020-08-12 21:36:09.798+0000 [id=26]    SEVERE  c.s.h.p.g.t.h.GerritTrigger#updateTriggerConfigURL: IOException for project: projectname and URL: https://artifactory.customer.com/artifactory/path/somefile.txt Message: https://artifactory.customer.com/artifactory/path/somefile.txt
          java.io.FileNotFoundException: https://artifactory.customer.com/artifactory/path/somefile.txt
                  at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1896)
                  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
                  at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
                  at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritDynamicUrlProcessor.fetch(GerritDynamicUrlProcessor.java:266)
                  at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.DynamicConfigurationCacheProxy.fetchThroughCache(DynamicConfigurationCacheProxy.java:47)
                  at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger.updateTriggerConfigURL(GerritTrigger.java:1761)
                  at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTriggerTimerTask.run(GerritTriggerTimerTask.java:63)
                  at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
                  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
                  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                  at java.lang.Thread.run(Thread.java:748)
          ...
          2020-08-12 21:37:09.675+0000 [id=1094]  WARNING c.s.t.g.g.GerritHandler#checkQueueSize: The Gerrit incoming events queue contains 40 items! Something might be stuck, or your system can't process the commands fast enough. Try to increase the number of receiving worker threads. Current thread-pool size: 4
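Since the `checkQueueSize` warning is the most visible symptom, one option is to extract the reported queue size from the log and watch whether it keeps rising. The sketch below assumes the message format shown in the log lines quoted in this issue. The class name `QueueWarningParser` is hypothetical and nothing here is part of the plugin.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning a Jenkins log; not part of the plugin.
class QueueWarningParser {

    // Matches the warning text as it appears in the log lines quoted in this issue.
    private static final Pattern QUEUE_SIZE = Pattern.compile(
            "GerritHandler#checkQueueSize: The Gerrit incoming events queue contains (\\d+) items!");

    /** Returns the queue size reported by a checkQueueSize warning, if the line is one. */
    static OptionalInt queueSize(String logLine) {
        Matcher matcher = QUEUE_SIZE.matcher(logLine);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String line = "2020-08-12 21:37:09.675+0000 [id=1094]  WARNING "
                + "c.s.t.g.g.GerritHandler#checkQueueSize: The Gerrit incoming events queue "
                + "contains 40 items! Something might be stuck";
        System.out.println(queueSize(line));
    }
}
```

A size that only ever increases between consecutive warnings would point to stuck workers rather than a temporary burst of events.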
          
          


          Ryan Campbell added a comment -

          Seems similar to JENKINS-56528

          Ryan Campbell made changes -
          Link New: This issue relates to JENKINS-56528 [ JENKINS-56528 ]

          rsandell added a comment -

          Just a guess atm, but it seems like the trigger is waiting for the dynamic project configuration to update before it starts to let events through (com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger.waitForProjectListToBeReady). But the dynamic update fails, so the latch is never opened:

          c.s.h.p.g.t.h.GerritTrigger#updateTriggerConfigURL: IOException for project: projectname and URL: https://artifactory.customer.com/artifactory/path/somefile.txt Message: https://artifactory.customer.com/artifactory/path/somefile.txt
          java.io.FileNotFoundException: https://artifactory.customer.com/artifactory/path/somefile.txt
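
          To make the suspected mechanism concrete, here is a minimal, self-contained Java sketch. It is not the plugin's code: the class LatchSketch and all of its methods are hypothetical stand-ins, and the only thing taken from this issue is the general shape (workers waiting on a CountDownLatch while the configuration fetch fails with FileNotFoundException). It contrasts a latch that is released only on success with one that is released in a finally block.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in the Gerrit Trigger plugin.
class LatchSketch {

    /** Stand-in for the dynamic configuration fetch; always fails, like the missing file in the log. */
    static void fetchConfig() throws IOException {
        throw new FileNotFoundException("somefile.txt");
    }

    /** Suspected pattern: the latch is released only when the fetch succeeds. */
    static CountDownLatch updateReleasingOnSuccessOnly() {
        CountDownLatch ready = new CountDownLatch(1);
        try {
            fetchConfig();
            ready.countDown(); // never reached when fetchConfig() throws
        } catch (IOException e) {
            // error is logged and swallowed; the latch stays closed
        }
        return ready;
    }

    /** Possible mitigation: release the latch whether or not the fetch succeeded. */
    static CountDownLatch updateReleasingAlways() {
        CountDownLatch ready = new CountDownLatch(1);
        try {
            fetchConfig();
        } catch (IOException e) {
            // keep whatever configuration was loaded previously
        } finally {
            ready.countDown();
        }
        return ready;
    }

    /** A worker waits for the latch with a time bound; returns true if it may proceed. */
    static boolean workerCanProceed(CountDownLatch ready, long timeoutMillis) {
        try {
            return ready.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("success-only release: " + workerCanProceed(updateReleasingOnSuccessOnly(), 200));
        System.out.println("always release: " + workerCanProceed(updateReleasingAlways(), 200));
    }
}
```

          In the first variant a worker that called await() without a timeout would sit in WAITING indefinitely, which would be consistent with the thread dump in the description. The bounded await here exists only so the sketch terminates.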


          George Cimpoies added a comment -

          Looping in georgbremer

          aric gardner added a comment -

          Hitting this as well. The Jenkins server streams events from 9 Gerrit servers; the noisiest one lost its connection and could not be reconnected without restarting Jenkins.


          Andrew Grimberg added a comment -

          Just upgraded to version 2.31.0 of the plugin, released a couple of days ago, since I saw that https://github.com/jenkinsci/gerrit-trigger-plugin/pull/417 was merged and I was hoping it would resolve the problem.

          The job queue is completely empty, yet I'm still getting WARNINGs about the incoming events queue, and it keeps growing. Jenkins does appear to be picking up events as they happen on the watched Gerrit systems, though, so I really have no idea what's going on now.

            rsandell rsandell
            mmclaughlin Mitch McLaughlin
            Votes: 3
            Watchers: 11