JENKINS-28175

Config change deadlocks Jenkins when PircBotX.shutdown() is invoked

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Component/s: ircbot-plugin
    • Labels:
      None
    • Environment:
      jenkins 1.596.2
      ircbot-plugin 2.26

      Description

      We recently upgraded the IRC plugin from 2.25 to 2.26. On a configuration change, the ircbot plugin invokes PircBotX.shutdown(). For some reason that call never finishes, and the configuration change stalls.

      A side effect is that jobs sending notifications end up blocked, waiting for an instance of the IRC connection provider. The only fix is to restart Jenkins entirely.

      Our downstream bug has a few more details (https://phabricator.wikimedia.org/T96183), and a full thread dump is available at https://phabricator.wikimedia.org/P584

      Here are the blocked threads:

      Two jobs are blocked:

      "Executor #2 for integration-slave-trusty-1016 : executing browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce #234" prio=5 BLOCKED
      "Executor #1 for integration-slave-trusty-1012 : executing browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce #494" prio=5 BLOCKED
      
      	hudson.plugins.ircbot.v2.IRCConnectionProvider.getInstance(IRCConnectionProvider.java:14)
      	hudson.plugins.ircbot.IrcPublisher.getIMConnection(IrcPublisher.java:102)
      	hudson.plugins.im.IMPublisher.sendNotification(IMPublisher.java:374)
      	hudson.plugins.im.IMPublisher.notifyChatsOnBuildEnd(IMPublisher.java:585)
      	hudson.plugins.im.IMPublisher.notifyOnBuildEnd(IMPublisher.java:304)
      	hudson.plugins.im.IMPublisher.perform(IMPublisher.java:291)
              ...
      

      A configuration submit change is blocked as well:

      "Handling POST /ci/configSubmit from X.X.X.X : RequestHandlerThread[#1683]" daemon prio=5 WAITING
      	sun.misc.Unsafe.park(Native Method)
      	java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
      	java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
      	java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
      	org.pircbotx.Channel.getMode(Channel.java:127)
      	org.pircbotx.Channel.getModeArgument(Channel.java:182)
      	org.pircbotx.Channel.getChannelKey(Channel.java:239)
      	org.pircbotx.PircBotX.shutdown(PircBotX.java:2872)
      	hudson.plugins.ircbot.v2.IRCConnection.close(IRCConnection.java:102)
      	hudson.plugins.im.IMConnectionProvider.releaseConnection(IMConnectionProvider.java:92)
      	hudson.plugins.ircbot.v2.IRCConnectionProvider.setDesc(IRCConnectionProvider.java:19)
      	hudson.plugins.ircbot.IrcPublisher$DescriptorImpl.configure(IrcPublisher.java:336)
      	jenkins.model.Jenkins.configureDescriptor(Jenkins.java:2915)
      	jenkins.model.Jenkins.doConfigSubmit(Jenkins.java:2878)
              ...
      

      Some other related threads:

      "JenkinsIsBusyListener-thread" daemon prio=5 BLOCKED
       hudson.plugins.im.IMConnectionProvider.currentConnection(IMConnectionProvider.java:83)
       hudson.plugins.im.JenkinsIsBusyListener.setStatus(JenkinsIsBusyListener.java:118)
       hudson.plugins.im.JenkinsIsBusyListener.updateIMStatus(JenkinsIsBusyListener.java:109)
       hudson.plugins.im.JenkinsIsBusyListener.access$000(JenkinsIsBusyListener.java:20)
       hudson.plugins.im.JenkinsIsBusyListener$3.run(JenkinsIsBusyListener.java:98)
       java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
       java.util.concurrent.FutureTask.run(FutureTask.java:262)
       java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
       java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:745)
      
      "IM-Reconnector-Thread" daemon prio=5 BLOCKED
       hudson.plugins.im.IMConnectionProvider$ConnectorRunnable.run(IMConnectionProvider.java:175)
       java.lang.Thread.run(Thread.java:745)
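
The states above are consistent with one thread parking on a latch while it holds a monitor that the other threads need. The following is a minimal sketch of that pattern, assuming (the traces alone do not confirm it) that the provider methods involved are `synchronized`. `ProviderLockDemo` and all of its method names are hypothetical and are not the plugin's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection provider; not the plugin's real code.
class ProviderLockDemo {
    private final CountDownLatch replyReceived = new CountDownLatch(1);

    // Like the config thread: holds the monitor while waiting for a reply.
    synchronized void reconfigure() throws InterruptedException {
        replyReceived.await(); // parks until countDown() is called
    }

    // Like the executor threads: needs the same monitor.
    synchronized String currentConnection() {
        return "connection";
    }

    void replyArrived() {
        replyReceived.countDown();
    }

    // Polls until the thread reaches the wanted state or five seconds pass.
    static boolean waitForState(Thread t, Thread.State wanted) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (t.getState() != wanted) {
            if (System.nanoTime() > deadline) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    // Returns the observed thread states as "<config>/<executor>".
    static String observeStates() {
        ProviderLockDemo provider = new ProviderLockDemo();
        Thread config = new Thread(() -> {
            try {
                provider.reconfigure();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "config-submit");
        Thread executor = new Thread(provider::currentConnection, "executor");
        config.setDaemon(true);
        executor.setDaemon(true);
        try {
            config.start();
            waitForState(config, Thread.State.WAITING);
            executor.start();
            waitForState(executor, Thread.State.BLOCKED);
            String states = config.getState() + "/" + executor.getState();
            provider.replyArrived(); // without this call both threads stay stuck
            config.join();
            executor.join();
            return states;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observeStates());
    }
}
```

In this sketch the waiting thread is only released because `replyArrived()` is eventually called; in the situation reported here, nothing appears to release the latch.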
      

      Looking at the git changelog, I noticed https://github.com/jenkinsci/ircbot-plugin/commit/98b0105a743d062abf957c285cdded06fedd3fa3 which makes the following change in IRCConnection.close:

      - this.pircConnection.disconnect();
      + this.pircConnection.shutdown(true);
      

      Which was done to fix a leak (JENKINS-25349).
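
One general way to keep a slow or hanging close from blocking every other caller is to detach the connection while holding the lock and close it afterwards, outside the lock. The sketch below only illustrates that technique; it is not the change that was made to the plugin, and `DetachingProvider` and `SlowConnection` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical provider that never calls close() while holding its monitor.
class DetachingProvider {
    // Hypothetical connection whose close() may take a long time.
    interface SlowConnection {
        void close();
    }

    private SlowConnection current;

    synchronized void install(SlowConnection connection) {
        current = connection;
    }

    synchronized boolean hasConnection() {
        return current != null;
    }

    void release() {
        SlowConnection old;
        synchronized (this) { // hold the monitor only to swap the reference
            old = current;
            current = null;
        }
        if (old != null) {
            old.close(); // potentially slow call runs outside the monitor
        }
    }

    // Returns true if hasConnection() could be answered while close() was still running.
    static boolean answersDuringClose() {
        CountDownLatch closeStarted = new CountDownLatch(1);
        CountDownLatch letCloseFinish = new CountDownLatch(1);
        DetachingProvider provider = new DetachingProvider();
        provider.install(() -> {
            closeStarted.countDown();
            try {
                letCloseFinish.await(5, TimeUnit.SECONDS); // simulate a slow close
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread closer = new Thread(provider::release, "closer");
        closer.setDaemon(true);
        closer.start();
        try {
            boolean started = closeStarted.await(5, TimeUnit.SECONDS);
            boolean detached = !provider.hasConnection(); // monitor is free here
            letCloseFinish.countDown();
            closer.join();
            return started && detached;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(answersDuringClose());
    }
}
```

The trade-off is that callers may briefly see no connection while the old one is still closing, so whether this fits depends on how the provider is used.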

            Activity

            kutzi kutzi added a comment -

            This is quite difficult to reproduce. It seems to happen only when the channel mode is stale while PircBotX is shut down, and a stale mode seems to occur only when the server returns something for the mode that PircBotX cannot parse.
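
The failure mode described here, a getter that waits for a server reply that is never parsed, can be illustrated with a latch-guarded value. The sketch below is not PircBotX code: `LazyMode`, `onModeReply`, and the timeout values are hypothetical. It shows how a bounded `await` turns an indefinite hang into a timeout the caller can handle.

```java
import java.util.Optional;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for a value that arrives asynchronously from a server.
class LazyMode {
    private final CountDownLatch replyParsed = new CountDownLatch(1);
    private volatile String mode;

    // Called by the reader thread once a mode reply has been parsed.
    void onModeReply(String parsedMode) {
        mode = parsedMode;
        replyParsed.countDown();
    }

    // Unbounded variant: blocks forever if no reply is ever parsed.
    String getModeBlocking() throws InterruptedException {
        replyParsed.await();
        return mode;
    }

    // Bounded variant: returns an empty Optional once the timeout elapses.
    Optional<String> getMode(long timeout, TimeUnit unit) {
        try {
            return replyParsed.await(timeout, unit)
                    ? Optional.ofNullable(mode)
                    : Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        LazyMode stale = new LazyMode(); // no reply ever arrives
        System.out.println(stale.getMode(100, TimeUnit.MILLISECONDS).isPresent());

        LazyMode fresh = new LazyMode();
        fresh.onModeReply("+nt");
        System.out.println(fresh.getMode(100, TimeUnit.MILLISECONDS).orElse("unknown"));
    }
}
```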

            kutzi kutzi added a comment -

            Opened an issue against PircBotX: https://code.google.com/p/pircbotx/issues/detail?id=240
            Not sure if the issue persists in PircBotX 2.x as a lot has changed there.
            Trying to update to PircBotX 2.0.1

            kutzi kutzi added a comment -

            I've created a plugin build that uses PircBotX 2.0.1, which should fix the issue. It would be great if you could test it before I build a new release:

            https://dl.dropboxusercontent.com/u/25863594/ircbot.hpi

            kutzi kutzi added a comment -

            Antoine, any chance you can try the unreleased IRC plugin?

            davehunt Dave Hunt added a comment -

            We are seeing this too, but I'd like to avoid trying an unreleased plugin on our production Jenkins instance. We do have a staging instance that until today did not use the IRC plugin. I have it running now and if I manage to replicate the issue I'll try the unreleased plugin to see if that resolves it.

            davehunt Dave Hunt added a comment -

            I was able to replicate this on our staging instance, so I've installed the unreleased plugin, and will see if this appears to fix the issue.

            hashar Antoine Musso added a comment -

            kutzi wrote:
            > Antoine, any chance you can try the unreleased IRC plugin?

            Unlikely, the issue caused much havoc on the system and I can reproduce it on my dev instance. Moreover I have no idea how to reproduce it on the prod instance :-/

            That being said, I found out that when stopping Jenkins, it ends up waiting for org.pircbotx.Channel.getMode(). That might be the same root cause. You can find a full stack trace at https://phabricator.wikimedia.org/T98976 , and I can file it as another bug if it is unrelated.

            kutzi wrote:
            > https://dl.dropboxusercontent.com/u/25863594/ircbot.hpi

            I guess I can just build it from https://github.com/jenkinsci/ircbot-plugin/commit/d18cc7b617155100f8afadb73b324f378c5661da and maybe grab as well the latest instant-messaging-plugin https://github.com/jenkinsci/ircbot-plugin/commit/a78e066fc57001168a8ffc1893a3de2c91a1d518 :-}

            davehunt Dave Hunt added a comment -

            I can still replicate this issue using the unreleased version of the plugin, but the behaviour is different. After submitting a global config change for the first time, the bot didn't quit IRC (it usually does, promptly). The config change appeared to stick, however a second config change took a really long time to respond. Eventually it did, and the second config change was processed, but in the meantime the bot quit IRC with a ping timeout.

            Looking through the logs, it appears that the bot had been trying to rejoin the channel but was failing due to the nick already being in use. After the bot timed out, it was then able to rejoin. For us, this is an improvement as previously the bug was taking our Jenkins instance offline, but it does appear that there's still an issue.

            Also, the IRC logging appears to be much more verbose - I'm not sure if this is intentional.

            thelq Leon Blakey added a comment -

            Hi, I'm the PircBotX developer. Can someone please check if this is still an issue on PircBotX 2.1-SNAPSHOT? There have been several changes since then that should fix this. I'd like to verify this works before I do the final release.

            thelq Leon Blakey added a comment -

            I've done the needed changes in my repo at https://github.com/TheLQ/ircbot-plugin . It seems to work on my end with my basic testing. hpi downloadable here: https://www.dropbox.com/s/hog8sxf48ann6mj/ircbot.hpi?dl=0

            >lqjenkins< CTCP VERSION
            lqjenkins VERSION PircBotX 2.1-SNAPSHOT-a6c801f Java IRC bot - pircbotx.googlecode.com
            <lqjenkins> Project jlouis build #115: SUCCESS in 20 sec: http://myserver:8080/job/jlouis/115/
            <thelq> !jenkins status jlouis
            <lqjenkins> jlouis: last build: 115 (16 min ago): SUCCESS: http://myserver:8080/job/jlouis/115/

            Please let me know if you have any issues

            davehunt Dave Hunt added a comment -

            I've installed the latest snapshot on our staging server. I'll report back when I have tested it some more.

            davehunt Dave Hunt added a comment -

            As far as I can tell there is still an issue with the latest snapshot, however it doesn't seem to be as severe. Jenkins becomes unresponsive for a while and the IRC bot appears to disconnect after a timeout, but everything appears to recover without the need to restart Jenkins.

            kutzi kutzi added a comment - - edited

            Leon Blakey: can you release PircBotX 2.1, so I can create a release of the Jenkins plugin?

            thelq Leon Blakey added a comment -

            I'm still nervous about the "Jenkins becomes unresponsive for a while and the IRC bot appears to disconnect" part. I'm stumped though on what could cause it.

            Any backtraces of when that happens would help.
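
For anyone who wants to capture such backtraces from inside the JVM, the standard `java.lang.management` API can produce them. The sketch below is a generic example, not something the plugin provides; `ThreadDumpSketch` is a hypothetical name. Note that `findDeadlockedThreads()` only reports cycles of threads waiting on monitors or ownable synchronizers, so a thread parked on a latch that is never counted down, as in the original trace, would probably not be reported by it even though it shows up in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Generic thread-dump helper built on the standard management API.
class ThreadDumpSketch {
    // Renders every live thread with its state and full stack trace.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("\tat ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Number of threads the JVM itself considers deadlocked (cycles only).
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isSynchronizerUsageSupported()) {
            return 0; // findDeadlockedThreads() is unavailable on this JVM
        }
        long[] ids = threads.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println(dump());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```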

            davehunt Dave Hunt added a comment -

            After changing the main Jenkins configuration, the bot doesn't immediately leave IRC. This is from the thread dumps:

            IM-Reconnector-Thread

            "IM-Reconnector-Thread" Id=45 Group=main TIMED_WAITING on java.util.concurrent.CountDownLatch$Sync@5f86fe67
                at sun.misc.Unsafe.park(Native Method)
                -  waiting on java.util.concurrent.CountDownLatch$Sync@5f86fe67
                at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
                at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
                at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
                at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
                at hudson.plugins.ircbot.v2.IRCConnection.connect(IRCConnection.java:208)
                at hudson.plugins.ircbot.v2.IRCConnectionProvider.createConnection(IRCConnectionProvider.java:37)
                -  locked hudson.plugins.ircbot.v2.IRCConnectionProvider@2e8f0466
                at hudson.plugins.im.IMConnectionProvider.create(IMConnectionProvider.java:59)
                -  locked hudson.plugins.ircbot.v2.IRCConnectionProvider@2e8f0466
                at hudson.plugins.im.IMConnectionProvider.access$500(IMConnectionProvider.java:16)
                at hudson.plugins.im.IMConnectionProvider$ConnectorRunnable.run(IMConnectionProvider.java:165)
                -  locked hudson.plugins.ircbot.v2.IRCConnectionProvider@2e8f0466
                at java.lang.Thread.run(Thread.java:722)

            IRC Bot

            "IRC Bot" Id=8963 Group=main RUNNABLE (in native)
                at java.net.SocketInputStream.socketRead0(Native Method)
                at java.net.SocketInputStream.read(SocketInputStream.java:150)
                at java.net.SocketInputStream.read(SocketInputStream.java:121)
                at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
                at sun.security.ssl.InputRecord.read(InputRecord.java:480)
                at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
                -  locked java.lang.Object@43feecf1
                at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
                at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
                -  locked sun.security.ssl.AppInputStream@6f135402
                at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
                at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
                at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
                -  locked java.io.InputStreamReader@5a82dbd8
                at java.io.InputStreamReader.read(InputStreamReader.java:184)
                at java.io.BufferedReader.fill(BufferedReader.java:154)
                at java.io.BufferedReader.readLine(BufferedReader.java:317)
                -  locked java.io.InputStreamReader@5a82dbd8
                at java.io.BufferedReader.readLine(BufferedReader.java:382)
                at org.pircbotx.PircBotX.startLineProcessing(PircBotX.java:353)
                at org.pircbotx.PircBotX.connect(PircBotX.java:337)
                at org.pircbotx.PircBotX.startBot(PircBotX.java:187)
                at hudson.plugins.ircbot.v2.IRCConnection$2.run(IRCConnection.java:199)

            Then the bot leaves IRC with:

            webqatestbot2 has left IRC (Connection closed)

            The following is from Jenkins logs:

            ERROR :Closing link: (webqatestbo@nat-qa.scl3.mozilla.com) [Registration timeout]
            Aug 24, 2015 2:06:39 PM INFO org.pircbotx.PircBotX startLineProcessing
            Socket is closed, stopping read loop and shutting down
            Aug 24, 2015 2:08:18 PM WARNING hudson.plugins.ircbot.v2.IRCConnection connect
            Time out waiting for connecting to irc
            Aug 24, 2015 2:08:18 PM INFO hudson.plugins.im.IMConnectionProvider$ConnectorRunnable run
            Reconnect failed. Next connection attempt in 1 minutes
            Aug 24, 2015 2:09:06 PM WARNING javax.jmdns.impl.DNSIncoming readAnswer
            There was an OPT answer. Not currently handled. Option code: 65002 data: 6E9AC0F5A9646969
            Aug 24, 2015 2:09:18 PM INFO hudson.plugins.ircbot.v2.IRCConnection connect
            Connecting to irc.mozilla.org:6697 as webqatestbot2 using charset UTF-8
            Aug 24, 2015 2:09:18 PM INFO org.pircbotx.PircBotX connect
            --Starting Connect attempt 1/5--
            Aug 24, 2015 2:09:18 PM INFO org.pircbotx.PircBotX connect
            Connected to server.

            The bot then successfully reconnects to IRC:

            webqatestbot2 has joined (webqatestbo@moz-8i38gl.scl3.mozilla.com)

            thelq Leon Blakey added a comment -

            ERROR :Closing link: (webqatestbo@nat-qa.scl3.mozilla.com) [Registration timeout]

            I think that's causing the issue. The server says it doesn't have enough information to finish connecting the bot, so the bot doesn't dispatch a ConnectEvent, and the ircbot plugin blocks and then times out after 2 minutes waiting for that ConnectEvent ( https://github.com/TheLQ/ircbot-plugin/blob/master/src/main/java/hudson/plugins/ircbot/v2/IRCConnection.java#L208 )

            I tried to reproduce this with my Jenkins server on irc.mozilla.org, even registered it with nickserv, but didn't run into any issues.

            Dave, do you see any errors from the IRC server before that line? If not, can you paste any IRC logs between the "Starting Connect attempt 1/5" line and the "ERROR :Closing link" line you posted?
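
The wait described in this comment, where the caller blocks until a connect event arrives or a timeout expires, could in principle also fail fast when the server rejects the connection. The sketch below illustrates that idea with a `CompletableFuture`; it is not the plugin's or PircBotX's implementation, and `ConnectWaiter`, `onConnectEvent`, and `onServerError` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper that waits for a connection to succeed, fail, or time out.
class ConnectWaiter {
    enum Outcome { CONNECTED, REJECTED, TIMED_OUT }

    private final CompletableFuture<Void> connected = new CompletableFuture<>();

    // Hypothetical hook: call when the connect event is seen.
    void onConnectEvent() {
        connected.complete(null);
    }

    // Hypothetical hook: call when the server sends an ERROR line.
    void onServerError(String line) {
        connected.completeExceptionally(new IllegalStateException(line));
    }

    // Blocks until one of the hooks fires or the timeout elapses.
    Outcome await(long timeout, TimeUnit unit) {
        try {
            connected.get(timeout, unit);
            return Outcome.CONNECTED;
        } catch (ExecutionException e) {
            return Outcome.REJECTED; // server error arrived before the timeout
        } catch (TimeoutException e) {
            return Outcome.TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.TIMED_OUT;
        }
    }

    public static void main(String[] args) {
        ConnectWaiter rejected = new ConnectWaiter();
        rejected.onServerError("ERROR :Closing link: [Registration timeout]");
        // Returns immediately instead of waiting out the two minutes.
        System.out.println(rejected.await(2, TimeUnit.MINUTES));
    }
}
```

With this shape, a rejection like the "Registration timeout" line would release the waiter right away, while a silent server would still be covered by the timeout.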

            davehunt Dave Hunt added a comment -

            Leon Blakey I'm unable to replicate this unless some time has passed since the initial IRC connection. The first comment mentions something about a stale mode, which might be related? The log is pretty verbose as the version I have seems to echo all IRC conversation. I'm not sure if that's expected, but it's not ideal.

            davehunt Dave Hunt added a comment -

            Any updates on this? We're still seeing it on our main Jenkins instance.

            thelq Leon Blakey added a comment -

            PircBotX 2.1 was released last night.

            davehunt Dave Hunt added a comment -

            kutzi could we get a new plugin release to see if it addresses this issue?

            kutzi kutzi added a comment -

            Sorry, I missed the last updates here.

            Leon Blakey: since you already made the needed changes on your fork, can you update your fork to use the released version 2.1 and open a pull request?

            hashar Antoine Musso added a comment -

            From the discussion on https://github.com/jenkinsci/ircbot-plugin/commit/d18cc7b617155100f8afadb73b324f378c5661da (which bumps Pircbotx to 2.0.1) the deadlock might be solved by Pircbotx 2.1.

            It would be nice to have a commit that bumps the dependency.

            schristou Steven Christou added a comment -

            Please make sure to update to the latest version of the ircbot plugin, which uses version 2.1 of PircBotX. If this does not solve your issue, please attach a new thread dump so the deadlock can be diagnosed.

            hashar Antoine Musso added a comment -

            I am not sure I have seen that issue again with 2.27 / pircbotx 2.0.1, so maybe that fixed it.

            hashar Antoine Musso added a comment -

            Has not happened again since I have upgraded to 2.27 / pircbotx 2.0.1


              People

              Assignee:
              kutzi kutzi
              Reporter:
              hashar Antoine Musso
              Votes:
              3
              Watchers:
              8
