Type: Bug
Resolution: Fixed
Priority: Critical
jenkins 1.596.2
ircbot-plugin 2.26
We recently upgraded the IRC plugin from 2.25 to 2.26. On a configuration change, the ircbot plugin invokes PircBotX.shutdown(). For some reason that call never finishes, so the configuration change stalls.
As a side effect, jobs sending notifications end up blocked waiting for an instance of the IRC connection provider. The only fix is to restart Jenkins entirely.
Our own bug report has more details at https://phabricator.wikimedia.org/T96183 and a full thread dump is attached at https://phabricator.wikimedia.org/P584
Here are the blocked threads:
Two jobs are blocked:
"Executor #2 for integration-slave-trusty-1016 : executing browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce #234" prio=5 BLOCKED
"Executor #1 for integration-slave-trusty-1012 : executing browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce #494" prio=5 BLOCKED
    hudson.plugins.ircbot.v2.IRCConnectionProvider.getInstance(IRCConnectionProvider.java:14)
    hudson.plugins.ircbot.IrcPublisher.getIMConnection(IrcPublisher.java:102)
    hudson.plugins.im.IMPublisher.sendNotification(IMPublisher.java:374)
    hudson.plugins.im.IMPublisher.notifyChatsOnBuildEnd(IMPublisher.java:585)
    hudson.plugins.im.IMPublisher.notifyOnBuildEnd(IMPublisher.java:304)
    hudson.plugins.im.IMPublisher.perform(IMPublisher.java:291)
    ...
The configuration submit request is stuck as well, waiting inside PircBotX.shutdown():
"Handling POST /ci/configSubmit from X.X.X.X : RequestHandlerThread[#1683]" daemon prio=5 WAITING
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
    org.pircbotx.Channel.getMode(Channel.java:127)
    org.pircbotx.Channel.getModeArgument(Channel.java:182)
    org.pircbotx.Channel.getChannelKey(Channel.java:239)
    org.pircbotx.PircBotX.shutdown(PircBotX.java:2872)
    hudson.plugins.ircbot.v2.IRCConnection.close(IRCConnection.java:102)
    hudson.plugins.im.IMConnectionProvider.releaseConnection(IMConnectionProvider.java:92)
    hudson.plugins.ircbot.v2.IRCConnectionProvider.setDesc(IRCConnectionProvider.java:19)
    hudson.plugins.ircbot.IrcPublisher$DescriptorImpl.configure(IrcPublisher.java:336)
    jenkins.model.Jenkins.configureDescriptor(Jenkins.java:2915)
    jenkins.model.Jenkins.doConfigSubmit(Jenkins.java:2878)
    ...
Some other related threads:
"JenkinsIsBusyListener-thread" daemon prio=5 BLOCKED
    hudson.plugins.im.IMConnectionProvider.currentConnection(IMConnectionProvider.java:83)
    hudson.plugins.im.JenkinsIsBusyListener.setStatus(JenkinsIsBusyListener.java:118)
    hudson.plugins.im.JenkinsIsBusyListener.updateIMStatus(JenkinsIsBusyListener.java:109)
    hudson.plugins.im.JenkinsIsBusyListener.access$000(JenkinsIsBusyListener.java:20)
    hudson.plugins.im.JenkinsIsBusyListener$3.run(JenkinsIsBusyListener.java:98)
    java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    java.util.concurrent.FutureTask.run(FutureTask.java:262)
    java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)
"IM-Reconnector-Thread" daemon prio=5 BLOCKED
    hudson.plugins.im.IMConnectionProvider$ConnectorRunnable.run(IMConnectionProvider.java:175)
    java.lang.Thread.run(Thread.java:745)
Looking at the git changelog, I noticed https://github.com/jenkinsci/ircbot-plugin/commit/98b0105a743d062abf957c285cdded06fedd3fa3 which makes this change in IRCConnection.close:
- this.pircConnection.disconnect();
+ this.pircConnection.shutdown(true);
That change was made to fix a leak (JENKINS-25349).
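For illustration only, here is a minimal standalone sketch of the hang pattern seen in the traces (a close call parked on a CountDownLatch that nobody counts down) together with one possible way to bound it. The class BoundedClose and the method closeWithTimeout are hypothetical names, not part of ircbot-plugin or PircBotX, and this is not the fix that was applied to the plugin.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not plugin code: bounds a close() that may never return.
public class BoundedClose {

    /**
     * Runs a potentially blocking close action on a helper thread and gives up
     * after the timeout. Returns true only if the action completed in time.
     */
    static boolean closeWithTimeout(Runnable closeAction, long timeout, TimeUnit unit)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true); // do not keep the JVM alive if close hangs
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck close
            return false;
        } catch (ExecutionException e) {
            return false; // close threw; the caller can still move on
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the wait seen in Channel.getMode(): a latch nobody counts down.
        CountDownLatch modeReply = new CountDownLatch(1);
        Runnable hangingClose = () -> {
            try {
                modeReply.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean closed = closeWithTimeout(hangingClose, 200, TimeUnit.MILLISECONDS);
        System.out.println("close completed: " + closed); // prints "close completed: false"
    }
}
```

With something along these lines, the thread that triggers the close would get control back after the timeout instead of waiting forever, so any lock it holds would be released and the BLOCKED threads above could proceed.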
is related to:
JENKINS-25349 Leaking ircbot OutputThreads (Resolved)
JENKINS-22042 IRCBot disconnects after long idle (Open)
ERROR :Closing link: (webqatestbo@nat-qa.scl3.mozilla.com) [Registration timeout]
I think that's causing the issue. The server says it doesn't have enough information to finish connecting the bot, so the bot never dispatches a ConnectEvent, and irc-plugin blocks and then times out after 2 minutes waiting for that ConnectEvent ( https://github.com/TheLQ/ircbot-plugin/blob/master/src/main/java/hudson/plugins/ircbot/v2/IRCConnection.java#L208 )
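To make that concrete, a bounded wait for a connect event is commonly written with a latch and a timed await, roughly as sketched below. ConnectWait, onConnect and awaitConnect are hypothetical names used only for this example; the plugin's actual code at the linked line may be structured differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a bounded wait for a connect event; not plugin code.
public class ConnectWait {
    private final CountDownLatch connected = new CountDownLatch(1);

    /** Would be called by an event listener once the server confirms the connection. */
    void onConnect() {
        connected.countDown();
    }

    /** Waits for onConnect(); returns false if it did not arrive within the timeout. */
    boolean awaitConnect(long timeout, TimeUnit unit) throws InterruptedException {
        return connected.await(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        ConnectWait wait = new ConnectWait();
        // Simulates the server closing the link before registration finishes:
        // onConnect() is never called, so the wait runs into its timeout.
        boolean ok = wait.awaitConnect(100, TimeUnit.MILLISECONDS);
        System.out.println("connected: " + ok); // prints "connected: false"
    }
}
```

If the server drops the link during registration, a wait like this only returns once the timeout expires, which would match the 2 minute delay described above.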
I tried to reproduce this with my Jenkins server on irc.mozilla.org, even registered it with nickserv, but didn't run into any issues.
Dave, do you see any errors from the IRC server before that line? If not, can you paste any IRC logs between the "Starting Connect attempt 1/5" line and the "ERROR: Closing link" line you posted?