JENKINS-25676

Disconnect from XMPP after a few minutes

Details

    • Bug
    • Status: Resolved (View Workflow)
    • Major
    • Resolution: Fixed
    • jabber-plugin
    • None
    • Jenkins 1.580.1

    Description

      Jenkins seems to disconnect from the XMPP server after a few minutes. I can get it to reconnect shortly after re-configuring the system, but it always disconnects soon after. If I can provide more useful information I'd be happy to.

      From the system log:

      Nov 18, 2014 5:17:45 PM INFO hudson.plugins.jabber.im.transport.JabberIMConnection createConnection
      Trying to connect to XMPP on /svc-eng-jenkins
      Nov 18, 2014 5:17:45 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:17:45 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:17:45 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:17:45 PM INFO hudson.plugins.jabber.im.transport.JabberIMConnection setupSubscriptionMode
      Accepting all subscription requests
      Nov 18, 2014 5:17:45 PM WARNING hudson.plugins.jabber.im.transport.JabberIMConnection createVCardIfNeeded
      internal-server-error
      	at org.jivesoftware.smack.PacketCollector.nextResultOrThrow(PacketCollector.java:196)
      	at org.jivesoftware.smack.PacketCollector.nextResultOrThrow(PacketCollector.java:175)
      	at org.jivesoftware.smackx.vcardtemp.packet.VCard.save(VCard.java:528)
      	at hudson.plugins.jabber.im.transport.JabberIMConnection.createVCard(JabberIMConnection.java:591)
      	at hudson.plugins.jabber.im.transport.JabberIMConnection.createVCardIfNeeded(JabberIMConnection.java:535)
      	at hudson.plugins.jabber.im.transport.JabberIMConnection.createConnection(JabberIMConnection.java:431)
      	at hudson.plugins.jabber.im.transport.JabberIMConnection.connect(JabberIMConnection.java:189)
      	at hudson.plugins.jabber.im.transport.JabberIMConnectionProvider.createConnection(JabberIMConnectionProvider.java:42)
      	at hudson.plugins.im.IMConnectionProvider.create(IMConnectionProvider.java:65)
      	at hudson.plugins.im.IMConnectionProvider.access$600(IMConnectionProvider.java:22)
      	at hudson.plugins.im.IMConnectionProvider$ConnectorRunnable.run(IMConnectionProvider.java:183)
      	at java.lang.Thread.run(Thread.java:745)
      
      Nov 18, 2014 5:17:45 PM INFO hudson.plugins.jabber.im.transport.JabberIMConnection connect
      Connected to XMPP on null:0/[redacted].com using secure connection
      Nov 18, 2014 5:17:45 PM INFO hudson.plugins.jabber.im.transport.JabberIMConnection connect
      Joined groupchat jenkins@conference.[redacted].com
      Nov 18, 2014 5:17:48 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:18:12 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:20:16 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:20:24 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      Nov 18, 2014 5:23:28 PM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
      Failed to parse extension packet in Presence packet.
      

      From the FINEST log:

      Nov 18, 2014 5:27:23 PM FINE hudson.plugins.jabber.im.transport.JabberConnectionDebugger
      RECV: </stream:stream>
      Nov 18, 2014 5:27:23 PM FINE hudson.plugins.jabber.im.transport.JabberConnectionDebugger
      SENT: <presence id='TJYtN-19' type='unavailable'></presence>
      Nov 18, 2014 5:27:23 PM FINE hudson.plugins.jabber.im.transport.JabberConnectionDebugger
      SENT: </stream:stream>
      Nov 18, 2014 5:27:23 PM FINE hudson.plugins.jabber.im.transport.JabberConnectionDebugger
      Connection closed
      

      Attachments

        1. error.log
          1.32 MB
        2. im-client.log
          0.1 kB
        3. im-server.log
          120 kB
        4. jenkins_im_redacted.txt
          208 kB
        5. jenkins.log
          1 kB

        Activity

          kutzi kutzi added a comment -

          I've disabled the creation of avatar/vCard in the plugin, which should fix this immediate problem.
          Not really a satisfactory solution - I should fix the creation of the vCard instead - but I don't have time for that, currently.

          flow Florian Schmaus added a comment - - edited

          @Jordan Is it possible to show us the vCard stanza that Smack sends to the server? The interesting part of the logs is actually just before the part you showed us. One could assume that the server sends the 'internal-server-error', and subsequently closes the connection, because of the stanza. Some logs from the server side may also help to determine the underlying issue.

          I have no reports that Smack 4's vCard implementation is broken in this regard, but I'm of course eager to resolve such issues.

          spikerjenk2 Jordan Spiker added a comment -

          @Florian

          I'm uploading as much as I can get from the FINEST level log. Please let me know if I can provide more info. Note that I have [redact]ed sensitive information.

          spikerjenk2 Jordan Spiker added a comment -

          It is unlikely that I would be able to get logs from the IM server any time soon. Our sysadmins are very busy with higher priority tasks. If that is required to help fix this issue then I can hopefully get some logs in the next week or two.


          flow Florian Schmaus added a comment -

          Thanks, Jordan, for your detailed feedback. Reading the logs and reviewing the issue again, I begin to doubt that this is related to vCard in any way. At least there is no proof that this is caused by the vCard code. The only hint towards vCard appears in the stack trace, but this could be pure coincidence. After all, it's the server that closed the connection. The server logs from the relevant time period would surely be helpful.

          So I don't think that the internal-server-error exception is related to the connection termination at all. What likely happened is that createVCard in JabberIMConnection:537 threw an internal-server-error XMPPException, but this does not mean that the connection will get closed. It only means that the server responded to the vCard IQ request from the client with an IQ error. The timestamps also strengthen this thesis: the XMPPException is at 5:17:45, but the closing stream elements are sent at 5:27:23.

          @Jordan I think the answer, or at least helpful information about what is going on, is to be found in the server logs (under the assumption that it's the server that is terminating the connection; the timestamps give no clear indication of who sends the closing stream element first).

          @kutzi I recommend reverting the related change and reopening this issue.
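The timestamp argument can be checked with a few lines of Java. This is only a sketch: the class name `LogGap` and the helper `gap` are made up for illustration, and the two times are the ones quoted from the logs in this issue (the WARNING from createVCardIfNeeded and the "Connection closed" entry).

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class LogGap {
    // Matches the time-of-day part of the Jenkins system log, e.g. "5:17:45 PM".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("h:mm:ss a", Locale.US);

    // Hypothetical helper: elapsed time between two same-day log timestamps.
    static Duration gap(String earlier, String later) {
        LocalTime start = LocalTime.parse(earlier, LOG_TIME);
        LocalTime end = LocalTime.parse(later, LOG_TIME);
        return Duration.between(start, end);
    }

    public static void main(String[] args) {
        // vCard exception vs. stream close, as quoted in this issue.
        Duration d = gap("5:17:45 PM", "5:27:23 PM");
        System.out.println(d.toMinutes() + " min " + (d.getSeconds() % 60) + " s");
        // prints "9 min 38 s"
    }
}
```

A gap of nearly ten minutes between the IQ error and the stream close is hard to reconcile with the error being the direct cause of the disconnect.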
          kutzi kutzi added a comment -

          @flo when the createVCard method throws an exception the connection gets closed. At least from the jabber-plugin point-of-view.


          flow Florian Schmaus added a comment -

          "createVCard method throws an exception the connection gets closed."

          No, that is not what we are seeing in this issue. The exception is caught and logged at https://github.com/jenkinsci/jabber-plugin/blob/master/src/main/java/hudson/plugins/jabber/im/transport/JabberIMConnection.java#L540
          As I tried to explain: there is no evidence that the XMPPException thrown by createVCard is causing the connection termination. In fact, the termination shown in the log happens 10 minutes after the exception is thrown. I guess you are a bit biased towards vCard because of JENKINS-25515.

          With the information the issue currently contains, it's not even clear which side, client or server, sends the closing stream element first, i.e. terminates the stream.
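The distinction between "an exception was logged" and "the connection was closed" can be illustrated with a minimal, self-contained sketch. It is not the plugin's code: `CaughtErrorDemo`, `optionalStep` and `connect` are hypothetical names, and the only assumption carried over from the discussion is that the exception is caught and logged rather than propagated.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CaughtErrorDemo {
    private static final Logger LOGGER = Logger.getLogger(CaughtErrorDemo.class.getName());

    // Hypothetical stand-in for an optional setup step that the server rejects.
    static void optionalStep() throws Exception {
        throw new Exception("internal-server-error");
    }

    // Returns true if the connection is still considered open after setup.
    static boolean connect() {
        boolean connected = true; // pretend the login itself already succeeded
        try {
            optionalStep();
        } catch (Exception e) {
            // Logged at WARNING with a stack trace; the connection state is not touched.
            LOGGER.log(Level.WARNING, e.getMessage(), e);
        }
        return connected;
    }

    public static void main(String[] args) {
        System.out.println("still connected: " + connect());
    }
}
```

Under that assumption a WARNING with a stack trace shows up in the log while the session carries on, which matches the "Connected to XMPP" and "Joined groupchat" entries that follow the stack trace in the system log.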
          kutzi kutzi added a comment -

          Sorry Florian, you're right of course.
          I completely was not paying attention to the details.

          spikerjenk2 Jordan Spiker added a comment -

          All,

          I installed the updated Jabber plugin (1.30) and I have not had the disconnect issue since. There are a number of these warnings in the log, however:

          Nov 24, 2014 10:48:00 AM WARNING org.jivesoftware.smack.util.PacketParserUtils parsePresence
          
          Failed to parse extension packet in Presence packet.
          
          spikerjenk2 Jordan Spiker added a comment -

          Well after a half hour of working OK, Jabber disconnected in the same way as before. Please disregard my previous comment.


          flow Florian Schmaus added a comment -

          Those warnings are non-fatal; the faulty packet extension in the presence packet will simply be ignored.

          @Jordan Did you have a chance to get the server logs?
          spikerjenk2 Jordan Spiker added a comment -

          @Florian

          Here's some of our logs.

          im-client.log: this is just a copy/paste of the lines that were shown in the IM client
          im-server.log: this is a copy of all data the IM server received related to that service account user. Other user names and such were genericized, but 'svc-eng-jenkins' (the account in question) is still visible.
          jenkins.log: this is what was logged by jenkins

          If there's anything else I can provide please let me know.


          flow Florian Schmaus added a comment -

          I see that you are using Openfire. What's relevant is not the in/out stanzas logged by the server, but the general log messages that the server produces. For Openfire this means the debug, info, warn and error logs. They may contain a hint as to why the connection is closed.
          spikerjenk2 Jordan Spiker added a comment -

          I'm relaying one of our sysadmins:

          ===

          Yes, we are running an older version of openfire.

          The info and debug logs contained nothing from the time period.

          Warning only contained the following:
          2014.11.25 11:55:29 Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
          2014.11.25 11:55:29 Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.

          But it isn't repeated for the other times when this happened so it seems unrelated.

          The error log has an incredible number of errors that repeat at an insane rate, so much so that I don't have the data available from the original time period any more. However, recreating the event (same symptoms, ~6 min disconnect) seems to show nothing relevant (attached file); the errors appear to be related to when we stripped out many of our LDAP groups. There is the mention of one non-existent user in a group, but this is not the user we are concerned with.

          Would increasing verbosity perhaps give something more useful?

          flow Florian Schmaus added a comment - - edited

          This is going a bit outside the realm of the Jenkins jabber-plugin issue tracker. To summarize how the situation appears to me: either the server or the client disconnects the XMPP connection. I know Smack: it would tell you why it did so, and this would appear in the Jenkins log (of the jabber-plugin). I believe Openfire would also log a message if it terminated the connection for some reason. But Openfire, being a server, also logs many other things, so it's like looking for a needle in a haystack.

          spikerjenk2 Jordan Spiker added a comment -

          I am somewhat happy to say that we've figured this out.

          See here: https://pidgin.im/pipermail/tracker/2010-March/060960.html

          Our older openfire server will disconnect after 6 minutes. This doesn't affect pidgin users because of the above patch. Jenkins, however, does not send keepalive packets and is disconnected after 6 minutes of inactivity.

          As a workaround, I made a job that has Jenkins IM itself every 5 minutes.

          Is it possible to add a keepalive, similar to Pidgin, to Jabber?


          flow Florian Schmaus added a comment -

          HipChat has a similar behavior, addressed in JENKINS-25222. Unfortunately, that fix queries the service name for 'hipchat': https://github.com/jenkinsci/jabber-plugin/commit/cd9eaf5fa28877fc3d2e42339219168ee3f0c0a3#diff-f633f26543fa04f4d2bad3e2c23d6fc3R442

          @kutzi I recommend re-opening this issue, changing its title to "Add support for automated server pings", removing the server-specific code added in JENKINS-25222 and adding a generic setting for server pings using existing Smack code: https://www.igniterealtime.org/builds/smack/docs/latest/javadoc/org/jivesoftware/smackx/ping/PingManager.html

          I recommend a default ping interval of 5 minutes, but users should be able to enter a longer ping interval (30 min, 1 h).
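The idea of a configurable ping interval can be sketched with the Java standard library alone. This is an illustration, not the plugin's or Smack's implementation: `KeepAlive`, `keepsConnectionAlive` and `schedulePings` are hypothetical names, the `sendPing` callback stands in for whatever actually sends the XMPP ping (Smack's PingManager, linked above, is the mechanism suggested in this thread), and the 6-minute idle timeout is the value reported for the Openfire server in this issue.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class KeepAlive {
    // Default suggested in this thread; meant to be user-configurable.
    static final Duration DEFAULT_PING_INTERVAL = Duration.ofMinutes(5);

    // True when pings are sent more often than the server's idle timeout,
    // so the server never observes a full idle window.
    static boolean keepsConnectionAlive(Duration pingInterval, Duration serverIdleTimeout) {
        return pingInterval.compareTo(serverIdleTimeout) < 0;
    }

    // Runs the (hypothetical) sendPing callback repeatedly at the given interval.
    static ScheduledFuture<?> schedulePings(ScheduledExecutorService scheduler,
                                            Runnable sendPing, Duration interval) {
        long millis = interval.toMillis();
        return scheduler.scheduleAtFixedRate(sendPing, millis, millis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        Duration serverIdleTimeout = Duration.ofMinutes(6); // timeout reported in this issue
        System.out.println("5 min pings sufficient: "
                + keepsConnectionAlive(DEFAULT_PING_INTERVAL, serverIdleTimeout));
        System.out.println("30 min pings sufficient: "
                + keepsConnectionAlive(Duration.ofMinutes(30), serverIdleTimeout));

        // Demo with a short interval so the program finishes quickly.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threePings = new CountDownLatch(3);
        ScheduledFuture<?> task =
                schedulePings(scheduler, threePings::countDown, Duration.ofMillis(50));
        boolean sent = threePings.await(2, TimeUnit.SECONDS);
        task.cancel(false);
        scheduler.shutdown();
        System.out.println("three pings sent: " + sent);
    }
}
```

A 5-minute default stays under the 6-minute timeout seen here, while a longer interval such as 30 minutes would only be suitable for servers with a more generous idle timeout, which is why the interval is worth exposing as a setting.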

          flow Florian Schmaus added a comment -

          Created a PR to address this issue: https://github.com/jenkinsci/jabber-plugin/pull/13

          @Jordan Are you able to build the plugin from the PR and test it?
          kutzi kutzi added a comment -

          Jordan, Florian, great work in finding out the root cause!

          I've fixed this in a slightly different way than in the PR and uploaded a build with the fix to:
          https://dl.dropboxusercontent.com/u/25863594/jabber.hpi

          Jordan, would be great if you could test it, before I do a release.

          spikerjenk2 Jordan Spiker added a comment -

          Thank you so much for the help with this one. I'll be able to test on Monday when I'm back in the office.

          kutzi kutzi added a comment -

          @spikerjenk2 did you find the time to test this, already?

          spikerjenk2 Jordan Spiker added a comment -

          @kutzi sorry I haven't, I've had some high priority things going on. I will be able to test this next week.

          spikerjenk2 Jordan Spiker added a comment -

          All,

          I finally found the time to install and test this. Ironically it was right after you released 1.31. I can verify it works great! Thank you so much for the help.

          kutzi kutzi added a comment -

          Thanks for testing


          People

            kutzi kutzi
            spikerjenk2 Jordan Spiker