Jenkins / JENKINS-6009

Jabber 1.7: Hudson (1.352) doesn't restart

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • jabber-plugin
    • None
    • Linux samos 2.6.28-16-server #55-Ubuntu SMP Tue Oct 20 20:37:10 UTC 2009 x86_64 GNU/Linux

    Description

      After upgrading the Jabber plugin from 1.6 to 1.7, Hudson (1.352) doesn't restart.
      The reload page is shown in the browser, but the reload never completes.
      There is no error in hudson.log.
      I have to kill -9 the JVM because gentler methods don't work.
      Before upgrade, all other plugins were up to date and running well, especially instant-messaging.
      Régis

      Attachments

        Activity

          rdesgroppes Régis Desgroppes created issue -
          kutzi kutzi added a comment -

          Could you take a stacktrace of Hudson when it hangs?
          kill -3 <pid> or jstack

          rdesgroppes Régis Desgroppes added a comment - - edited

          There is no stdout/stderr with:
          $ sudo kill -3 <pid>

          With jstack, the output is not that useful:
          $ jstack -F -m -l <pid>
          Attaching to process ID <pid>, please wait...
          Error attaching to process: sun.jvm.hotspot.debugger.DebuggerException: Can't attach to the process

          ... Or maybe it has to be run under sudo too. However, before I restart our continuous integration server again (its uptime is celebrated at my company), there may be useful information in hudson.log, where the thread state was dumped: several jobs are indeed still attempting to create Jabber connections after something like 1 minute.

          Please have a look at attached hudson.log.

          Cheers,
          Regis.
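
A note on the commands above: kill -3 (SIGQUIT) makes a HotSpot JVM write its thread dump to the JVM's own stdout, not to the terminal that sent the signal. That would plausibly explain why nothing appeared in the shell while the thread state showed up in hudson.log. Once a dump is available, a small filter can list the threads whose stacks mention a given package. The following is only a sketch: it assumes a HotSpot-style text dump with a blank line between threads, the function name threads_mentioning is hypothetical, and SAMPLE_DUMP is invented for illustration (it is not taken from the attached log).

```python
import re


def threads_mentioning(dump_text, needle):
    """Return the names of threads whose stack frames contain `needle`.

    Assumes each thread block starts with a quoted thread name and that
    blocks are separated by blank lines (HotSpot-style dump).
    """
    names = []
    for block in re.split(r"\n\s*\n", dump_text.strip()):
        lines = block.splitlines()
        header = re.match(r'^"([^"]+)"', lines[0]) if lines else None
        # Only look at the frames below the header line.
        if header and any(needle in line for line in lines[1:]):
            names.append(header.group(1))
    return names


# Invented sample, for illustration only.
SAMPLE_DUMP = '''
"Executor #0" prio=10 tid=0x01 nid=0x11 runnable
   java.lang.Thread.State: RUNNABLE
\tat java.net.PlainSocketImpl.socketConnect(Native Method)
\tat org.jivesoftware.smack.XMPPConnection.connect(XMPPConnection.java)

"Ping thread" prio=10 tid=0x02 nid=0x12 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (sleeping)
\tat java.lang.Thread.sleep(Native Method)
'''
```

Calling threads_mentioning(SAMPLE_DUMP, "org.jivesoftware.smack") on the invented sample selects only the thread that is inside a Smack connect call, which is the kind of thread the log is said to show.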

          rdesgroppes Régis Desgroppes made changes -
          Field Original Value New Value
          Attachment hudson.log [ 19263 ]
          kutzi kutzi added a comment - - edited

          (edited: forget it, the stacktrace is in the hudson.log you appended)

          kutzi kutzi added a comment -

          Some questions:

          • what Jabber server are you trying to connect to?
          • is the server using TLS, is TLS required?
          • is there a Firewall somewhere between Hudson and the Jabber server?
          rdesgroppes Régis Desgroppes added a comment -

          1. what Jabber server are you trying to connect to?
          > gmail.com/googlemail.com
          2. is the server using TLS, is TLS required?
          > no
          3. is there a Firewall somewhere between Hudson and the Jabber server?
          > yes, but the rules are not that strict. And our other Jabber clients seem to work well (gtalk, empathy... and plugin 1.6).
          kutzi kutzi added a comment -

          Please try v1.8. It contains a fix for another GoogleTalk connection issue (JENKINS-6018), maybe it helps in your case, too.

          kutzi kutzi added a comment -

          This is related to JENKINS-4346.
          I suspect that Hudson is trying to connect on some firewall-protected port and is then waiting for a timeout (how long did you wait before killing the process?).
          Something in Smack seems to have changed that results in this behaviour.

          Could you check which ports Hudson (or rather the Jabber bot) is trying to connect to, and whether one of them is firewall-protected? lsof, for example, is a good tool for this.

          kutzi kutzi added a comment -

          Are you still seeing this issue?

          rdesgroppes Régis Desgroppes added a comment -

          Yes, none of Jabber plugin 1.7, 1.8 or 1.9 works here. So, we're still using version 1.6.
          kutzi kutzi added a comment -

          > Could you check which ports Hudson (resp. Jabber bot) is trying to connect to and if one of the ports is a firewall-protected one? lsof is e.g. a good tool to do this.

          I'm sorry that it still doesn't work for you, but without further information I will soon close this bug as "cannot reproduce".

          rdesgroppes Régis Desgroppes added a comment -

          instant-messaging 1.7 & jabber 1.6 work well. ~10 seconds after startup, we have:
          rdesgroppes@samos:~$ sudo lsof -P -p 2868 -a -i | grep -v build-node
          COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
          java 2868 hudson 102u IPv6 327856119 TCP *:8080 (LISTEN)
          java 2868 hudson 151u IPv6 327856316 TCP *:37665 (LISTEN)
          java 2868 hudson 154u IPv6 327856318 UDP *:33848
          java 2868 hudson 162u IPv6 327856320 UDP *:5353
          java 2868 hudson 194u IPv6 327856287 TCP samos.acrolinx.local:40630->ww-in-f125.1e100.net:5222 (ESTABLISHED)

          instant-messaging 1.8 & jabber 1.10 don't work (here). ~10 seconds after startup, we have:
          rdesgroppes@samos:~$ sudo lsof -P -p 2366 -a -i | grep -v build-node
          COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
          java 2366 hudson 102u IPv6 327853797 TCP *:8080 (LISTEN)
          java 2366 hudson 104u IPv6 327853858 TCP *:54211 (LISTEN)
          java 2366 hudson 107u IPv6 327853860 UDP *:33848
          java 2366 hudson 155u IPv6 327853862 UDP *:5353
          java 2366 hudson 171u IPv6 327854293 TCP samos.acrolinx.local:57848->ey-in-f18.1e100.net:5222 (SYN_SENT)
          java 2366 hudson 176u IPv6 327854289 TCP samos.acrolinx.local:57847->ey-in-f18.1e100.net:5222 (SYN_SENT)

          Let me know what else I could try.
          Thanks,
          Régis
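
The difference between the two listings is the state of the connection to port 5222: ESTABLISHED in the working setup, SYN_SENT in the failing one. SYN_SENT means the initial SYN went out but no reply has come back yet. As a rough sketch (the function name stuck_connections is hypothetical, and the parsing assumes the lsof -i layout shown above, with the state in parentheses as the last field), such sockets can be picked out of lsof output like this:

```python
def stuck_connections(lsof_output):
    """Return (local, remote) pairs for TCP sockets still in SYN_SENT."""
    stuck = []
    for line in lsof_output.splitlines():
        fields = line.split()
        # For connected sockets, NAME is "local->remote" followed by "(STATE)".
        if len(fields) >= 2 and fields[-1] == "(SYN_SENT)" and "->" in fields[-2]:
            local, remote = fields[-2].split("->", 1)
            stuck.append((local, remote))
    return stuck


# Lines copied from the two lsof listings in this comment.
WORKING = """\
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2868 hudson 102u IPv6 327856119 TCP *:8080 (LISTEN)
java 2868 hudson 194u IPv6 327856287 TCP samos.acrolinx.local:40630->ww-in-f125.1e100.net:5222 (ESTABLISHED)
"""

FAILING = """\
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2366 hudson 102u IPv6 327853797 TCP *:8080 (LISTEN)
java 2366 hudson 171u IPv6 327854293 TCP samos.acrolinx.local:57848->ey-in-f18.1e100.net:5222 (SYN_SENT)
java 2366 hudson 176u IPv6 327854289 TCP samos.acrolinx.local:57847->ey-in-f18.1e100.net:5222 (SYN_SENT)
"""

for local, remote in stuck_connections(FAILING):
    print(f"no reply yet: {local} -> {remote}")
```

Note that the failing setup is talking to a different remote host (ey-in-f18) than the working one (ww-in-f125), although both use port 5222.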

          kutzi kutzi added a comment -

          Thanks for the feedback.

          You could additionally do an lsof on ww-in-f125.1e100.net (I suppose that's your Jabber server)
          and then check whether any of the corresponding ports are protected by some firewall rule.

          rdesgroppes Régis Desgroppes added a comment -

          No, that's an external server (.1e100.net belongs to Google)
          $ ping gmail.com
          kutzi kutzi added a comment - - edited

          Hmm, I don't have many ideas left for how to help you.

          • have you tried to disable SASL authentication in the global config page?
          • SYN_SENT seems to indicate that there are connection problems at the TCP level. Maybe you have a network admin at hand who can help you debug this? This is definitely outside my field of knowledge.
          • maybe you could try to ask at the Smack forum http://www.igniterealtime.org/community/community/developers/smack if someone there knows about problems connecting to GoogleTalk in Smack 3.1

          BTW: with Jabber plugin v1.10 Hudson does restart, right? 'Only' the Jabber connection doesn't work.

          rdesgroppes Régis Desgroppes added a comment -

          Yes!

          So the problem is now solved. Here is everything I did:
          1. upgraded Hudson Jabber notifier plugin to 1.10 (w/ Hudson instant-messaging plugin 1.8)
          2. forced server to: talk.google.com [I didn't see this notice in the wiki page before]
          3. disabled SASL authentication
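
Since the earlier symptom was sockets stuck in SYN_SENT, one quick way to compare candidate servers is to check whether a plain TCP connection to host:port completes within a few seconds. This is a minimal sketch, not what the plugin or Smack does internally; the helper name can_connect and the 5-second default timeout are assumptions made for the example.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) completes in time.

    A timeout, a refused connection, or a DNS failure all yield False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.timeout and socket.gaierror are both subclasses of OSError.
        return False
```

For instance, can_connect("talk.google.com", 5222) could be run from the Hudson host to see whether the forced server is reachable on the standard XMPP client port. A False result only says the handshake did not complete; it does not say which firewall or hop is responsible.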

          I think we can close this issue.

          Thanks for all your input.
          Régis

          rdesgroppes Régis Desgroppes added a comment -

          Fixed since version 1.8, see: http://wiki.jenkins-ci.org/pages/viewpage.action?pageId=753777
          rdesgroppes Régis Desgroppes made changes -
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]
          rtyler R. Tyler Croy made changes -
          Workflow JNJira [ 136102 ] JNJira + In-Review [ 203855 ]

          People

            kutzi kutzi
            rdesgroppes Régis Desgroppes