EOFException on reconnect after network connection loss

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component: core
    • Platform: All, OS: Windows XP

      In our network, the connection repeatedly drops without warning but is
      available again right away. When that happens, I only see the
      "Connected to.." popup in the Windows taskbar; that is how I noticed the
      connection loss. I don't know why this happens (the network guys couldn't
      give me an answer yet), but it's usually no problem during work.

      In such a case, a Hudson slave (running on a computer where this happens)
      terminates (the Hudson slave window shows state "Terminated"). Once the network
      connection is available again, the slave tries to reconnect to the master,
      but it fails. The Hudson slave window shows state "Handshaking" and then an
      EOFException occurs. See the following stack trace:

      java.io.EOFException: unexpected stream termination
      at hudson.remoting.Channel.<init>(Channel.java:312)
      at hudson.remoting.Channel.<init>(Channel.java:251)
      at hudson.remoting.Channel.<init>(Channel.java:239)
      at hudson.remoting.Engine.run(Engine.java:159)

      To reproduce the problem, simply deactivate your LAN connection in the
      Windows control panel (you will see the Hudson slave window showing state
      "Terminated") and then re-activate it (then you see state "Handshaking" and the
      EOFException).

      As this is easily reproducible for me, I suppose it is not a problem with our
      network infrastructure, but a bug in the Hudson slave implementation when
      reconnecting to the master.
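
The "unexpected stream termination" in the stack above is consistent with the peer closing the stream while the slave is still reading its handshake data. The following is only a minimal sketch of that general failure mode, not the hudson.remoting code: the class name `PreambleReader` and the 4-byte preamble are made up for illustration. It relies on the standard behaviour of `DataInputStream.readFully`, which throws `EOFException` when the stream ends before the requested bytes arrive.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical illustration; not the hudson.remoting implementation.
class PreambleReader {
    // Hypothetical 4-byte preamble that the reader expects from its peer.
    static final byte[] PREAMBLE = {'P', 'I', 'N', 'G'};

    /**
     * Returns true if the peer sent exactly the expected preamble.
     * Throws EOFException if the stream ends before the preamble is complete.
     */
    static boolean readPreamble(InputStream in) throws IOException {
        byte[] buf = new byte[PREAMBLE.length];
        // readFully throws EOFException if the stream ends before buf is filled.
        new DataInputStream(in).readFully(buf);
        return Arrays.equals(buf, PREAMBLE);
    }

    public static void main(String[] args) throws IOException {
        // Peer sends the full preamble.
        System.out.println(readPreamble(new ByteArrayInputStream(PREAMBLE)));
        try {
            // Peer closes the connection without sending anything.
            readPreamble(new ByteArrayInputStream(new byte[0]));
        } catch (EOFException e) {
            System.out.println("EOFException: stream ended during handshake");
        }
    }
}
```

In this sketch the exception says only that the stream ended, not why the other side closed it, which matches the lack of detail in the error dialog.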

          [JENKINS-3889] EOFException on reconnect after network connection loss

          fantasmic added a comment -

          Created an attachment (id=745)
          Hudson slave windows & error dialog with EOFException

          Kohsuke Kawaguchi added a comment -

          Thanks for the instructions to reproduce the problem.

          I believe this is happening because the master failed to notice in a timely
          fashion that the connection went down, so subsequent connection attempts from
          the slave get rejected (and we aren't reporting this problem nicely to the plugin).

          You can verify this by checking what the master says about the slave status
          after you have disabled the network.

          This is a bug at multiple levels...

          SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/hudson/main/core/src/main/java/hudson/TcpSlaveAgentListener.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Engine.java
          trunk/www/changelog.html
          http://fisheye4.cenqua.com/changelog/hudson/?cs=22037
          Log:
          JENKINS-3889 JNLP clients now report the reason when the connection is rejected by the master.
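
To illustrate what "report the reason" can look like on the client side, here is a rough sketch. It is not the code from Engine.java or TcpSlaveAgentListener.java: the class `GreetingCheck`, the one-line reply format, and the "Welcome" success greeting are all assumptions made for this example. The idea is that the client reads the master's reply after the handshake and, if it is not the success greeting, surfaces the reply text instead of failing later with a bare EOFException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch; the greeting text and class name are assumptions,
// not taken from Engine.java or TcpSlaveAgentListener.java.
class GreetingCheck {
    static final String GREETING_OK = "Welcome"; // assumed success greeting

    /**
     * Reads the master's one-line reply and throws an IOException carrying
     * the reason if the reply is missing or is not the success greeting.
     */
    static void expectWelcome(BufferedReader fromMaster) throws IOException {
        String reply = fromMaster.readLine();
        if (reply == null) {
            throw new IOException("master closed the connection without a reply");
        }
        if (!GREETING_OK.equals(reply)) {
            throw new IOException("connection rejected by master: " + reply);
        }
    }

    public static void main(String[] args) {
        // Simulated reply from a master that still believes the old connection is alive.
        String simulatedReply = "agent-1 is already connected to this master\n";
        try {
            expectWelcome(new BufferedReader(new StringReader(simulatedReply)));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a reply like the simulated one, the user would see why the reconnect failed rather than only that the stream ended.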

          SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/hudson/main/core/src/main/java/hudson/slaves/ConnectionActivityMonitor.java
          http://fisheye4.cenqua.com/changelog/hudson/?cs=22039
          Log:
          JENKINS-3889 In 1.325, I added ping support so that the master detects dead connections more rapidly. This feature is off by default, until we feel comfortable enough that this feature works correctly.
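
The general idea behind ping-based detection can be sketched as follows. This is not the ConnectionActivityMonitor implementation: the class `PingMonitor`, its methods, and the timeout value are hypothetical. The sketch assumes the master records the time of the last reply from a slave and treats the connection as dead once the silence exceeds a timeout. The clock is injected so the logic can be exercised without sleeping.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; not the ConnectionActivityMonitor implementation.
class PingMonitor {
    private final LongSupplier clockMillis; // injected clock, so tests need not sleep
    private final long timeoutMillis;       // assumed: silence longer than this means "dead"
    private long lastReplyMillis;

    PingMonitor(LongSupplier clockMillis, long timeoutMillis) {
        this.clockMillis = clockMillis;
        this.timeoutMillis = timeoutMillis;
        this.lastReplyMillis = clockMillis.getAsLong();
    }

    /** Call whenever a ping reply (or any other traffic) arrives from the peer. */
    void onReply() {
        lastReplyMillis = clockMillis.getAsLong();
    }

    /** True if nothing has been heard from the peer for longer than the timeout. */
    boolean isDead() {
        return clockMillis.getAsLong() - lastReplyMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        long[] now = {0};
        PingMonitor monitor = new PingMonitor(() -> now[0], 10_000);
        now[0] = 5_000;   // 5 s of silence, within the timeout
        System.out.println(monitor.isDead());
        now[0] = 20_000;  // 20 s of silence, beyond the timeout
        System.out.println(monitor.isDead());
    }
}
```

Once a monitor like this reports the connection as dead, the master could drop the stale connection, so that a reconnect attempt from the slave is no longer rejected as a duplicate.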

          Atiq Rahman added a comment -

          Is there a permanent fix for this issue? I am seeing this all of a sudden on our slave test machines. Our slave machines were working just fine all this time, but out of nowhere we started to notice this issue.

          Any help would be appreciated!

          Thanks,

          Atiq

          Daniel Beck added a comment -

          Given the age of this issue and the large number of changes since this was reported, I'm resolving this as 'Cannot Reproduce'.

          If this or a similar issue still occurs on recent Jenkins versions, please file a new issue. Provide as much information about your setup as you can.

            Assignee: Unassigned
            Reporter: fantasmic
            Votes: 2
            Watchers: 3