Jenkins / JENKINS-8700

"Node offline during build" and problems reconnecting.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Duplicate
    • Component/s: core
    • Labels: None
    • Environment: Windows Server 2003 (master - VMware VM)
      Windows XP, SP2 (slave - VMware VM)

    Description

      We have jobs that start normally and check out sources from Subversion without problems, but then die after about 30 seconds of Python execution. The logs suggest that the slave connection fails and the slave attempts to reconnect, but the reconnection is rejected, apparently because the connection is thought to already exist.

      We upgraded from Hudson 1.310 to 1.395, and then to Jenkins 1.396, to resolve similar connection failures (see Hudson issue 5073). With the current version, building is impossible for us.

      Any workarounds or runtime parameters we can try? What could be causing the disconnections?

      Failing job log:
      Testing ENG-RTOS-2DGI-C9
      FATAL: command execution failed
      hudson.util.IOException2: Failed to join the process
      at hudson.Proc$RemoteProc.join(Proc.java:359)
      at hudson.Launcher$ProcStarter.join(Launcher.java:280)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:622)
      at hudson.model.Build$RunnerImpl.build(Build.java:172)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:137)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:420)
      at hudson.model.Run.run(Run.java:1362)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:145)
      Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: socket closed
      at hudson.remoting.Request$1.get(Request.java:218)
      at hudson.remoting.Request$1.get(Request.java:172)
      at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
      at hudson.Proc$RemoteProc.join(Proc.java:351)
      ... 12 more
      Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: socket closed
      at hudson.remoting.Request.abort(Request.java:257)
      at hudson.remoting.Channel.terminate(Channel.java:680)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:971)
      Caused by: java.net.SocketException: socket closed
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(Unknown Source)
      at java.io.BufferedInputStream.fill(Unknown Source)
      at java.io.BufferedInputStream.read(Unknown Source)
      at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      at java.io.ObjectInputStream.readObject0(Unknown Source)
      at java.io.ObjectInputStream.readObject(Unknown Source)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      FATAL: Unable to delete script file C:\DOCUME~1\aditools\LOCALS~1\Temp\hudson53001.bat
      hudson.util.IOException2: remote file operation failed: C:\DOCUME~1\aditools\LOCALS~1\Temp\hudson53001.bat at hudson.remoting.Channel@a7cba9:ctse-test-xp-5
      at hudson.FilePath.act(FilePath.java:752)
      at hudson.FilePath.act(FilePath.java:738)
      at hudson.FilePath.delete(FilePath.java:993)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:622)
      at hudson.model.Build$RunnerImpl.build(Build.java:172)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:137)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:420)
      at hudson.model.Run.run(Run.java:1362)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:145)
      Caused by: hudson.remoting.ChannelClosedException: channel is already closed
      at hudson.remoting.Channel.send(Channel.java:466)
      at hudson.remoting.Request.call(Request.java:105)
      at hudson.remoting.Channel.call(Channel.java:629)
      at hudson.FilePath.act(FilePath.java:745)
      ... 13 more
      Caused by: java.net.SocketException: socket closed
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(Unknown Source)
      at java.io.BufferedInputStream.fill(Unknown Source)
      at java.io.BufferedInputStream.read(Unknown Source)
      at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      at java.io.ObjectInputStream.readObject0(Unknown Source)
      at java.io.ObjectInputStream.readObject(Unknown Source)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      Looks like the node went offline during the build. Check the slave log for the details.
      JNLP agent connected from /10.64.83.182

      FATAL: channel is already closed
      hudson.remoting.ChannelClosedException: channel is already closed
      at hudson.remoting.Channel.send(Channel.java:466)
      at hudson.remoting.Request.call(Request.java:105)
      at hudson.remoting.Channel.call(Channel.java:629)
      at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:441)
      at hudson.model.Run.run(Run.java:1362)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:145)
      Caused by: java.net.SocketException: socket closed
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(Unknown Source)
      at java.io.BufferedInputStream.fill(Unknown Source)
      at java.io.BufferedInputStream.read(Unknown Source)
      at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      at java.io.ObjectInputStream.readObject0(Unknown Source)
      at java.io.ObjectInputStream.readObject(Unknown Source)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
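
The job log above repeats the same nested exception several times. As a rough triage aid, the following Python sketch pulls the "Caused by:" chain out of a pasted trace so the innermost cause is easy to see. It is not part of Jenkins; the names `cause_chain`, `root_cause`, and `SAMPLE_TRACE` are hypothetical, and the sample is an abbreviated copy of the first trace in this report with most stack frames omitted.

```python
import re

# Matches lines such as "Caused by: java.net.SocketException: socket closed".
# The class name ends at the first colon; the rest of the line is the message.
CAUSED_BY = re.compile(r"^\s*Caused by:\s*(?P<cls>[\w.$]+)(?::\s*(?P<msg>.*))?$")


def cause_chain(trace):
    """Return (exception class, message) pairs in the order they appear."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            chain.append((match.group("cls"), match.group("msg") or ""))
    return chain


def root_cause(trace):
    """Return the last "Caused by:" entry, or None if the trace has none."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


# Abbreviated from the failing job log in this report (frames omitted).
SAMPLE_TRACE = """\
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:359)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: socket closed
at hudson.remoting.Request$1.get(Request.java:218)
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: socket closed
at hudson.remoting.Request.abort(Request.java:257)
Caused by: java.net.SocketException: socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
"""

if __name__ == "__main__":
    for cls, msg in cause_chain(SAMPLE_TRACE):
        print(cls, "|", msg)
    print("root cause:", root_cause(SAMPLE_TRACE))
```

For every trace in this job log the innermost cause is the same `java.net.SocketException: socket closed`, which is consistent with the channel to the slave being dropped rather than with a failure inside the build step itself.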

      Jenkins log:

      Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler$1 onClosed
      WARNING: Connection #10 for + ctse-test-xp-5 terminated
      java.net.SocketException: socket closed
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(Unknown Source)
      at java.io.BufferedInputStream.fill(Unknown Source)
      at java.io.BufferedInputStream.read(Unknown Source)
      at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      at java.io.ObjectInputStream.readObject0(Unknown Source)
      at java.io.ObjectInputStream.readObject(Unknown Source)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      Feb 4, 2011 11:28:18 AM hudson.remoting.Channel$ReaderThread run
      SEVERE: I/O error in channel ctse-test-xp-5
      java.net.SocketException: socket closed
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(Unknown Source)
      at java.io.BufferedInputStream.fill(Unknown Source)
      at java.io.BufferedInputStream.read(Unknown Source)
      at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
      at java.io.ObjectInputStream.readObject0(Unknown Source)
      at java.io.ObjectInputStream.readObject(Unknown Source)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:947)
      Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler runJnlp2Connect
      INFO: Disconnecting ctse-test-xp-5 as we are reconnected from the current peer
      Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler run
      INFO: Accepted connection #11 from /10.64.83.182:2467
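
To check how quickly the slave reconnects after each drop, the master log can be scanned for a "terminated" record followed by an "Accepted connection" record. The Python sketch below assumes the two-line record layout seen in this log (a timestamp/class/method line followed by a `LEVEL: message` line) and matches only the message wording shown in this report. The function names and `SAMPLE_LOG` are hypothetical, and the sample omits most stack frames.

```python
import re
from datetime import datetime

# First line of a record: "Feb 4, 2011 11:28:18 AM <class> <method>"
HEADER = re.compile(
    r"^\s*(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (\S+) (\S+)\s*$"
)
# Second line of a record: "WARNING: <message>"
LEVEL = re.compile(r"^\s*(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): (.*)$")
# Message patterns copied from the log in this report.
TERMINATED = re.compile(r"Connection #(\d+) .*terminated")
ACCEPTED = re.compile(r"Accepted connection #(\d+) from (\S+)")


def parse_records(log_text):
    """Pair each header line with the level/message line that follows it."""
    records, pending = [], None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            when = datetime.strptime(header.group(1), "%b %d, %Y %I:%M:%S %p")
            pending = (when, header.group(2), header.group(3))
            continue
        level = LEVEL.match(line)
        if level and pending:
            records.append({"time": pending[0], "source": pending[1],
                            "method": pending[2], "level": level.group(1),
                            "message": level.group(2)})
            pending = None  # stack-trace lines after this are ignored
    return records


def reconnect_gaps(records):
    """Return (closed id, accepted id, seconds between) for each reconnect."""
    gaps, last_closed = [], None
    for record in records:
        closed = TERMINATED.search(record["message"])
        if closed:
            last_closed = (int(closed.group(1)), record["time"])
            continue
        accepted = ACCEPTED.search(record["message"])
        if accepted and last_closed:
            delta = (record["time"] - last_closed[1]).total_seconds()
            gaps.append((last_closed[0], int(accepted.group(1)), delta))
            last_closed = None
    return gaps


# Abbreviated from the Jenkins log in this report (stack frames omitted).
SAMPLE_LOG = """\
Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler$1 onClosed
WARNING: Connection #10 for + ctse-test-xp-5 terminated
java.net.SocketException: socket closed
Feb 4, 2011 11:28:18 AM hudson.remoting.Channel$ReaderThread run
SEVERE: I/O error in channel ctse-test-xp-5
Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler runJnlp2Connect
INFO: Disconnecting ctse-test-xp-5 as we are reconnected from the current peer
Feb 4, 2011 11:28:18 AM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted connection #11 from /10.64.83.182:2467
"""

if __name__ == "__main__":
    for closed_id, accepted_id, seconds in reconnect_gaps(parse_records(SAMPLE_LOG)):
        print(f"connection #{closed_id} closed, #{accepted_id} accepted after {seconds:.0f}s")
```

In the excerpt above, connection #10 is terminated and connection #11 is accepted within the same second, so the log timestamps alone do not show which side closed the socket first.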

          Activity

            msacarny created issue
            lacostej made changes
              Link: (empty) -> This issue is related to JENKINS-5073 [ JENKINS-5073 ]
            Jesse Glick (jglick) made changes
              Link: (empty) -> This issue duplicates JENKINS-1948 [ JENKINS-1948 ]
            Jesse Glick (jglick) made changes
              Resolution: (empty) -> Duplicate [ 3 ]
              Status: Open [ 1 ] -> Resolved [ 5 ]
            R. Tyler Croy (rtyler) made changes
              Workflow: JNJira [ 138844 ] -> JNJira + In-Review [ 188175 ]

            People

              Assignee: Unassigned
              Reporter: msacarny
              Votes: 4
              Watchers: 2
