Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-39289

Better diagnostics on "Backing channel is disconnected"

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Minor Minor
    • remoting
    • None

      Often, the code that handles lost connection and report that problem is different from code that uses proxies that go over this channel. Example stack trace:

      15:44:26 java.io.IOException: Backing channel is disconnected.
      15:44:26        at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:184)
      15:44:26        at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:249)
      15:44:26        at com.sun.proxy.$Proxy104.isAlive(Unknown Source)
      15:44:26        at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:996)
      15:44:26        at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:166)
      15:44:26        at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:853)
      15:44:26        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
      15:44:26        at hudson.model.Run.execute(Run.java:1738)
      15:44:26        at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:531)
      15:44:26        at hudson.model.ResourceController.execute(ResourceController.java:98)
      15:44:26        at hudson.model.Executor.run(Executor.java:410)
      

      The error diagnostics is easier if the cause of the channel loss is linked to this exception.

          [JENKINS-39289] Better diagnostics on "Backing channel is disconnected"

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/RemoteInvocationHandler.java
          http://jenkins-ci.org/commit/remoting/89a1774918d1e2adba41ae7ae4ba34f095dc2676
          Log:
          JENKINS-39289 When a proxy fails, report what caused the channel to go down

          Today, this requires out of bound knowledge about which connection the
          proxy was representing, then use other means to figure out why it has
          failed.

          This exception chaining shortens that step and makes it easy to find the
          cause as to why the channel was shut down.

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: src/main/java/hudson/remoting/Channel.java src/main/java/hudson/remoting/RemoteInvocationHandler.java http://jenkins-ci.org/commit/remoting/89a1774918d1e2adba41ae7ae4ba34f095dc2676 Log: JENKINS-39289 When a proxy fails, report what caused the channel to go down Today, this requires out of bound knowledge about which connection the proxy was representing, then use other means to figure out why it has failed. This exception chaining shortens that step and makes it easy to find the cause as to why the channel was shut down.

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/RemoteInvocationHandler.java
          http://jenkins-ci.org/commit/remoting/0d8a2af7cffa6aa5a6f2675e810b286a984af04e
          Log:
          JENKINS-39289 When a proxy fails, report what caused the channel to go down (#128)

          Today, this requires out of bound knowledge about which connection the
          proxy was representing, then use other means to figure out why it has
          failed.

          This exception chaining shortens that step and makes it easy to find the
          cause as to why the channel was shut down.

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: src/main/java/hudson/remoting/Channel.java src/main/java/hudson/remoting/RemoteInvocationHandler.java http://jenkins-ci.org/commit/remoting/0d8a2af7cffa6aa5a6f2675e810b286a984af04e Log: JENKINS-39289 When a proxy fails, report what caused the channel to go down (#128) Today, this requires out of bound knowledge about which connection the proxy was representing, then use other means to figure out why it has failed. This exception chaining shortens that step and makes it easy to find the cause as to why the channel was shut down.

          Oleg Nenashev added a comment -

          The fix has been integrated towards remoting-3.1

          Oleg Nenashev added a comment - The fix has been integrated towards remoting-3.1

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          http://jenkins-ci.org/commit/jenkins/7a948d399585d201c4132597aed5723a495acf69
          Log:
          Update remoting to 2.31 in the Jenkins core. (#2628)

          The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements.

          Bugfixes:

          Improvements:

          SCM/JIRA link daemon added a comment - Code changed in jenkins User: Oleg Nenashev Path: pom.xml http://jenkins-ci.org/commit/jenkins/7a948d399585d201c4132597aed5723a495acf69 Log: Update remoting to 2.31 in the Jenkins core. (#2628) The change introduces one serious bugfix ( JENKINS-39596 ) and a bunch of various diagnostics improvements. Bugfixes: JENKINS-39596 ( https://issues.jenkins-ci.org/browse/JENKINS-39596 ) - Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`. It was causing connection failures of Jenkins JNLP agents when using Java Web Start. ( PR #131 ( https://github.com/jenkinsci/remoting/pull/131 )) JENKINS-39617 ( https://issues.jenkins-ci.org/browse/JENKINS-39617 ) - `hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed. ( PR #131 ( https://github.com/jenkinsci/remoting/pull/131 )) Improvements: JENKINS-39150 ( https://issues.jenkins-ci.org/browse/JENKINS-39150 ) - Add logic for dumping diagnostics across all the channels. ( PR #122 ( https://github.com/jenkinsci/remoting/pull/122 ), PR #125 ( https://github.com/jenkinsci/remoting/pull/125 )) JENKINS-39543 ( https://issues.jenkins-ci.org/browse/JENKINS-39543 ) - Improve the caller/callee correlation diagnostics in thread dumps. ( PR #119 ( https://github.com/jenkinsci/remoting/pull/119 )) JENKINS-39290 ( https://issues.jenkins-ci.org/browse/JENKINS-39290 ) - Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes). ( PR #123 ( https://github.com/jenkinsci/remoting/pull/123 )) JENKINS-38692 ( https://issues.jenkins-ci.org/browse/JENKINS-38692 ) - Add extra logging to help diagnosing `IOHub` Thread spikes. ( PR #116 ( https://github.com/jenkinsci/remoting/pull/116 )) JENKINS-39289 ( https://issues.jenkins-ci.org/browse/JENKINS-39289 ) - When a proxy fails, report what caused the channel to go down. ( PR #128 ( https://github.com/jenkinsci/remoting/pull/128 ))

          Oleg Nenashev added a comment -

          The fix has been integrated towards jenkins-2.31

          Oleg Nenashev added a comment - The fix has been integrated towards jenkins-2.31

          Jan Duris added a comment - - edited

          Sometimes (randomly), my tests are failing with this error. Tests runs on docker nodes. Jenkins (or docker plugin) cancel docker node where tests are executed too early. Jenkins job can't  finish because executor is terminated.

          Jenkins version: 2.100 | Docker Plugin version: 1.1.2

           
          FATAL: command execution failed
          Command close created at
          at hudson.remoting.Command.<init>(Command.java:62)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1220)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1218)
          at hudson.remoting.Channel.close(Channel.java:1391)
          at hudson.remoting.Channel.close(Channel.java:1358)
          at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1225)
          Caused: hudson.remoting.Channel$OrderlyShutdown
          at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1226)
          at hudson.remoting.Channel$1.handle(Channel.java:561)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:87)
          Caused: java.io.IOException: Backing channel 'docker-4e1dcb53088a' is disconnected.
          at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:212)
          at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281)
          at com.sun.proxy.$Proxy87.isAlive(Unknown Source)
          at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
          at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
          at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
          at hudson.model.Build$BuildExecution.build(Build.java:206)
          at hudson.model.Build$BuildExecution.doRun(Build.java:163)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
          at hudson.model.Run.execute(Run.java:1727)
          at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
          at hudson.model.ResourceController.execute(ResourceController.java:97)
          at hudson.model.Executor.run(Executor.java:429)

          Jan Duris added a comment - - edited Sometimes (randomly), my tests are failing with this error. Tests runs on docker nodes. Jenkins (or docker plugin) cancel docker node where tests are executed too early. Jenkins job can't  finish because executor is terminated. Jenkins version: 2.100 | Docker Plugin version: 1.1.2   FATAL: command execution failed Command close created at at hudson.remoting.Command.<init>(Command.java:62) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1220) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1218) at hudson.remoting.Channel.close(Channel.java:1391) at hudson.remoting.Channel.close(Channel.java:1358) at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1225) Caused: hudson.remoting.Channel$OrderlyShutdown at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1226) at hudson.remoting.Channel$1.handle(Channel.java:561) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:87) Caused: java.io.IOException: Backing channel 'docker-4e1dcb53088a' is disconnected. at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:212) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281) at com.sun.proxy.$Proxy87.isAlive(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129) at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744) at hudson.model.Build$BuildExecution.build(Build.java:206) at hudson.model.Build$BuildExecution.doRun(Build.java:163) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) at hudson.model.Run.execute(Run.java:1727) at hudson.matrix.MatrixRun.run(MatrixRun.java:146) at hudson.model.ResourceController.execute(ResourceController.java:97) at hudson.model.Executor.run(Executor.java:429)

          Oleg Nenashev added a comment -

          megafon why have you reopened it? Could you please clarify the status?

          Oleg Nenashev added a comment - megafon why have you reopened it? Could you please clarify the status?

          Oleg Nenashev added a comment -

          Unfortunately I have no capacity to work on Remoting in medium term, so I will unassign it and let others to take it. If somebody is interested to submit a pull request, I will be happy to help to get it reviewed and released.

          Oleg Nenashev added a comment - Unfortunately I have no capacity to work on Remoting in medium term, so I will unassign it and let others to take it. If somebody is interested to submit a pull request, I will be happy to help to get it reviewed and released.

          Dell Green added a comment - - edited

           I am also getting this almost every day with the volume of jobs we have on the server. Either that or a job hangs forever and never completes.

          My slaves run in lxd containers, 

          jenkins 2.131,

           
          java.io.IOException: Backing channel '005-lxd-ubuntu_16.04-x86_64' is disconnected.
          at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
          at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
          at com.sun.proxy.$Proxy73.isAlive(Unknown Source)
          at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
          at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
          at hudson.Launcher$ProcStarter.join(Launcher.java:464)
          at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
          at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
          at hudson.model.Build$BuildExecution.build(Build.java:206)
          at hudson.model.Build$BuildExecution.doRun(Build.java:163)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
          at hudson.model.Run.execute(Run.java:1815)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:97)
          at hudson.model.Executor.run(Executor.java:429)
          Caused by: java.io.IOException: Unexpected termination of the channel
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
          Caused by: java.io.EOFException
          at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2679)
          at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3154)
          at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
          at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
          at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
          at hudson.remoting.Command.readFrom(Command.java:140)
          at hudson.remoting.Command.readFrom(Command.java:126)
          at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)

          Dell Green added a comment - - edited  I am also getting this almost every day with the volume of jobs we have on the server. Either that or a job hangs forever and never completes. My slaves run in lxd containers,  jenkins 2.131,   java.io.IOException: Backing channel '005-lxd-ubuntu_16.04-x86_64' is disconnected. at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) at com.sun.proxy.$Proxy73.isAlive(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129) at hudson.Launcher$ProcStarter.join(Launcher.java:464) at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333) at hudson.plugins.gradle.Gradle.perform(Gradle.java:225) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744) at hudson.model.Build$BuildExecution.build(Build.java:206) at hudson.model.Build$BuildExecution.doRun(Build.java:163) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) at hudson.model.Run.execute(Run.java:1815) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:97) at hudson.model.Executor.run(Executor.java:429) Caused by: java.io.IOException: Unexpected termination of the channel at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77) Caused by: java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2679) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3154) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358) at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48) at hudson.remoting.Command.readFrom(Command.java:140) at hudson.remoting.Command.readFrom(Command.java:126) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)

            Unassigned Unassigned
            kohsuke Kohsuke Kawaguchi
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated: