
[JENKINS-48895] Channel closed exception after upgrade to Jenkins version 2.90

      Hi oleg_nenashev,

      We upgraded to Jenkins v2.90, but we are still facing a channel closed exception. Can you please check and provide a solution? This makes our CI environment unstable.

      Environment:

      Jenkins server: Linux machine

      Slave: Windows slave running Windows 10.

      Error Details

      Connection was broken: java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
          at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
          at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
          at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
          at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
          at hudson.remoting.Channel.close(Channel.java:1405)
          at hudson.remoting.Channel.close(Channel.java:1358)
          at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:737)
          at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
          at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:655)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:748)

          [JENKINS-48895] Channel closed exception after upgrade to Jenkins version 2.90

          Fabian Sörensson added a comment -

          +1 on this: We're having exactly the same problem. Same error, same environment. However, we're running Jenkins 2.101, but we had the same problem when running Jenkins 2.97.

          This is a major problem for us, since most of our longer Jenkins builds fail due to a connection timeout.


          Oleg Nenashev added a comment -

          From which version did you upgrade, BTW?
          And which Remoting version are you using on the agents?


          Fabian Sörensson added a comment - edited

          We are running Remoting version 3.15 on both our Windows slaves. However, only one of them seems to be behaving this way. While there are some differences between them, I am not really sure which of them would be relevant. If it is of interest, perhaps it is possible to dump the system properties in some way?

          I am not sure which version we were running before 2.97, but I would guess Jenkins v2.92, based on when we updated.
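
          (As an aside on the question above about dumping system properties: one illustrative way is to run a small Groovy snippet from the agent's Script Console page in the Jenkins UI. This is only a minimal sketch, assuming administrator access to that node; it is not something prescribed in this thread.)

              // Illustrative sketch: run in the agent's Script Console to print that JVM's system properties.
              // TreeMap is used only to sort the output alphabetically by key.
              new TreeMap(System.getProperties()).each { key, value ->
                  println "${key} = ${value}"
              }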


          Sam Beckwith III added a comment -

          In hopes of increasing visibility on this issue, I would like to share that I too have agents that close their connections for various reasons, sometimes unknown and sometimes known. For the unknown reasons I have no information, other than that these agents run on MS Windows of various versions, or run in Kubernetes using the official Docker image on stable Kubernetes hosts.

          The known reasons have to do with running agents, again in Kubernetes with the same image, but on preemptible hosts (hosts that can be taken down by the cloud platform whenever it needs the capacity). The agent is thus shut down in the middle of an operation, which causes the channel to close.

          My concern in this JIRA is the unknown reasons for the channel closing, with no information in the logs.

          We are using Jenkins master 2.107.3 and agents running jenkins/jnlp-slave:3.19-1, but this also occurred on older versions of both.


          Fabian Sörensson added a comment -

          I can mention that we are not having these problems anymore! I think in our case the problems were solved by something as simple as fiddling with the Windows sleep/hibernate options...


          Narendra Lankalapalli added a comment -

          We are also facing the same error. We are using Jenkins version 2.107.1 and have many builds running on this slave. The good thing is that it reconnects automatically after 15 minutes, but it is still causing a lot of problems for us.


          Jeff Thompson added a comment -

          Several of these issues involve similar reports but possibly very different causes. Frequently the error indicates that the channel is closed but provides no indication as to how or why that occurred. Commonly remoting issues involve something in the networking or system environment terminating the connection from outside the process. The trick can be to determine what is doing that. In one instance (JENKINS-52922), Nush Ahmd discovered that setting hudson.slaves.ChannelPinger.pingIntervalSeconds kept the channel from getting disconnected. Or as sfabian noted, fiddling with Windows sleep / hibernate options. Or various timeouts.
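
          For reference, a minimal sketch of setting that property on the master JVM. The direct `java -jar jenkins.war` launch and the 60-second value below are only illustrative; packaged installations normally pass JVM options through their service configuration instead.

              # Illustrative only: pass the ping interval as a JVM system property when starting the master.
              # 60 seconds is an example value, not a recommendation from this issue.
              java -Dhudson.slaves.ChannelPinger.pingIntervalSeconds=60 -jar jenkins.war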

          One thing that can help is to increase agent or master logging output. You can read about it here: https://github.com/jenkinsci/remoting/blob/master/docs/logging.md . In summary, you add a java.util.logging properties file and then reference it via the `-loggingConfig` parameter to the agent, for example: `-loggingConfig jenkins-logging.properties`.
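
          A minimal sketch of such a properties file, along the lines of the remoting logging docs. The file name, log file pattern, and FINE levels here are illustrative choices, not values taken from this issue.

              # jenkins-logging.properties (illustrative name): write remoting logs at FINE to a rotating file.
              handlers = java.util.logging.FileHandler
              java.util.logging.FileHandler.pattern = remoting.log.%g
              java.util.logging.FileHandler.limit = 10000000
              java.util.logging.FileHandler.count = 5
              java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
              .level = INFO
              org.jenkinsci.remoting.level = FINE
              hudson.remoting.level = FINE

          The agent is then launched with `-loggingConfig jenkins-logging.properties` appended to its usual command line, as described above.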

          Without further information it is difficult to diagnose anything from this side. Frequently the error is environmental.
           


          Jeff Thompson added a comment -

          Closing for lack of sufficient diagnostics and information to reproduce after no response for quite a while.


            People: Elton Alves (tonho), Vadivel Natarajan (vadivel)
            Votes: 3
            Watchers: 8
