JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Details

    Description

      The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well.

      We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which in turn closed the communication channel. However, the slave is still online and takes jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error.

      The issue affects ssh-slaves only:

      • Linux SSH slaves are "online", but all jobs on them fail with the error above
      • Windows services have reconnected automatically...
      • Windows JNLP slaves have reconnected as well
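
The failure mode described above (a ping that misses its deadline closes the channel, and every later remote call fails) can be illustrated with a minimal sketch. This is not the Jenkins remoting code: `FakeChannel` and `pingOnce` are hypothetical names invented for the illustration, and the 50 ms timeout is chosen only to keep the demo fast.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PingTimeoutSketch {

    /** Hypothetical stand-in for a remoting channel; not the real Jenkins class. */
    static class FakeChannel {
        private volatile boolean closed;

        void close() { closed = true; }

        boolean isClosed() { return closed; }

        /** Any remote action fails once the channel has been closed. */
        String call(String request) {
            if (closed) {
                throw new IllegalStateException("channel is already closed");
            }
            return "ok: " + request;
        }
    }

    /**
     * Runs a single ping. If the ping does not finish within the timeout,
     * the channel is closed and false is returned.
     */
    static boolean pingOnce(FakeChannel channel, Callable<Void> ping, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Void> result = executor.submit(ping);
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            channel.close(); // the ping missed its deadline
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow(); // interrupt the ping task if it is still running
        }
    }

    public static void main(String[] args) {
        FakeChannel channel = new FakeChannel();
        // Simulate an overloaded network: the ping takes far longer than the timeout.
        boolean alive = pingOnce(channel, () -> { Thread.sleep(500); return null; }, 50);
        System.out.println("ping ok: " + alive + ", channel closed: " + channel.isClosed());
        try {
            channel.call("launch build");
        } catch (IllegalStateException e) {
            System.out.println("remote action failed: " + e.getMessage());
        }
    }
}
```

In this sketch nothing besides the channel itself records the failure, which mirrors the report: a node that is still treated as online keeps receiving work that can no longer be executed remotely.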

          Activity

            seanabbott Sean Abbott added a comment -

            I was able to connect to the same slave from another jenkins master using the same kernel and jenkins version with no issues...

            gboucherie Guillaume Boucherie added a comment -

            Hi,

            I just ran a test on the latest AWS Linux machine (kernel: 3.14.35), with the same machine on both master and slave.
            The problem is gone.
            The Jenkins version used is the latest stable: 1.609.1

            Regards

            jglick Jesse Glick added a comment -

            Related to JENKINS-1948 perhaps?

            srivadlamani Srikanth Vadlamani added a comment -

            In AWS, we are using Ubuntu 14.04.4 LTS. EC2-plugin version is 1.36. We are also seeing similar errors where the agent would disconnect from Jenkins randomly with the error below.

            ERROR: SEVERE ERROR occurs
            org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
                at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:79)
                at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80)
                at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42)
                at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572)
                at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492)
                at hudson.model.Run.execute(Run.java:1741)
                at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                at hudson.model.ResourceController.execute(ResourceController.java:98)
                at hudson.model.Executor.run(Executor.java:410)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
                at hudson.remoting.Channel.send(Channel.java:578)
                at hudson.remoting.Request.call(Request.java:130)
                at hudson.remoting.Channel.call(Channel.java:780)
                at hudson.FilePath.act(FilePath.java:1102)
                at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48)
                ... 8 more
            Caused by: java.io.IOException
                at hudson.remoting.Channel.close(Channel.java:1163)
                at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:121)
                at hudson.remoting.PingThread.ping(PingThread.java:130)
                at hudson.remoting.PingThread.run(PingThread.java:86)
            Caused by: java.util.concurrent.TimeoutException: Ping started at 1493347954228 hasn't completed by 1493348194229
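
The two numbers in the final TimeoutException can be decoded with a short sketch, assuming they are epoch timestamps in milliseconds (that unit is an assumption, though it matches their magnitude). The class name `PingTimestamps` and the helper `elapsed` are made up for this example.

```java
import java.time.Duration;
import java.time.Instant;

public class PingTimestamps {

    /** Time between the ping start and the deadline, both given as epoch milliseconds. */
    static Duration elapsed(long startedMillis, long deadlineMillis) {
        return Duration.between(Instant.ofEpochMilli(startedMillis),
                                Instant.ofEpochMilli(deadlineMillis));
    }

    public static void main(String[] args) {
        // Values copied from the TimeoutException message in the stack trace.
        long started = 1493347954228L;
        long deadline = 1493348194229L;

        System.out.println("ping started (UTC): " + Instant.ofEpochMilli(started));
        System.out.println("waited: " + elapsed(started, deadline).toMillis() + " ms");
    }
}
```

Under that assumption the ping had been outstanding for 240001 ms, roughly four minutes, before the channel was closed.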

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            Because there is no recent info here and this seems similar to JENKINS-53810, I will close it.

            People

              ifernandezcalvo Ivan Fernandez Calvo
              olamy Olivier Lamy
              Votes: 33
              Watchers: 51