JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • Environment: jenkins-1.509.4 with remoting-2.36, ssh-slaves-1.2

      The issue appears on my custom build of the Jenkins core, but it seems it can be reproduced on the newest versions as well.

      We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which then closed the communication channel. However, the slave stays online and keeps taking jobs, while every remote action fails (see the logs below), so all scheduled builds fail with an error.

      The issue affects ssh-slaves only:

      • Linux SSH slaves are "online", but all jobs on them fail with the error shown below
      • Windows services have reconnected automatically...
      • Windows JNLP slaves have reconnected as well
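
      For context, the stack traces in the comments below all lead back to the same close path: hudson.remoting.PingThread.ping() times out, the master-side ChannelPinger's onDead() callback closes the channel, and from then on every Channel.call() fails with ChannelClosedException. The following is a minimal illustrative sketch of that chain, assuming simplified class shapes; it is not the actual remoting source, and any name not quoted from the traces is hypothetical:

          import java.io.IOException;
          import java.util.concurrent.TimeoutException;

          // Illustrative sketch only -- NOT the real hudson.remoting code. It mirrors the
          // call chain in the stack traces:
          //   PingThread.ping() -> (timeout) -> ChannelPinger$1.onDead() -> Channel.close()
          class PingDeathSketch {

              /** Stand-in for hudson.remoting.Channel (hypothetical, simplified). */
              interface Channel {
                  void close() throws IOException;
              }

              /** What effectively happens on the master when a ping does not complete in time. */
              static void onPingTimeout(Channel channel, long startedMs, long nowMs) {
                  System.err.println(new TimeoutException(
                          "Ping started on " + startedMs + " hasn't completed at " + nowMs));
                  try {
                      // onDead(): the master closes its side of the channel. Per this report,
                      // nothing in that path makes an SSH slave reconnect, so the node keeps
                      // showing as "online" while every later remote call fails with
                      // "hudson.remoting.ChannelClosedException: channel is already closed".
                      channel.close();
                  } catch (IOException ignored) {
                      // best-effort close
                  }
              }
          }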


          Olivier Lamy added a comment -

          This happened with a recent version, Jenkins 1.475-SNAPSHOT. Stack trace:

          Started by timer
          Building remotely on ubuntu1 in workspace /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16
          hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
          	at hudson.FilePath.act(FilePath.java:838)
          	at hudson.FilePath.act(FilePath.java:824)
          	at hudson.FilePath.mkdirs(FilePath.java:890)
          	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
          	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
          	at hudson.model.Run.execute(Run.java:1488)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:236)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:492)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:663)
          	at hudson.FilePath.act(FilePath.java:831)
          	... 10 more
          Caused by: java.io.IOException
          	at hudson.remoting.Channel.close(Channel.java:895)
          	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          	at hudson.remoting.PingThread.ping(PingThread.java:114)
          	at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
          	... 2 more
          Caused by: java.util.concurrent.TimeoutException
          	at hudson.remoting.Request$1.get(Request.java:275)
          	at hudson.remoting.Request$1.get(Request.java:210)
          	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
          	at hudson.remoting.PingThread.ping(PingThread.java:107)
          	... 1 more
          Retrying after 10 seconds
          hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
          	at hudson.FilePath.act(FilePath.java:838)
          	at hudson.FilePath.act(FilePath.java:824)
          	at hudson.FilePath.mkdirs(FilePath.java:890)
          	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
          	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
          	at hudson.model.Run.execute(Run.java:1488)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:236)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:492)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:663)
          	at hudson.FilePath.act(FilePath.java:831)
          	... 10 more
          Caused by: java.io.IOException
          	at hudson.remoting.Channel.close(Channel.java:895)
          	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          	at hudson.remoting.PingThread.ping(PingThread.java:114)
          	at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
          	... 2 more
          Caused by: java.util.concurrent.TimeoutException
          	at hudson.remoting.Request$1.get(Request.java:275)
          	at hudson.remoting.Request$1.get(Request.java:210)
          	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
          	at hudson.remoting.PingThread.ping(PingThread.java:107)
          	... 1 more
          Retrying after 10 seconds
          hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
          	at hudson.FilePath.act(FilePath.java:838)
          	at hudson.FilePath.act(FilePath.java:824)
          	at hudson.FilePath.mkdirs(FilePath.java:890)
          	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
          	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
          	at hudson.model.Run.execute(Run.java:1488)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:236)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:492)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:663)
          	at hudson.FilePath.act(FilePath.java:831)
          	... 10 more
          Caused by: java.io.IOException
          	at hudson.remoting.Channel.close(Channel.java:895)
          	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          	at hudson.remoting.PingThread.ping(PingThread.java:114)
          	at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
          	... 2 more
          Caused by: java.util.concurrent.TimeoutException
          	at hudson.remoting.Request$1.get(Request.java:275)
          	at hudson.remoting.Request$1.get(Request.java:210)
          	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
          	at hudson.remoting.PingThread.ping(PingThread.java:107)
          	... 1 more
          Recording test results
          ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception
          hudson.remoting.ChannelClosedException: channel is already closed
          	at hudson.remoting.Channel.send(Channel.java:492)
          	at hudson.remoting.Request.call(Request.java:129)
          	at hudson.remoting.Channel.call(Channel.java:663)
          	at hudson.EnvVars.getRemote(EnvVars.java:202)
          	at hudson.model.Computer.getEnvironment(Computer.java:843)
          	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
          	at hudson.model.Run.getEnvironment(Run.java:1938)
          	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:843)
          	at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:131)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:717)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:692)
          	at hudson.model.Build$BuildExecution.post2(Build.java:183)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:639)
          	at hudson.model.Run.execute(Run.java:1513)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:236)
          Caused by: java.io.IOException
          	at hudson.remoting.Channel.close(Channel.java:895)
          	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          	at hudson.remoting.PingThread.ping(PingThread.java:114)
          	at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
          	... 2 more
          Caused by: java.util.concurrent.TimeoutException
          	at hudson.remoting.Request$1.get(Request.java:275)
          	at hudson.remoting.Request$1.get(Request.java:210)
          	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
          	at hudson.remoting.PingThread.ping(PingThread.java:107)
          	... 1 more
          Finished: FAILURE
          

          I don't understand this:

          Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
          


          Richard Mortimer added a comment -

          The message looks OK to me. There is a 240-second difference between the two timestamps, which would be consistent with a ping timeout.
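
          As a quick sanity check of the two epoch-millisecond timestamps quoted in the message (a throwaway snippet, nothing Jenkins-specific; values copied from the exception text above):

              public class PingGap {
                  public static void main(String[] args) {
                      long started   = 1341557279424L;  // "Ping started on ..."
                      long notDoneAt = 1341557519424L;  // "... hasn't completed at ..."
                      System.out.println((notDoneAt - started) / 1000 + " s");  // prints "240 s", i.e. a 4-minute window
                  }
              }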

          However, I have a feeling that this is due to some additional changes that Kohsuke made when he merged my remoting pull-3. See

          https://github.com/jenkinsci/remoting/commit/aceaf5c038e8179723f9a83be6f994f37e7cfa46

          Previously the TimeoutException (ping failed) message would only have been observed if a TimeoutException was received. Now the message will be logged when the loop ends. Maybe there is a real timeout/delay here that was previously hidden.

          One other note: the while (remaining > 0) test gets evaluated after a timeout exception, but remaining is only recalculated on the next pass through the loop, so there will be an extra pass around the loop.
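
          For reference, a schematic paraphrase of the loop shape being described (the structure and names here are assumptions; this is not the actual PingThread source, only an illustration of why a stale remaining value allows one extra pass):

              import java.util.concurrent.Future;
              import java.util.concurrent.TimeUnit;
              import java.util.concurrent.TimeoutException;

              // Schematic paraphrase of the polling loop described above -- NOT the actual
              // hudson.remoting.PingThread source.
              class PingLoopSketch {
                  static void awaitPing(Future<?> pingReply, long timeoutMs, long pollMs) throws Exception {
                      long end = System.currentTimeMillis() + timeoutMs;
                      long remaining;
                      do {
                          remaining = end - System.currentTimeMillis();   // recalculated only here
                          try {
                              pingReply.get(Math.min(remaining, pollMs), TimeUnit.MILLISECONDS);
                              return;                                     // ping answered in time
                          } catch (TimeoutException e) {
                              // fall through: `remaining` is now stale, so the while() test
                              // below can still pass and trigger one extra iteration
                          }
                      } while (remaining > 0);
                      throw new TimeoutException("ping did not complete within " + timeoutMs + " ms");
                  }
              }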


          Richard Mortimer added a comment -

          Ah, I take some of that back. I didn't check the main Jenkins build closely enough. Jenkins is using remoting 1.16, and this does not include the latest changes made by Kohsuke/me.

          The PingThread logic in 1.475-SNAPSHOT is unchanged since Sept/2011


          Richard Mortimer added a comment -

          Similar symptoms at JENKINS-14307


          Alexander Likulin added a comment - edited

          The same problem happens on 1.477:

          hudson.util.IOException2: remote file operation failed: /home/slave/.jenkins/slave/workspace/4.0-SNAPSHOT-POSTGRESQL2 at hudson.remoting.Channel@7e053ab4:slave
          at hudson.FilePath.act(FilePath.java:838)
          at hudson.FilePath.act(FilePath.java:824)
          at hudson.FilePath.mkdirs(FilePath.java:890)
          at hudson.model.AbstractProject.checkout(AbstractProject.java:1254)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
          at hudson.model.Run.execute(Run.java:1502)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:236)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          at hudson.remoting.Channel.send(Channel.java:492)
          at hudson.remoting.Request.call(Request.java:129)
          at hudson.remoting.Channel.call(Channel.java:663)
          at hudson.FilePath.act(FilePath.java:831)
          ... 10 more
          Caused by: java.io.IOException
          at hudson.remoting.Channel.close(Channel.java:895)
          at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          at hudson.remoting.PingThread.ping(PingThread.java:114)
          at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1346241264458 hasn't completed at 1346241504458
          ... 2 more
          Caused by: java.util.concurrent.TimeoutException
          at hudson.remoting.Request$1.get(Request.java:275)
          at hudson.remoting.Request$1.get(Request.java:210)
          at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
          at hudson.remoting.PingThread.ping(PingThread.java:107)
          ... 1 more


          Julien Carsique added a comment - edited

          Maybe not a Jenkins issue.

          Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave:

          [05/06/13 14:06:04] Launching slave agent
          $ ssh slavedns java -jar ~/bin/slave.jar
          <===[JENKINS REMOTING CAPACITY]===<===[JENKINS REMOTING CAPACITY]===>>channel started
          channel started
          Slave.jar version: 2.22
          This is a Unix slave
          Slave.jar version: 2.22
          This is a Unix slave
          Copied maven-agent.jar
          Copied maven3-agent.jar
          Copied maven3-interceptor.jar
          Copied maven-agent.jar
          Copied maven-interceptor.jar
          Copied maven2.1-interceptor.jar
          Copied plexus-classworld.jar
          Copied maven3-agent.jar
          Copied maven3-interceptor.jar
          Copied classworlds.jar
          Copied maven-interceptor.jar
          Copied maven2.1-interceptor.jar
          Copied plexus-classworld.jar
          Copied classworlds.jar
          Evacuated stdout
          Evacuated stdout
          ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
          (...)java.lang.IllegalStateException: Already connected
          	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459)
          	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
          	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
          	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
          	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          	at java.lang.Thread.run(Thread.java:662)
          Connection terminated
          channel stopped
          ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
          (...)java.lang.NullPointerException
          	at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32)
          	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471)
          	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
          	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
          	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
          	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          	at java.lang.Thread.run(Thread.java:662)
          channel stopped
          Connection terminated

          Then the slave successfully reconnected itself.

          It appeared there was a looping thread consuming 100% CPU. Killing the process solved the issue.

          Strangely, some system and Java commands were not working (ps, cat, less, jstack, trace, ...) until it was killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the weird 4-minute Jenkins ping timeout log (java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682): the system was partially frozen, with really unstable response times.

          Note that the looping thread came from a previous job that was badly stopped on a timeout:

          03:00:16.868 Build timed out (after 180 minutes). Marking the build as aborted.
          03:00:16.873 Build was aborted
          03:00:16.874 Archiving artifacts
          03:00:16.874 ERROR: Failed to archive artifacts: **/log/*.log, tomcat*/nxserver/config/distribution.properties
          03:00:16.875 hudson.remoting.ChannelClosedException: channel is already closed
          03:00:16.876 	at hudson.remoting.Channel.send(Channel.java:494)
          03:00:16.876 	at hudson.remoting.Request.call(Request.java:129)
          03:00:16.876 	at hudson.remoting.Channel.call(Channel.java:672)
          03:00:16.876 	at hudson.EnvVars.getRemote(EnvVars.java:212)
          03:00:16.876 	at hudson.model.Computer.getEnvironment(Computer.java:882)
          03:00:16.876 	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
          03:00:16.876 	at hudson.model.Run.getEnvironment(Run.java:2028)
          03:00:16.876 	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:927)
          03:00:16.876 	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:115)
          03:00:16.876 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:798)
          03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:770)
          03:00:16.876 	at hudson.model.Build$BuildExecution.post2(Build.java:183)
          03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:720)
          03:00:16.876 	at hudson.model.Run.execute(Run.java:1600)
          03:00:16.876 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          03:00:16.876 	at hudson.model.ResourceController.execute(ResourceController.java:88)
          03:00:16.876 	at hudson.model.Executor.run(Executor.java:237)
          03:00:16.876 Caused by: java.io.IOException
          03:00:16.876 	at hudson.remoting.Channel.close(Channel.java:910)
          03:00:16.876 	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          03:00:16.876 	at hudson.remoting.PingThread.ping(PingThread.java:120)
          03:00:16.876 	at hudson.remoting.PingThread.run(PingThread.java:81)
          03:00:16.876 Caused by: java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682
          03:00:16.876 	... 2 more


          Oleg Nenashev added a comment - edited

          The ping monitoring timeout should be configurable, at least...
          The issue is being reproduced in 1.509.2.


          Nickolay Rumyantsev added a comment -

          Having a similar issue: the slave is reconnected during the archiving of artifacts, when Jenkins is under load.
          Jenkins 1.543.


          Marcus Klein added a comment -

          My jobs are randomly failing with version 1.550 with the following exception:

          FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
          at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
          at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
          at hudson.remoting.Request.call(Request.java:174)
          at hudson.remoting.Channel.call(Channel.java:722)
          at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
          at com.sun.proxy.$Proxy47.join(Unknown Source)
          at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
          at hudson.Launcher$ProcStarter.join(Launcher.java:360)
          at hudson.tasks.Ant.perform(Ant.java:217)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:785)
          at hudson.model.Build$BuildExecution.build(Build.java:199)
          at hudson.model.Build$BuildExecution.doRun(Build.java:160)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:566)
          at hudson.model.Run.execute(Run.java:1678)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:231)
          Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
          at hudson.remoting.Request.abort(Request.java:299)
          at hudson.remoting.Channel.terminate(Channel.java:782)
          at hudson.remoting.Channel$CloseCommand.execute(Channel.java:927)
          at hudson.remoting.Channel$2.handle(Channel.java:461)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
          ... 3 more
          Caused by: Command close created at
          at hudson.remoting.Command.<init>(Command.java:56)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:921)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:919)
          at hudson.remoting.Channel.close(Channel.java:1002)
          at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          at hudson.remoting.PingThread.ping(PingThread.java:120)
          at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
          ... 2 more

          I rolled back to version 1.546 because I think it did not happen with that version.


          Oleg Nenashev added a comment - edited

          The issue appears on remoting-2.36.
          Nodes are still alive after the issue.


          Oleg Nenashev added a comment - edited

          The issue appears on my custom build of the Jenkins core, but it seems it can be reproduced on the newest versions as well.

          We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which then closed the communication channel. However, the slave stays online and keeps taking jobs, while every remote action fails (see the log above), so all scheduled builds fail with an error.

          The issue affects ssh-slaves only:

          • Linux SSH slaves are "online", but all jobs on them fail with the error above
          • Windows services have reconnected automatically...
          • Windows JNLP slaves have reconnected as well

          I'm not sure whether there are any Linux specifics in this issue.
          On my installation, the default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side.


          Oleg Nenashev added a comment -

          Updated the issue's description.


          wbauer added a comment -

          Same issue here with Linux SSH slaves with Jenkins 1.564:

          14:33:30 Started by user wbauer
          14:33:30 [EnvInject] - Loading node environment variables.
          14:33:30 ERROR: SEVERE ERROR occurs
          14:33:30 org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
          14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:75)
          14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
          14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
          14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:570)
          14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:476)
          14:33:30 at hudson.model.Run.execute(Run.java:1706)
          14:33:30 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          14:33:30 at hudson.model.ResourceController.execute(ResourceController.java:88)
          14:33:30 at hudson.model.Executor.run(Executor.java:231)
          14:33:30 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          14:33:30 at hudson.remoting.Channel.send(Channel.java:541)
          14:33:30 at hudson.remoting.Request.call(Request.java:129)
          14:33:30 at hudson.remoting.Channel.call(Channel.java:739)
          14:33:30 at hudson.FilePath.act(FilePath.java:1009)
          14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
          14:33:30 ... 8 more
          14:33:30 Caused by: java.io.IOException
          14:33:30 at hudson.remoting.Channel.close(Channel.java:1027)
          14:33:30 at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          14:33:30 at hudson.remoting.PingThread.ping(PingThread.java:120)
          14:33:30 at hudson.remoting.PingThread.run(PingThread.java:81)
          14:33:30 Caused by: java.util.concurrent.TimeoutException: Ping started on 1401187729473 hasn't completed at 1401187969473
          14:33:30 ... 2 more
          14:33:30 Notifying upstream projects of job completion
          14:33:30 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
          14:33:33 Finished: FAILURE


          Beho Alheit added a comment - edited

          I have this bug popping up with Mac Mini OSX SSH slaves.

          Jenkins version is 1.564.

          Updated version to 1.574 and the issue is still present.

          It takes a reboot of both the OSX slave and the Jenkins instance itself to fix this issue.
          This only occurs with the nodes running OS X. After running for a couple of days and processing an arbitrary number of builds, the timeout takes place.

          Please let me know if there are more details I can submit.


          Adam added a comment -

          I am seeing similar issues on EC2 Linux.

          [EnvInject] - Loading node environment variables.
          FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
          	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
          	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
          	at hudson.remoting.Request.call(Request.java:174)
          	at hudson.remoting.Channel.call(Channel.java:739)
          	at hudson.FilePath.act(FilePath.java:1009)
          	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
          	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
          	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:581)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:487)
          	at hudson.model.Run.execute(Run.java:1706)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          	at hudson.model.ResourceController.execute(ResourceController.java:88)
          	at hudson.model.Executor.run(Executor.java:232)
          Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
          	at hudson.remoting.Request.abort(Request.java:299)
          	at hudson.remoting.Channel.terminate(Channel.java:802)
          	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:951)
          	at hudson.remoting.Channel$2.handle(Channel.java:475)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
          Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
          	... 3 more
          Caused by: Command close created at
          	at hudson.remoting.Command.<init>(Command.java:56)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:945)
          	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:943)
          	at hudson.remoting.Channel.close(Channel.java:1026)
          	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
          	at hudson.remoting.PingThread.ping(PingThread.java:120)
          	at hudson.remoting.PingThread.run(PingThread.java:81)
          Caused by: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
          	... 2 more
          


          Trevor Baker added a comment -

          This happens consistently when the Jenkins master is running Amazon Linux kernel 3.10.x. If you downgrade the kernel on the master to 3.4.x, it goes away. It does not seem to matter which kernel the slave runs.

          Every time I forget about this and "yum update" the kernel, it takes some head scratching before I remember to lock the master's kernel at 3.4.x in /etc/grub.conf.
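          A minimal sketch of one way to keep "yum update" from replacing the pinned kernel on the master (the exclude pattern and the versionlock plugin are standard yum mechanisms, but verify them against your distribution before relying on this):

          # Option 1: exclude kernel packages from all yum transactions
          echo "exclude=kernel*" | sudo tee -a /etc/yum.conf

          # Option 2: lock the currently installed kernel with the versionlock plugin
          sudo yum install -y yum-plugin-versionlock
          sudo yum versionlock "kernel-$(uname -r)"

          # A one-off update can also skip the kernel explicitly
          sudo yum update --exclude="kernel*"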


          Adam added a comment -

          Hey Trevor, any chance you have the steps for downgrading to the 3.4.x kernel handy? I didn't see that kernel in the main repo.

          Is anyone from the Jenkins team able to comment on whether they could help resolve this issue? I leave SSH sessions open overnight with no issues or drops, so I really can't understand why Jenkins keeps hitting these ping timeouts. Could we get a patch that disables the ping check entirely? I added "-Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0" to my startup on the master, but it didn't seem to help.

          I would appreciate any suggestion that avoids rolling back the kernel.

          Trevor, it would be interesting to see the output of "sysctl -a" before and after the rollback; I wonder if a kernel parameter changed that might have an impact.
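          For reference, a minimal sketch of how JVM system properties like the ones quoted above are typically passed to a packaged Jenkins master on Linux (the property names are copied from this comment; whether they actually disable the PingThread depends on the remoting version, so treat this as an untested sketch):

          # append the flags to JAVA_ARGS in /etc/default/jenkins (Debian/Ubuntu layout;
          # RPM-based installs use JENKINS_JAVA_OPTIONS in /etc/sysconfig/jenkins instead)
          echo 'JAVA_ARGS="$JAVA_ARGS -Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0"' | sudo tee -a /etc/default/jenkins

          # restart the master so the new flags take effect
          sudo service jenkins restart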


          Guillaume Boucherie added a comment -

          Hi,

          First, thank you very much for this tip.
          And yes, the current Amazon Linux AMI (ami-892fe1fe) can't be downgraded to 3.4.x.
          So I installed Ubuntu 12.04, which comes with a 3.2.x kernel (ami-00d12677), and everything works fine.
          I also tried Ubuntu 14.04 (3.13.x kernel) and it is not good.

          I have attached the result of "sysctl -a" for kernel 3.2 (sysctl_3.2.txt) and kernel 3.13 (sysctl_3.13.txt) in case it helps.
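          A quick way to see which kernel parameters actually differ between the two attached dumps (file names as attached above; requires bash for the process substitution):

          # full diff of the two sysctl dumps, keys sorted first
          diff <(sort sysctl_3.2.txt) <(sort sysctl_3.13.txt)

          # or narrow it down to a suspect subsystem, e.g. TCP settings
          diff <(grep '^net\.ipv4\.tcp' sysctl_3.2.txt | sort) \
               <(grep '^net\.ipv4\.tcp' sysctl_3.13.txt | sort)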

          Regards


          Trevor Baker added a comment -

          The instance I run Jenkins on was a 2013.09 image that was upgraded to 2014.03.x via yum updates. I presume you could add the 2013.09 yum repo into /etc/yum.repos.d/ and see the 3.4.x kernels. From experience, kernel 3.4.83-70.111 works with everything else from 2014.03.2 at the latest revisions.

          I've attached a kernel param dump.
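          A minimal sketch of how to find and install that older kernel once a repo that still carries it is configured (the version string is the one quoted above; repo setup itself is not shown, and the exact package name should be checked against what yum reports):

          # list every kernel build the configured repos can provide
          yum --showduplicates list kernel

          # install the specific 3.4 build mentioned above
          sudo yum install kernel-3.4.83-70.111.amzn1

          # then make it the default entry in /etc/grub.conf and reboot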


          Danny Verbeek added a comment - - edited

          I have the same kind of issue.

          Working combination:
          Jenkins version: 1.588
          Remoting version: 2.47
          Master JVM: 1.7.0_65
          Slave JVM: 1.6.0_33/1.7.0_65
          Linux kernel master: 3.13.0-37-generic
          Linux kernel slave: 3.2.0-69-virtual

          Not Working combination:
          Jenkins version: 1.588
          Remoting version: 2.47
          Master JVM: 1.7.0_65
          Slave JVM: 1.6.0_33/1.7.0_65 (tested both)
          Linux kernel master: 3.13.0-37-generic
          Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic (tested both)

          The slave hangs during POM parsing.
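          (For anyone hitting the same hang: full thread dumps like the ones below can usually be captured with standard JDK tooling on the slave; the PID placeholder and output path here are illustrative.)

          # find the slave and Maven JVMs on the agent
          pgrep -fl 'slave.jar'
          pgrep -fl 'maven3-agent.jar'

          # dump the threads of a given PID to a file (jstack ships with the JDK)
          jstack -l <pid> > /tmp/threads.txt

          # or send SIGQUIT so the dump goes to the process's stdout/log
          kill -3 <pid>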

          Thread dump from slave.jar:

          Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):
          
          "Attach Listener" daemon prio=10 tid=0x00007f7eb0001800 nid=0x645 runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x00007f7ec004f800 nid=0x611 runnable [0x00007f7ea4825000]
             java.lang.Thread.State: RUNNABLE
          	at java.net.SocketInputStream.socketRead0(Native Method)
          	at java.net.SocketInputStream.read(SocketInputStream.java:152)
          	at java.net.SocketInputStream.read(SocketInputStream.java:122)
          	at java.io.FilterInputStream.read(FilterInputStream.java:133)
          	at java.io.FilterInputStream.read(FilterInputStream.java:107)
          	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
          
          "java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x00007f7ec0046800 nid=0x607 runnable [0x00007f7ea4623000]
             java.lang.Thread.State: RUNNABLE
          	at java.io.FileInputStream.readBytes(Native Method)
          	at java.io.FileInputStream.read(FileInputStream.java:272)
          	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
          	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
          	- locked <0x00000000bbbc43f0> (a java.lang.UNIXProcess$ProcessPipeInputStream)
          	at java.io.FilterInputStream.read(FilterInputStream.java:107)
          	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
          
          "Proc.executor [#1]" daemon prio=10 tid=0x00007f7eb81e6000 nid=0x570 waiting on condition [0x00007f7ea4724000]
             java.lang.Thread.State: TIMED_WAITING (parking)
          	at sun.misc.Unsafe.park(Native Method)
          	- parking to wait for  <0x00000000cec5cef0> (a java.util.concurrent.SynchronousQueue$TransferStack)
          	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
          	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
          	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
          	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
          	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "process reaper" daemon prio=10 tid=0x00007f7eb81ec800 nid=0x56c runnable [0x00007f7ea485e000]
             java.lang.Thread.State: RUNNABLE
          	at java.lang.UNIXProcess.waitForProcessExit(Native Method)
          	at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
          	at java.lang.UNIXProcess$4.run(UNIXProcess.java:227)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "pool-1-thread-6" prio=10 tid=0x00007f7eac011800 nid=0x557 waiting on condition [0x00007f7ea5cba000]
             java.lang.Thread.State: TIMED_WAITING (parking)
          	at sun.misc.Unsafe.park(Native Method)
          	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
          	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
          	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
          	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
          	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
          	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "pool-1-thread-4" prio=10 tid=0x00007f7eac00e800 nid=0x555 waiting on condition [0x00007f7ea5ebc000]
             java.lang.Thread.State: TIMED_WAITING (parking)
          	at sun.misc.Unsafe.park(Native Method)
          	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
          	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
          	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
          	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
          	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
          	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "pool-1-thread-3" prio=10 tid=0x00007f7eac00d000 nid=0x554 waiting on condition [0x00007f7ec419b000]
             java.lang.Thread.State: TIMED_WAITING (parking)
          	at sun.misc.Unsafe.park(Native Method)
          	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
          	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
          	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
          	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
          	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
          	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7eb42cd800 nid=0x551 waiting on condition [0x00007f7ec5685000]
             java.lang.Thread.State: TIMED_WAITING (sleeping)
          	at java.lang.Thread.sleep(Native Method)
          	at hudson.remoting.PingThread.run(PingThread.java:91)
          
          "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7ecc1a7800 nid=0x54f waiting on condition [0x00007f7ec6eaf000]
             java.lang.Thread.State: TIMED_WAITING (sleeping)
          	at java.lang.Thread.sleep(Native Method)
          	at hudson.remoting.PingThread.run(PingThread.java:91)
          
          "Channel reader thread: channel" prio=10 tid=0x00007f7ecc19c800 nid=0x54e runnable [0x00007f7ec6fb0000]
             java.lang.Thread.State: RUNNABLE
          	at java.io.FileInputStream.readBytes(Native Method)
          	at java.io.FileInputStream.read(FileInputStream.java:272)
          	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
          	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
          	- locked <0x00000000ce80e768> (a java.io.BufferedInputStream)
          	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
          	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
          	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
          	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          
          "Service Thread" daemon prio=10 tid=0x00007f7ecc0cc800 nid=0x54c runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "C2 CompilerThread1" daemon prio=10 tid=0x00007f7ecc0ca000 nid=0x54b waiting on condition [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "C2 CompilerThread0" daemon prio=10 tid=0x00007f7ecc0c8000 nid=0x54a waiting on condition [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "Signal Dispatcher" daemon prio=10 tid=0x00007f7ecc0c5800 nid=0x549 runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "Finalizer" daemon prio=10 tid=0x00007f7ecc099000 nid=0x548 in Object.wait() [0x00007f7ec773f000]
             java.lang.Thread.State: WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock)
          	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
          	- locked <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock)
          	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
          	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
          
          "Reference Handler" daemon prio=10 tid=0x00007f7ecc097000 nid=0x547 in Object.wait() [0x00007f7ec7840000]
             java.lang.Thread.State: WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock)
          	at java.lang.Object.wait(Object.java:503)
          	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
          	- locked <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock)
          
          "main" prio=10 tid=0x00007f7ecc00a000 nid=0x545 in Object.wait() [0x00007f7ed24b4000]
             java.lang.Thread.State: TIMED_WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x00000000ce80c310> (a hudson.remoting.Channel)
          	at hudson.remoting.Channel.join(Channel.java:902)
          	- locked <0x00000000ce80c310> (a hudson.remoting.Channel)
          	at hudson.remoting.Launcher.main(Launcher.java:527)
          	at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:459)
          	at hudson.remoting.Launcher.run(Launcher.java:239)
          	at hudson.remoting.Launcher.main(Launcher.java:192)
          
          "VM Thread" prio=10 tid=0x00007f7ecc092800 nid=0x546 runnable
          
          "VM Periodic Task Thread" prio=10 tid=0x00007f7ecc0d7800 nid=0x54d waiting on condition
          
          JNI global references: 221
          
          

          Thread dump from maven.jar:

           
          Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):
          
          "Attach Listener" daemon prio=10 tid=0x00007f6308046000 nid=0x650 runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "pool-1-thread-1 for channel / waiting for hudson.remoting.Channel@2855e552:channel" prio=10 tid=0x00007f6304002000 nid=0x614 in Object.wait() [0x00007f62ff89a000]
             java.lang.Thread.State: TIMED_WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x0000000085272700> (a hudson.remoting.UserRequest)
          	at hudson.remoting.Request.call(Request.java:146)
          	- locked <0x0000000085272700> (a hudson.remoting.UserRequest)
          	at hudson.remoting.Channel.call(Channel.java:751)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:173)
          	at com.sun.proxy.$Proxy5.setExecutedMojos(Unknown Source)
          	at hudson.maven.MavenBuildProxy$Filter.setExecutedMojos(MavenBuildProxy.java:267)
          	at hudson.maven.Maven3Builder$MavenExecutionListener.recordMojoEnded(Maven3Builder.java:630)
          	at hudson.maven.Maven3Builder$MavenExecutionListener.mojoSucceeded(Maven3Builder.java:615)
          	at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:87)
          	at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:42)
          	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:228)
          	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
          	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
          	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
          	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
          	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
          	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
          	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
          	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
          	at org.jvnet.hudson.maven3.launcher.Maven3Launcher.main(Maven3Launcher.java:117)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:606)
          	at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330)
          	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238)
          	at org.jvnet.hudson.maven3.agent.Maven3Main.launch(Maven3Main.java:178)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:606)
          	at hudson.maven.Maven3Builder.call(Maven3Builder.java:136)
          	at hudson.maven.Maven3Builder.call(Maven3Builder.java:71)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:121)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:49)
          	at hudson.remoting.Request$2.run(Request.java:324)
          	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          	at java.lang.Thread.run(Thread.java:745)
          
          "Channel reader thread: channel" prio=10 tid=0x00007f63201ad000 nid=0x613 runnable [0x00007f62ff99d000]
             java.lang.Thread.State: RUNNABLE
          	at java.net.SocketInputStream.socketRead0(Native Method)
          	at java.net.SocketInputStream.read(SocketInputStream.java:152)
          	at java.net.SocketInputStream.read(SocketInputStream.java:122)
          	at java.io.FilterInputStream.read(FilterInputStream.java:133)
          	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
          	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
          	- locked <0x0000000098dc0720> (a java.io.BufferedInputStream)
          	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
          	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
          	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
          	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
          	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          
          "Service Thread" daemon prio=10 tid=0x00007f63200a5800 nid=0x60f runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "C2 CompilerThread1" daemon prio=10 tid=0x00007f63200a3000 nid=0x60e waiting on condition [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "C2 CompilerThread0" daemon prio=10 tid=0x00007f63200a1000 nid=0x60d waiting on condition [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "Signal Dispatcher" daemon prio=10 tid=0x00007f632009e800 nid=0x60c runnable [0x0000000000000000]
             java.lang.Thread.State: RUNNABLE
          
          "Finalizer" daemon prio=10 tid=0x00007f6320072000 nid=0x60b in Object.wait() [0x00007f6324886000]
             java.lang.Thread.State: WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock)
          	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
          	- locked <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock)
          	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
          	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
          
          "Reference Handler" daemon prio=10 tid=0x00007f6320070000 nid=0x60a in Object.wait() [0x00007f6324987000]
             java.lang.Thread.State: WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock)
          	at java.lang.Object.wait(Object.java:503)
          	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
          	- locked <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock)
          
          "main" prio=10 tid=0x00007f632000a000 nid=0x608 in Object.wait() [0x00007f632895c000]
             java.lang.Thread.State: TIMED_WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <0x0000000098dbe408> (a hudson.remoting.Channel)
          	at hudson.remoting.Channel.join(Channel.java:902)
          	- locked <0x0000000098dbe408> (a hudson.remoting.Channel)
          	at hudson.remoting.Launcher.main(Launcher.java:527)
          	at hudson.remoting.Launcher.main(Launcher.java:501)
          	at hudson.remoting.Launcher.main(Launcher.java:493)
          	at hudson.remoting.Launcher.main(Launcher.java:489)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:606)
          	at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:136)
          	at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:64)
          
          "VM Thread" prio=10 tid=0x00007f632006b800 nid=0x609 runnable
          
          "VM Periodic Task Thread" prio=10 tid=0x00007f63200b1000 nid=0x610 waiting on condition
          
          JNI global references: 180
          
          


          Poul Henriksen added a comment -

          I had the same issue.

          Downgrading the Jenkins master kernel from 3.14.19-17.43.amzn1.x86_64 to 3.4.62-53.42.amzn1.x86_64 solved it.


          Bert Jan Schrijver added a comment - - edited

          Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out.
          We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since.
          Slaves are still running 3.14 kernel (3.14.27-25.47.amzn1.x86_64).
          I'll report back later.


          Jonathan Langevin added a comment -

          I'm on Amazon EC2 with Ubuntu 14.04.1 LTS.

          I've downgraded the kernel (manually) to 3.4.

          wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
          wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105_3.4.105-0304105.201412012335_all.deb
          wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-image-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
          dpkg -i linux-*
          

          Once I had the 3.4 kernel installed, I had to make it the default kernel on boot, so I followed the instructions here: http://statusq.org/archives/2012/10/24/4584/
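          On Ubuntu, one common way to do that is to point GRUB_DEFAULT in /etc/default/grub at the 3.4 menu entry and regenerate the grub config; a rough sketch (the entry title below is illustrative and must match what /boot/grub/grub.cfg actually lists):

          sudo sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 3.4.105-0304105-generic"/' /etc/default/grub
          sudo update-grub
          sudo reboot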

          I'm trying out some builds now to see how Jenkins behaves...


          Bert Jan Schrijver added a comment -

          Downgrading the kernel on the master has definitely fixed it for us.
          Running for a week now without any trouble.


          Sean Abbott added a comment -

          I have the same issue; the kernel is 3.14. Part of the problem is that when the slave.jar process fails, the agent does NOT show as offline on the master, so the master keeps trying to send it jobs:

          Expanded the channel window size to 4MB
          [05/11/15 18:02:23] [SSH] Starting slave process: cd "/var/lib/jenkins" && java -Dfile.encoding=UTF8 -jar slave.jar

          <===[JENKINS REMOTING CAPACITY]===>ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
          java.lang.IllegalStateException: Already connected
          at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:448)
          at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:366)
          at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:945)
          at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:133)
          at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711)
          at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
          at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)
          [05/11/15 18:02:24] Launch failed - cleaning up connection
          [05/11/15 18:02:24] [SSH] Connection closed.

          Even though the log reports the connection closed, Jenkins still reports the agent as up.

          My Jenkins master is on 1.596.2.


          Sean Abbott added a comment -

          I was able to connect to the same slave from another Jenkins master using the same kernel and Jenkins version with no issues...


          Guillaume Boucherie added a comment -

          Hi,

          I just ran a test on the latest AWS Linux machine (kernel 3.14.35), with the same machine acting as both master and slave, and the problem is gone.
          The Jenkins version used is the latest stable: 1.609.1.

          Regards


          Jesse Glick added a comment -

          Related to JENKINS-1948 perhaps?


          Srikanth Vadlamani added a comment -

          In AWS, we are using Ubuntu 14.04.4 LTS, with EC2 plugin version 1.36. We are also seeing similar errors where the agent randomly disconnects from Jenkins with the error below.

           

          ERROR: SEVERE ERROR occurs
          org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
          at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:79)
          at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80)
          at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42)
          at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492)
          at hudson.model.Run.execute(Run.java:1741)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:98)
          at hudson.model.Executor.run(Executor.java:410)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          at hudson.remoting.Channel.send(Channel.java:578)
          at hudson.remoting.Request.call(Request.java:130)
          at hudson.remoting.Channel.call(Channel.java:780)
          at hudson.FilePath.act(FilePath.java:1102)
          at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48)
          ... 8 more
          Caused by: java.io.IOException
          at hudson.remoting.Channel.close(Channel.java:1163)
          at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:121)
          at hudson.remoting.PingThread.ping(PingThread.java:130)
          at hudson.remoting.PingThread.run(PingThread.java:86)
          Caused by: java.util.concurrent.TimeoutException: Ping started at 1493347954228 hasn't completed by 1493348194229
          
          

           


          Ivan Fernandez Calvo added a comment -

          Because there is no recent info here and it seems similar to JENKINS-53810, I will close it.


            Assignee: Ivan Fernandez Calvo (ifernandezcalvo)
            Reporter: Olivier Lamy (olamy)
            Votes: 33
            Watchers: 51