Jobs that start while the slave is performing some sort of cleanup action hang:
Build Log:
17:04:21 Started by upstream project "pipeline-test-2" build number 422
17:04:21 originally caused by:
17:04:21 Started by upstream project "master-test" build number 867
17:04:21 originally caused by:
17:04:21 Started by upstream project "master-build" build number 5131
17:04:21 originally caused by:
17:04:21 Started by an SCM change
17:04:21 [EnvInject] - Loading node environment variables.
... hang for 14h ...
Slave Log:
... lots of output ...
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-j2se.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/javassist.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system-jmx.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-mdr.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-logging-spi.jar (atime=1408400947, diff=-5424)
Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
After interrupting the job, I get this:
10:30:26 ERROR: SEVERE ERROR occurs
10:30:26 org.jenkinsci.lib.envinject.EnvInjectException: java.lang.InterruptedException
10:30:26     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:77)
10:30:26     at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
10:30:26     at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:575)
10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:481)
10:30:26     at hudson.model.Run.execute(Run.java:1689)
10:30:26     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
10:30:26     at hudson.model.ResourceController.execute(ResourceController.java:88)
10:30:26     at hudson.model.Executor.run(Executor.java:231)
10:30:26 Caused by: java.lang.InterruptedException
10:30:26     at java.lang.Object.wait(Native Method)
10:30:26     at hudson.remoting.Request.call(Request.java:146)
10:30:26     at hudson.remoting.Channel.call(Channel.java:722)
10:30:26     at hudson.FilePath.act(FilePath.java:1003)
10:30:26     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
10:30:26     ... 8 more
10:30:26 Archiving artifacts
10:30:26 ERROR: Publisher hudson.tasks.Mailer aborted due to exception
10:30:26 hudson.remoting.ChannelClosedException: channel is already closed
10:30:26     at hudson.remoting.Channel.send(Channel.java:524)
10:30:26     at hudson.remoting.Request.call(Request.java:129)
10:30:26     at hudson.remoting.Channel.call(Channel.java:722)
10:30:26     at hudson.EnvVars.getRemote(EnvVars.java:404)
10:30:26     at hudson.model.Computer.getEnvironment(Computer.java:911)
10:30:26     at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
10:30:26     at hudson.model.Run.getEnvironment(Run.java:2202)
10:30:26     at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:873)
10:30:26     at hudson.tasks.Mailer.perform(Mailer.java:134)
10:30:26     at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:714)
10:30:26     at hudson.model.Build$BuildExecution.post2(Build.java:182)
10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:663)
10:30:26     at hudson.model.Run.execute(Run.java:1714)
10:30:26     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
10:30:26     at hudson.model.ResourceController.execute(ResourceController.java:88)
10:30:26     at hudson.model.Executor.run(Executor.java:231)
10:30:26 Caused by: java.io.IOException
10:30:26     at hudson.remoting.Channel.close(Channel.java:1007)
10:30:26     at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
10:30:26     at hudson.remoting.PingThread.ping(PingThread.java:120)
10:30:26     at hudson.remoting.PingThread.run(PingThread.java:81)
10:30:26 Caused by: java.util.concurrent.TimeoutException: Ping started on 1409011309670 hasn't completed at 1409011549671
10:30:26     ... 2 more
10:30:26 [BFA] Scanning build for known causes...
10:30:26
10:30:26 [BFA] Done. 0s
10:30:26 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
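The two numbers in the TimeoutException message look like epoch milliseconds. The following is a minimal sketch for decoding them, assuming that reading is correct; the constant and function names are illustrative and not part of Jenkins.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the TimeoutException message in the build log.
# Assumption: both are milliseconds since the Unix epoch.
PING_STARTED_MS = 1409011309670
PING_GAVE_UP_MS = 1409011549671

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


elapsed_ms = PING_GAVE_UP_MS - PING_STARTED_MS
started_utc = from_epoch_ms(PING_STARTED_MS)
gave_up_utc = from_epoch_ms(PING_GAVE_UP_MS)

print("ping started (UTC):", started_utc.isoformat())
print("ping gave up (UTC):", gave_up_utc.isoformat())
print("elapsed:", elapsed_ms, "ms")
```

Under that assumption the ping started at 00:01:49 UTC on 2014-08-26 and was abandoned 240001 ms (about four minutes) later. If the log timestamps are in a UTC-7 time zone (the report does not state the zone, so this is a guess), that window is 17:01:49 to 17:05:49 on Aug 25, which overlaps both the cleanup entries at 5:01:12 PM and the job start at 17:04:21. That would suggest the channel was closed by the ping timeout around the time the job started, rather than at the abort.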
It appears that the slave crashes on the abort; the only workaround is to disconnect and reconnect the slave.