- Bug
- Resolution: Unresolved
- Major
- None
- Jenkins 2.60.3 LTS
  Durable Task Plugin 1.15
  Pipeline Supporting APIs 2.16
  Pipeline 2.5
My parallel Pipeline job runs primarily on Jenkins slave nodes. I came across a case where a parallel branch was assigned to a slave node that then disconnected from the Jenkins master because of an issue with our hosting provider. This hung the build until I manually stepped in. I noticed it only after all of the other branches had completed their work while one branch was still running on the disconnected slave. Even though the Jenkins master had many idle slave nodes, this branch kept waiting on the disconnected agent.
I restarted the instance and it registered with the Jenkins master again. Only after the slave node reconnected did the build fail. I was expecting one of the following three outcomes; instead, I had to manually step in to free the hung build.
1. The branch would have detected the disconnected slave node and run on another available one.
2. The branch would have failed immediately when the slave node disconnected, similar to a freestyle job.
3. The branch and build would have resumed successfully once the slave reconnected.
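As an untested sketch, outcome 1 might be approximated from the Pipeline script itself by bounding each attempt with the standard `timeout` step and requesting an executor again with `retry`. The retry count and the 5-minute limit are assumptions chosen for illustration, and the label is the same placeholder used elsewhere in this report. Whether `retry` re-runs a body that was ended by `timeout`, and whether the queue then picks a different slave, depends on the installed plugin versions; this has not been verified against the versions listed above.

```groovy
// Sketch only, not a confirmed fix for this issue.
timestamps {
    // Assumption: one extra attempt is enough to land on a healthy slave.
    retry(2) {
        node("JENKINS-SLAVE-LABEL") {
            // Assumption: 5 minutes is comfortably longer than the real work.
            // The timer is intended to end the attempt if the slave stops
            // responding, instead of leaving the branch waiting indefinitely.
            timeout(time: 5, unit: 'MINUTES') {
                sh 'echo "First task"'
                sh 'sleep 15s'
                sh 'echo "Last task"'
            }
        }
    }
}
```

Note that a retried attempt starts again from the first `sh` step in a fresh `node` block, so this only suits branches whose steps are safe to repeat.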
I was able to reproduce this issue using the Pipeline code below and disconnecting the slave during the "sleep 15s" step.
timestamps {
    node("JENKINS-SLAVE-LABEL") {
        sh 'echo "First task"'
        sh 'sleep 15s'
        sh 'echo "Last task"'
    }
}
Below are the build logs after disconnecting the slave during "sleep 15s" and reconnecting the slave again after about a minute.
[Pipeline] timestamps
[Pipeline] {
[Pipeline] node
23:27:05 Running on JENKINS-SLAVE-NODE-NAME-a (i-xxxxxxxxxxxxxxxxxxx) in /home/centos/workspace/JOBNAME
[Pipeline] {
[Pipeline] sh
23:27:13 [JOBNAME] Running shell script
23:27:14 + echo 'First task'
23:27:14 First task
[Pipeline] sh
23:27:14 [JOBNAME] Running shell script
23:27:15 + sleep 15s
23:27:25 Cannot contact JENKINS-SLAVE-NODE-NAME-a (i-xxxxxxxxxxxxxxxxxxx): java.io.IOException: remote file operation failed: /home/centos/workspace/JOBNAME at hudson.remoting.Channel@32fe452c:JENKINS-SLAVE-NODE-NAME-a (i-xxxxxxxxxxxxxxxxxxx): hudson.remoting.ChannelClosedException: channel is already closed
[Pipeline] sh
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] End of Pipeline
Command close created at
    at hudson.remoting.Command.<init>(Command.java:60)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1123)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1121)
    at hudson.remoting.Channel.close(Channel.java:1281)
    at hudson.remoting.Channel.close(Channel.java:1263)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128)
Caused: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129)
    at hudson.remoting.Channel$1.handle(Channel.java:527)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83)
Caused: hudson.remoting.ChannelClosedException: channel is already closed
    at hudson.remoting.Channel.send(Channel.java:605)
    at hudson.remoting.Request.call(Request.java:130)
    at hudson.remoting.Channel.call(Channel.java:829)
    at hudson.FilePath.act(FilePath.java:987)
    at hudson.FilePath.act(FilePath.java:976)
    at hudson.FilePath.mkdirs(FilePath.java:1159)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:113)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:167)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:161)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:90)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:64)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:177)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:224)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:150)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:108)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1218)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1027)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
    at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:133)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:153)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:157)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:127)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:127)
    at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
Caused: java.io.IOException: remote file operation failed: /home/centos/workspace/JOBNAME at hudson.remoting.Channel@32fe452c:JENKINS-SLAVE-NODE-NAME-a (i-xxxxxxxxxxxxxxxxxxx)
    at hudson.FilePath.act(FilePath.java:994)
    at hudson.FilePath.act(FilePath.java:976)
    at hudson.FilePath.mkdirs(FilePath.java:1159)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:113)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:167)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:161)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:90)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:64)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:177)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:224)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:150)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:108)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1218)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1027)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
    at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:133)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:153)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:157)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:127)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:127)
    at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
    at WorkflowScript.run(WorkflowScript:6)
    at ___cps.transform___(Native Method)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
    at com.cloudbees.groovy.cps.Next.step(Next.java:83)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
    at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
    at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
    at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
    at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:330)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:242)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:230)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
- duplicates
- JENKINS-41854 Contextualize a fresh FilePath after an agent reconnection
- Resolved
- JENKINS-49707 Auto retry for elastic agents after channel closure
- Resolved