
Pipeline task scheduled on uninitialized node

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None

      One of our nodes went offline due to a channel disconnect. A pipeline job was running at the time, and as soon as the node reconnected, Jenkins tried to schedule a task on it, which failed with the error:

       

       

      ERROR: Issue with creating launcher for agent <nodename>. The agent has not been fully initialized yet

      ERROR: Issue with creating launcher for agent <nodename>. The agent has not been fully initialized yet

       

       

      I have attached the console output from the agent.

        1. jenkins.master.trimmed.log
          25 kB
        2. slave.log.1
          8 kB
        3. 2xlarge-738_node.log
          78 kB
        4. catalina_738.log
          65 kB
        5. agentconsole.log
          7 kB

          [JENKINS-46067] Pipeline task scheduled on uninitialized node

          Oleg Nenashev added a comment -

          It will likely be fixed by JENKINS-45023. Or, at least, you will get a failure earlier.


          Oleg Nenashev added a comment -

          benh57 Please retest the issue on Remoting 3.14


          John Lengeling added a comment - - edited

          We upgraded to Jenkins LTS 2.89.1, which is supposed to include Remoting 3.14, and we are seeing some builds fail with the same error message. These are elastic OpenStack build nodes. We are also using a Slave Setup "Setup Script After Copy" script, if that matters.
           
          ERROR: Issue with creating launcher for agent tph-build-c1.2xlarge-8428. The agent has not been fully initialized yet
          ERROR: Issue with creating launcher for agent tph-build-c1.2xlarge-8428. The agent has not been fully initialized yet
          remote file operation failed: /opt/jenkins/workspace/fh_guide_master-DAUIRXWDCG4RH3OVIMVYS3W2QDGGAIJIWFEPCAIQEFWE7YG5KZDQ at hudson.remoting.Channel@681ae622:tph-build-c1.2xlarge-8428: hudson.remoting.ChannelClosedException: Remote call on tph-build-c1.2xlarge-8428 failed. The channel is closing down or has closed down
           


          Oleg Nenashev added a comment -

          johnlengeling would you be able to attach...

          1) Jenkins master logs
          2) Agent logs
          3) OpenStack VM provisioning/termination logs
          4) Agent configuration and the OpenStack plugin versions (if any)


          John Lengeling added a comment -

          Oleg,

          We are running Jenkins 2.89.1 with OpenStack Plugin 2.29. We are running a declarative pipeline job. We have a parallel section which runs 7 different build steps on 7 different nodes that are spun up elastically on OpenStack Ocata. We see this failure intermittently.

          We see the following error in the console output:

          [beautify none-noarch] Still waiting to schedule task
          [beautify none-noarch] All nodes of label ‘openstack’ are offline
          [beautify none-noarch] Running on team-ph-build-c1.2xlarge-738 in /opt/jenkins/workspace/h_foo_master-DAUIRXWDCG4RH3OVIMVYS3W2QDGGAIJIWFEPCAIQEFWE7YG5KZDQ
          [Pipeline] [beautify none-noarch] {
          [Pipeline] [beautify none-noarch] checkout
          [beautify none-noarch] ERROR: Issue with creating launcher for agent team-ph-build-c1.2xlarge-738. The agent has not been fully initialized yet
          [beautify none-noarch] ERROR: Issue with creating launcher for agent team-ph-build-c1.2xlarge-738. The agent has not been fully initialized yet

           

          And the following stack trace at the end of the console log:

           

          java.io.EOFException
           at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2675)
           at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3150)
           at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:859)
           at java.io.ObjectInputStream.<init>(ObjectInputStream.java:355)
           at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
           at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
           at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
          Caused: java.io.IOException: Unexpected termination of the channel
           at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
          Caused: hudson.remoting.ChannelClosedException: Remote call on team-ph-build-c1.2xlarge-738 failed. The channel is closing down or has closed down
           at hudson.remoting.Channel.call(Channel.java:901)
           at hudson.FilePath.act(FilePath.java:986)
          Caused: java.io.IOException: remote file operation failed: /opt/jenkins/workspace/h_foo_master-DAUIRXWDCG4RH3OVIMVYS3W2QDGGAIJIWFEPCAIQEFWE7YG5KZDQ at hudson.remoting.Channel@2d334044:team-ph-build-c1.2xlarge-738
           at hudson.FilePath.act(FilePath.java:993)
           at hudson.FilePath.act(FilePath.java:975)
           at hudson.FilePath.mkdirs(FilePath.java:1158)
           at hudson.plugins.git.GitSCM.createClient(GitSCM.java:747)
           at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1117)
           at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:113)
           at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:85)
           at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:75)
           at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
           at hudson.security.ACL.impersonate(ACL.java:260)
           at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
          Finished: FAILURE

           

          Attached is an excerpt of the catalina.out and also the node log showing the Agent Setup script running.


          Andreas Tscharner added a comment -

          We have the same problem on Jenkins 2.60.3; the host system is Debian 9 with KVM/QEMU and the libvirt plugin. The OS on the VMs is Windows 7 with the Jenkins service installed. We have configured the VMs to shut down after each job and to revert to a certain snapshot when shutting down.

          We have noticed that the machines do not shut down every time, especially under heavy load (lots of jobs in the queue) or if the job failed during compilation. And when they do not shut down, this error appears.

          Steve Ames added a comment -

          As a workaround, does anyone know how to set up the retry so that it will try a different node? I can set up some retries around functions executing on the node (e.g.)

          jenkinsObject.retry(3)
          {
             jenkinsObject.doTheThing() // Do a clever thing on this node
          }

          However I'm not sure how to set up a retry around a failed node. Infrastructure fails sometimes. Fact of life. I know there's gotta be a way to compensate for that so that my Jenkins jobs are a little more robust, but I'm not seeing it...
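
          One possible workaround pattern is to put the retry around the whole node block rather than around the steps inside it, so that each attempt goes back through the queue and requests an executor again. This is only a minimal sketch in scripted Pipeline (not taken from this ticket); the label, the checkout, and the build step are placeholders:

          // Hypothetical sketch: retry the node allocation itself, not only the steps inside it.
          retry(3) {
              node('my-label') {      // 'my-label' is a placeholder agent label
                  checkout scm        // assumes a Pipeline-from-SCM job; the first remote call is where the channel error tends to surface
                  sh 'make test'      // placeholder build step
              }
          }

          Nothing forces Jenkins to pick a different agent on the next attempt, but if the broken agent is still offline the queue will usually schedule the retry elsewhere.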


          Federico Naum added a comment -

          Somehow this is happening more often in our instance (Jenkins 2.116 with the latest Pipeline plugins). Note that we do not have any OpenStack plugin installed.

          I have attached the master and slave logs:

          jenkins.master.trimmed.log 

          slave.log.1

          At this stage, I'm with sames: I'm happy with any workaround.


          AJ Ferrigno added a comment -

          We just got this exact error for the first time in one of our pipeline jobs running many things in parallel. It stinks because there is no recovery whatsoever, and even the "post { always {}}" block did not execute. This worries me because it means there is potential for jobs not cleaning up after themselves. 

          I am trying to determine whether this is supposed to be fixed, but I can't tell. Can't the underlying code just wait a couple seconds for the slave to become fully available, rather than crash the whole job? The referenced ticket (45023) seems to somewhat address that, but it seems to not take effect in my installation.

          We are using Jenkins version 2.130 with relatively recent pipeline plugins (pipeline:api is v2.28, pipeline:job is v2.22, pipeline:multibranch is v2.19, pipeline:utility is v2.1.0). Nothing was upgraded recently, and this same job has worked many times in the past.

          One possible solution I can see is that our Jenkins nodes are too overloaded, causing them to shut down in the first place. So maybe we should add more nodes?


          Jeff Thompson added a comment -

          ajferrigno, it would be wise to make sure none of the nodes are overloaded. All sorts of unusual symptoms can occur when systems are overloaded. An overloaded system can cause a node or a connection to fail. Perhaps something needs to be more robust to these conditions, but reducing the load is a good step.


          Jeff Thompson added a comment -

          It sounds like there are possibly a few different, though interacting problems going on here.

          Agents / nodes fail or otherwise lose their connection at times. I don't see enough information here to suggest any causes, other than system issues such as overload or other lack of resources.

          Some portions of the system could be more resilient in the face of such failures. Pipeline, for one example, could better handle these failures or provide more tools for handling them.

          There may be some issue with sequencing of nodes starting up and other things interacting with them, such as pipeline. Again, there isn't sufficient information to determine what the problematic sequencing might be.

          Possibly other issues, also.


          Federico Naum added a comment -

          Just to mention that in the past I did monitor the CPU, memory, I/O, and network on the nodes to check whether they were overloaded, and found that this issue happens even when there is almost no load on the node.


          H L added a comment -

          Noticed this issue started occurring after upgrading to 2.138.1 and the latest plugins a couple weeks ago. Reliability has gone down significantly after the update. Voting for this issue.

           


          Jesse Glick added a comment -

          johnlengeling’s issue sounds like a garden-variety Remoting channel outage, which could have any of a host of root causes. Other reports sound a bit like JENKINS-41854, though there is not enough detail to tell. I was not familiar with that particular error message; from full-text search I can find this (plus an apparently copied usage in a much more obscure plugin).


          Jesse Glick added a comment -

          If reporter/commenters are running the Docker Slaves plugin, that may be to blame. I do not think this plugin is maintained. CC ndeloof


          Jan Bottek added a comment -

          We also saw this kind of issue with the AWS EC2 plugin when creating a new EC2 instance/slave on the fly. Most of the time we get this error on Windows machines where the slave is started via WinRM; on Unix machines where the slave is started via SSH we never saw this issue.


          Russell Gallop added a comment -

          We saw this but are not using the Docker Slaves or AWS EC2 plugin. We are using Windows machines connected via JNLP.

          Jan Bottek added a comment -

          rg How did you start the slave? Every time we saw this issue it was with automatically and newly created slaves, never with slaves we create manually and start via Java Web Start or via cmd. The slaves newly created via the Swarm Plugin also didn't cause this issue; only the slaves created on the fly by the EC2 plugin did.


          Russell Gallop added a comment -

          I don't know for sure whether it was Java Web Start or some other way, but it was definitely a permanent node. Is it possible that this happens more with automatically created agents, as they are added/removed more frequently? It's not something we see often, so we could have just been unlucky!

          Jesse Glick added a comment -

          Never mind, I missed a usage of this message pattern in Jenkins core in Slave.reportLauncherCreateError.


          Jesse Glick added a comment -

          Tracking to oleg_nenashev’s changes for JENKINS-38527, I found the older JENKINS-23305 which sounds like it is tracking a similar problem. I have no clue as to the root cause here. It should go without saying that if anyone knows how to reproduce the error from scratch they should speak up. Otherwise I am not sure whether there is any way to proceed.


          Jesse Glick added a comment -

          The stack traces of the form

          java.lang.IllegalStateException: No remoting channel to the agent OR it has not been fully initialized yet
          	at hudson.model.Slave.reportLauncherCreateError(Slave.java:524)
          	at hudson.model.Slave.createLauncher(Slave.java:496)
          	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.makeLauncher(DefaultStepContext.java:112)
          	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:68)
          	at org.jenkinsci.plugins.workflow.steps.StepDescriptor.checkContextAvailability(StepDescriptor.java:258)
          	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:206)
          	at …
          

          suggest that the problem arises when starting a sh, checkout, or similar step inside a node block. That would imply that this is a duplicate of JENKINS-41854. No idea why the frequency of that issue might have changed recently.
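
          For reference, a minimal scripted Pipeline of the shape implied by those traces might look like the sketch below (the label and commands are placeholders, not taken from any report in this ticket):

          node('some-agent') {    // executor allocation succeeds and the body starts
              checkout scm        // first step that needs a launcher/channel; this is where the error is reported
              sh 'true'           // any durable-task step would hit the same code path
          }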


          Caley Goff added a comment - - edited

          I am seeing this issue consistently when trying to use the post.aborted.container['slave'] below.

                    post {
                      aborted { 
                          echo "testaborted"
                          container('slave') {
                             sh "ls"
                          }
                      }
                      failure {
                          echo "testfail"
                          container('slave') {
                             sh "ls" 
                          }
                      }
                  } 

          What is interesting is I can use  the post.failure.container['slave'] above all day long and I can spin up a container and exit cleanly. 

           

          Jenkins 2.147 Kubernetes Plugin 1.13.6

           

          [Pipeline] withCredentials
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          [Pipeline] {
          [Pipeline] sh
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          [Pipeline] }
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          ERROR: Issue with creating launcher for agent multi-image-1l98r-vc3rk. The agent has not been fully initialized yet
          [Pipeline] // withCredentials
          [Pipeline] }
          [Pipeline] // container
          [Pipeline] }
          [Pipeline] // waitUntil
          [Pipeline] }
          [Pipeline] // timeout
          Error when executing aborted post condition:
          hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from 10.255.1.162/10.255.1.162:11312 failed. The channel is closing down or has closed down
          	at hudson.remoting.Channel.call(Channel.java:948)
          	at hudson.FilePath.act(FilePath.java:1070)
          	at hudson.FilePath.act(FilePath.java:1059)
          	at hudson.FilePath.mkdirs(FilePath.java:1244)
          	at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:171)
          	at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:197)
          	at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:189)
          	at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:110)
          	at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:98)
          	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:264)
          	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:270)
          	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:178)
          	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
          	at sun.reflect.GeneratedMethodAccessor969.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
          	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
          	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
          	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
          	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:157)
          	at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
          	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:155)
          	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:159)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
          	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
          	at WorkflowScript.run(WorkflowScript:198)
          	at ___cps.transform___(Native Method)
          	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
          	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
          	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
          	at sun.reflect.GeneratedMethodAccessor516.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
          	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
          	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
          	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
          	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
          	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
          	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
          	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
          	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$101(SandboxContinuable.java:34)
          	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.lambda$run0$0(SandboxContinuable.java:59)
          	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
          	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:58)
          	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:182)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:332)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:83)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:244)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:232)
          	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          	at java.lang.Thread.run(Thread.java:748)
          Caused by: java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:142)
          	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          	... 3 more 


          Ben Hines added a comment -

          We are relatively up to date now, and hit this again when restarting our agents: Jenkins 2.154 / Remoting 3.27. The node had just connected via JNLP.

          For us I think it would be pretty easy to reproduce if we made a task that continually restarts agents while a job is scheduled via pipeline to run on them. 
          Cannot contact <nodename>: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from <node>/<ip>:56650 failed. The channel is closing down or has closed down
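
          Along the lines of that reproducer idea, a hypothetical script-console sketch (an assumption, not attached to this ticket; the agent name, loop count, and delay are placeholders) that repeatedly bounces one agent's channel while a pipeline job targeting it sits in the queue:

          import jenkins.model.Jenkins
          import hudson.slaves.OfflineCause

          def computer = Jenkins.get().getComputer('my-agent')   // placeholder agent name
          10.times {
              computer.disconnect(new OfflineCause.ByCLI('channel-bounce test')).get()  // wait until the channel is down
              computer.connect(true).get()                                              // force a reconnect and wait for it
              Thread.sleep(5000)                                                        // give queued builds a chance to schedule
          }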


          Chris N. added a comment - - edited

          I encounter a similar issue on a test pipeline doing 13 test stages with only a clean workspace and a NAS mount. Something like this:

                          stage('test_disconnexion') {
                              agent {
                                  node {
                                      label 'srdspicpops01||srdspicpops03'
                                  }
                              }                  
                              steps {
                                  script {
                                      try {
                                          //sleep 90 // Test with a wait before start
                                          echo "Running on node: ${env.NODE_NAME} with workspace: ${env.WORKSPACE}"
                                          cleanWs()
                                          batFunctionToMountCIFSVolume(){
                                              echo "DEBUG test_disconnexion"
                                          }
                                      } catch (exc) {
                                          def sw = new StringWriter()
                                          def pw = new PrintWriter(sw)
                                          exc.printStackTrace(pw)
                                          echo sw.toString()
                                          unstable(message: "${STAGE_NAME} is unstable")
                                          stage_failed["${STAGE_NAME}"] = true
                                      }
                                  }
                              }
                              post {
                                  always {
                                      revertFunction([node: "${env.NODE_NAME}"])
                                  }
                              }
                          } 

          I was running this test on a Linux Jenkins master (2.401.3) and two Windows JNLP WebSocket slaves (with the same OpenJDK version as the master and the correct agent.jar file).

          As I didn't find any reason why this happens, I moved to a Windows master (2.414.1) and ran the same slave against this Windows master... Same behaviour.

          Disconnection happens only 1.2% of the time on average, so this is hard to reproduce.

          Last log from the Linux master; error when executing cleanWs():

          [2023-09-12T09:33:28.611Z] hudson.remoting.RequestAbortedException: java.nio.channels.ClosedChannelException
          [2023-09-12T09:33:28.612Z] 	at hudson.remoting.Request.abort(Request.java:346)
          [2023-09-12T09:33:28.612Z] 	at hudson.remoting.Channel.terminate(Channel.java:1080)
          [2023-09-12T09:33:28.612Z] 	at hudson.remoting.Channel$1.terminate(Channel.java:620)
          [2023-09-12T09:33:28.612Z] 	at hudson.remoting.AbstractByteBufferCommandTransport.terminate(AbstractByteBufferCommandTransport.java:356)
          [2023-09-12T09:33:28.612Z] 	at jenkins.agents.WebSocketAgents$Session.closed(WebSocketAgents.java:158)
          [2023-09-12T09:33:28.612Z] 	at jenkins.websocket.WebSockets$1.onWebSocketClose(WebSockets.java:88)
          [2023-09-12T09:33:28.612Z] 	at jenkins.websocket.WebSockets$1.onWebSocketError(WebSockets.java:94)
          [2023-09-12T09:33:28.612Z] 	at jenkins.websocket.Jetty10Provider$2.onWebSocketError(Jetty10Provider.java:174)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onError(JettyWebSocketFrameHandler.java:260)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.lambda$closeConnection$2(WebSocketCoreSession.java:284)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1468)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1487)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.server.internal.AbstractHandshaker$1.handle(AbstractHandshaker.java:212)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.closeConnection(WebSocketCoreSession.java:284)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.onEof(WebSocketCoreSession.java:254)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:482)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:340)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:934)
          [2023-09-12T09:33:28.612Z] 	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1078)
          [2023-09-12T09:33:28.612Z] 	at java.base/java.lang.Thread.run(Thread.java:829)
          [2023-09-12T09:33:28.612Z] 	Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to srdspicpops03
          [2023-09-12T09:33:28.612Z] 		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1784)
          [2023-09-12T09:33:28.612Z] 		at hudson.remoting.Request.call(Request.java:199)
          [2023-09-12T09:33:28.612Z] 		at hudson.remoting.Channel.call(Channel.java:999)
          [2023-09-12T09:33:28.612Z] 		at hudson.FilePath.act(FilePath.java:1192)
          [2023-09-12T09:33:28.612Z] 		at hudson.FilePath.act(FilePath.java:1181)
          [2023-09-12T09:33:28.612Z] 		at hudson.FilePath.mkdirs(FilePath.java:1372)
          [2023-09-12T09:33:28.612Z] 		at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:97)
          [2023-09-12T09:33:28.612Z] 		at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:71)
          [2023-09-12T09:33:28.612Z] 		at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
          [2023-09-12T09:33:28.612Z] 		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
          [2023-09-12T09:33:28.612Z] 		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
          [2023-09-12T09:33:28.612Z] 		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          [2023-09-12T09:33:28.612Z] 		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          [2023-09-12T09:33:28.612Z] 		... 1 more
          [2023-09-12T09:33:28.612Z] 	Suppressed: org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: e4a5a559-6c82-43f5-b66a-3056dd450409
          [2023-09-12T09:33:28.612Z] Caused by: java.nio.channels.ClosedChannelException
          [2023-09-12T09:33:28.612Z] 	at jenkins.agents.WebSocketAgents$Session.closed(WebSocketAgents.java:157)
          [2023-09-12T09:33:28.612Z] 	... 23 more

          Last log from the Windows master; error when executing cleanWs():

          14:10:55  ERROR: Issue with creating launcher for agent srdspicpops01. The agent is being disconnected
          14:10:55  [Pipeline] echo
          14:10:55  hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@533ae8:srdspicpops01": Remote call on srdspicpops01 failed. The channel is closing down or has closed down
          14:10:55  	at hudson.remoting.Channel.call(Channel.java:993)
          14:10:55  	at hudson.FilePath.act(FilePath.java:1192)
          14:10:55  	at hudson.FilePath.act(FilePath.java:1181)
          14:10:55  	at hudson.FilePath.mkdirs(FilePath.java:1372)
          14:10:55  	at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:97)
          14:10:55  	at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:71)
          14:10:55  	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
          14:10:55  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
          14:10:55  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
          14:10:55  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          14:10:55  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          14:10:55  	at java.base/java.lang.Thread.run(Thread.java:829)
          14:10:55  	Suppressed: org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: 2f0912fb-c33e-43ea-826c-41331758b7ec
          14:10:55  Caused by: java.nio.channels.ClosedChannelException
          14:10:55  	at jenkins.agents.WebSocketAgents$Session.closed(WebSocketAgents.java:157)
          14:10:55  	at jenkins.websocket.WebSockets$1.onWebSocketClose(WebSockets.java:88)
          14:10:55  	at jenkins.websocket.Jetty10Provider$2.onWebSocketClose(Jetty10Provider.java:164)
          14:10:55  	at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.notifyOnClose(JettyWebSocketFrameHandler.java:308)
          14:10:55  	at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onClosed(JettyWebSocketFrameHandler.java:292)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.lambda$closeConnection$0(WebSocketCoreSession.java:272)
          14:10:55  	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1451)
          14:10:55  	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1488)
          14:10:55  	at org.eclipse.jetty.websocket.core.server.internal.AbstractHandshaker$1.handle(AbstractHandshaker.java:212)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.lambda$closeConnection$1(WebSocketCoreSession.java:272)
          14:10:55  	at org.eclipse.jetty.util.Callback$4.completed(Callback.java:184)
          14:10:55  	at org.eclipse.jetty.util.Callback$Completing.succeeded(Callback.java:344)
          14:10:55  	at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onError(JettyWebSocketFrameHandler.java:268)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.lambda$closeConnection$2(WebSocketCoreSession.java:284)
          14:10:55  	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1469)
          14:10:55  	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1488)
          14:10:55  	at org.eclipse.jetty.websocket.core.server.internal.AbstractHandshaker$1.handle(AbstractHandshaker.java:212)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.closeConnection(WebSocketCoreSession.java:284)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.processConnectionError(WebSocketCoreSession.java:339)
          14:10:55  	at org.eclipse.jetty.websocket.core.internal.WebSocketConnection$Flusher.onCompleteFailure(WebSocketConnection.java:654)
          14:10:55  	at org.eclipse.jetty.util.IteratingCallback.failed(IteratingCallback.java:417)
          14:10:55  	at org.eclipse.jetty.util.Callback$Nested.failed(Callback.java:405)
          14:10:55  	at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:402)
          14:10:55  	at org.eclipse.jetty.io.SelectableChannelEndPoint$3.run(SelectableChannelEndPoint.java:87)
          14:10:55  	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
          14:10:55  	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
          14:10:55  	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
          14:10:55  	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
          14:10:55  	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
          14:10:55  	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
          14:10:55  	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
          14:10:55  	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
          14:10:55  	... 1 more

          This Windows slave is a standard node that I revert after each test stage (post > always). It seems that this revert may have an impact on test stability: on a slave that I do not revert (build / installer), I do not get that kind of disconnection.

          Most of the time, the disconnection occurs when cleanWs() starts, i.e. at the very beginning of the stage.
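
          For reference, a minimal sketch of the stage layout described above, assuming a Declarative Pipeline with the Workspace Cleanup plugin installed. The agent label and revertFunction come from the comment itself (revertFunction being the commenter's shared-library step that reverts the VM); the retry() around cleanWs() is only a possible workaround for the transient channel drop, not a confirmed fix.

          // Minimal sketch (assumptions: Declarative Pipeline, ws-cleanup plugin installed,
          // revertFunction is the commenter's shared-library step that reverts the VM).
          pipeline {
              agent none
              stages {
                  stage('test_disconnexion') {
                      agent { node { label 'srdspicpops01||srdspicpops03' } }
                      steps {
                          script {
                              echo "Running on node: ${env.NODE_NAME} with workspace: ${env.WORKSPACE}"
                              // cleanWs() is the first remoting call of the stage and is where the
                              // ClosedChannelException is usually observed; retry() only works around
                              // a transient channel drop, it does not address the root cause.
                              retry(3) {
                                  cleanWs()
                              }
                          }
                      }
                      post {
                          always {
                              // Revert the VM after the stage, as described above.
                              revertFunction(node: "${env.NODE_NAME}")
                          }
                      }
                  }
              }
          }

          Whether the retry helps depends on timing: if the channel is still marked as closing when the stage is scheduled, the whole stage fails with the same traces shown above.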


            Assignee: Unassigned
            Reporter: benh57 (Ben Hines)
            Votes: 35
            Watchers: 47
