  Jenkins / JENKINS-59705

hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from IP/IP:58344 failed. The channel is closing down or has closed down

    • Type: Bug
    • Resolution: Not A Defect
    • Priority: Blocker
    • Component: kubernetes-plugin
    • Labels: None
    • Jenkins master version: 2.190.1
      Kubernetes Plugin: 1.19.3

      It also happened before the upgrade in
      Jenkins: 2.176.3
      K8S plugin: 1.19.0

      It happens frequently but not consistently, which makes it very hard to debug.

      This is my podTemplate:

      podTemplate(containers: [
          containerTemplate(
              name: 'build',
              image: 'my_builder:latest',
              command: 'cat',
              ttyEnabled: true,
              workingDir: '/mnt/jenkins'
          )
      ],
      volumes: [
          hostPathVolume(mountPath: '/var/run/docker.sock', hostPath: '/var/run/docker.sock'),
          hostPathVolume(mountPath: '/mnt/jenkins', hostPath: '/mnt/jenkins')
      ],
      yaml: """
      spec:
       containers:
         - name: build
           resources:
             requests:
               cpu: "10"
               memory: "10Gi" 
       securityContext:
         fsGroup: 995
      """
      )
      {
          node(POD_LABEL) {
              stage("Checkout") {
              }       
              // more stages
          }
      }
      

      This is the log from the pod:

      Inbound agent connected from IP/IP
      Waiting for agent to connect (0/100): my_branch
      Remoting version: 3.35
      This is a Unix agent
      Waiting for agent to connect (1/100): my_branch
      Agent successfully connected and online
      ERROR: Connection terminated
      java.nio.channels.ClosedChannelException
          at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
          at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:142)
          at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      

      Logs from Jenkins "cat /var/log/jenkins/jenkins.log":

      2019-10-08 14:40:48.171+0000 [id=287] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: branch_name, template=PodTemplate{, name='pod_name', namespace='default', label='label_name', nodeUsageMode=EXCLUSIVE, volumes=[HostPathVolume [mountPath=/var/run/docker.sock, hostPath=/var/run/docker.sock], HostPathVolume [mountPath=/mnt/jenkins, hostPath=/mnt/jenkins]], containers=[ContainerTemplate{name='build', image='my_builder', workingDir='/mnt/jenkins', command='cat', ttyEnabled=true, envVars=[KeyValueEnvVar [getValue()=deploy/.dazelrc, getKey()=RC_FILE]]}], annotations=[org.csanchez.jenkins.plugins.kubernetes.PodAnnotation@aab9c821]}
      io.fabric8.kubernetes.client.KubernetesClientTimeoutException: Timed out waiting for [100000] milliseconds for [Pod] with name:[branch_name] in namespace [default].
          at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:130)
          at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:134)
          at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:297)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      


          Eddie Mashayev added a comment -

          I think I have found the issue. I'm using EKS with spot instances to run my CI. When using spot instances this issue happens frequently; when using on-demand instances it passes every time.

          The reason seems to be that Jenkins is getting the wrong instance IP for the connection back to the Jenkins master.

          Example:

          kubectl get pods -o wide --all-namespaces
          NAMESPACE       NAME                                                              READY   STATUS              RESTARTS   AGE     IP              NODE                            NOMINATED NODE
          default         some-job-5-c328g-kd-k2pfz   0/2     ContainerCreating   0          2s      <none>          ip-172-26-18-44.ec2.internal    <none>
          

          As you can see, the job runs on instance "ip-172-26-18-44.ec2.internal".

          The instance is in Ready state in Kubernetes:

           

          kubectl get nodes
          NAME                            STATUS                     ROLES    AGE     VERSION
          ip-172-26-18-44.ec2.internal    Ready                      <none>   11d     v1.12.10-eks-1246e3
          

           

          This is the log from Jenkins console:

          [Pipeline] End of Pipeline
          Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from ip-172-26-30-207.ec2.internal/jenkins_master_IP:37312
                  at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
                  at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
                  at hudson.remoting.Channel.call(Channel.java:957)
                  at hudson.FilePath.act(FilePath.java:1072)
                  at hudson.FilePath.act(FilePath.java:1061)
                  at hudson.FilePath.mkdirs(FilePath.java:1246)
                  at hudson.plugins.git.GitSCM.createClient(GitSCM.java:811)
                  at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
                  at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:124)
                  at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
                  at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
                  at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          java.nio.file.AccessDeniedException: /mnt/jenkins/workspace
              at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
              at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
              at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
              at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
              at java.nio.file.Files.createDirectory(Files.java:674)
              at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
              at java.nio.file.Files.createDirectories(Files.java:767)
              at hudson.FilePath.mkdirs(FilePath.java:3239)
              at hudson.FilePath.access$1300(FilePath.java:212)
              at hudson.FilePath$Mkdirs.invoke(FilePath.java:1254)
              at hudson.FilePath$Mkdirs.invoke(FilePath.java:1250)
              at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3052)
              at hudson.remoting.UserRequest.perform(UserRequest.java:211)
              at hudson.remoting.UserRequest.perform(UserRequest.java:54)
              at hudson.remoting.Request$2.run(Request.java:369)
              at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:97)
              at java.lang.Thread.run(Thread.java:748)
          

          It tries to connect to the Jenkins master as "ip-172-26-30-207.ec2.internal", and this instance doesn't exist.

           

          This looks like a bug in the K8S plugin in how it resolves the correct IP for the spot instance.

           


          Eddie Mashayev added a comment -

          It seems more resources need to be added to the JNLP container, by editing the yaml resources (I added 4 CPUs and 4 GiB of RAM):

          yaml: """
          spec:
           containers:
             - name: "jnlp"
               resources:
                 requests:
                   cpu: "4"
                   memory: "4Gi"
          """
          

          I was getting a Pod evicted notification every time this error appeared in the Jenkins console.
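
          For context, here is a minimal sketch of such an override as a complete podTemplate block. It relies on the plugin merging a yaml-defined container named "jnlp" into its default jnlp container (assumed from the plugin's documented yaml handling); the stage name and the request values are just examples.

          podTemplate(yaml: """
          spec:
            containers:
              - name: jnlp
                resources:
                  requests:
                    cpu: "4"
                    memory: "4Gi"
          """)
          {
              node(POD_LABEL) {
                  stage("Build") {
                      // steps run in the pod; the jnlp container now gets the larger requests
                  }
              }
          }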


          Eddie Mashayev added a comment -

          More resources need to be added to the JNLP container; the default resources are sometimes not enough.


          Karol Gil added a comment -

          eddiem21 did this stop for you after you increased resources? We're observing these failures on a daily basis and in all cases it's trying to connect to non-existent node hostnames. We're running on-demand worker nodes on EKS (no spots used).

          According to our monitoring, the JNLP container never uses more than 1.2 GB RAM and ~0.8 CPU, so I doubt it's because of resources.


          Eddie Mashayev added a comment -

          karolgil Hey, we still face this issue once in a while. I worked on it a lot and described all my actions in this ticket.

          These things are NOT related to the issue:

          1. Increasing JNLP resources.
          2. Using spot vs. on-demand instances.

           

          There is one thing that, once fixed, reduced this issue to "once in a while":

          1. Increasing the root volume size for each EKS node. We are building many Docker images and the root volume gets full very quickly; increasing it to 250G (the default is 20G) and cleaning up the images frequently fixed the majority of the failures (see the cleanup sketch at the end of this comment).

           

          BUT we are still facing this issue. I suspect it's related to Jenkins scheduling a job on an EKS node that is going down as part of the autoscaler policy: the job is triggered and at the same time the autoscaler marks that node to be cordoned. I don't have proof of this yet, and it's being investigated.
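
          For illustration, a minimal sketch of the frequent image cleanup mentioned above, done from the pipeline itself. It assumes the Docker socket is mounted into the build container via hostPathVolume as in the pod template from the description; the stage name and the 24h filter are example values, not recommendations.

          // Hypothetical cleanup stage, run inside the node(POD_LABEL) block at the end of the build
          stage('Cleanup Docker images') {
              container('build') {
                  // Remove unused images older than 24h so the node's root volume does not fill up
                  sh 'docker image prune -af --filter "until=24h"'
              }
          }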


          Karol Gil added a comment -

          Hey eddiem21, thanks for the response. I've been fighting this one for a while now as well and can confirm that your "not related" section is correct - we did both changes and issues are still being observed once in a while.

          Our monitoring shows that root volumes are far from full in any of the nodes being used for running our jobs so I doubt it's related - maybe the symptom is similar?

          I think it may be related to autoscaling as you said - we're observing this mostly in jobs that use a specific autoscaling group that has its default capacity set to 0 and in peaks scales up to 80 nodes - this is when the issue is most common. What bugs me is the fact that I can't track the hostnames listed in the build log - these machines are not defined in AWS, nor can I see them in the autoscaler logs.

          By any chance, did you manage to reproduce this reliably? Or does it appear to be "random"?


          Eddie Mashayev added a comment -

          karolgil It appears to be random.


          Joe Roberts added a comment -

          I'm having the same random issue:

          Jenkins master version: 2.204.2
          Kubernetes Plugin: 1.23.2

          Pod Spec

          "name": "jnlp",
          "resources": {
            "limits": {
              "cpu": "800m",
              "memory": "1Gi"
            },
            "requests": {
              "cpu": "800m",
              "memory": "1Gi"
            }
          },
          

          Master

          2020-02-03 16:41:28.091+0000 [id=8198] INFO j.s.DefaultJnlpSlaveReceiver#channelClosed: IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/10.15.199.33:50000 remote=10.15.198.22/10.15.198.22:39686]] / Computer.threadPoolForRemoting [#2516] for default-jsnmd terminated: java.nio.channels.ClosedChannelException
          

          Slave

          INFO: Connected
          INFO: Connected
          Feb 03, 2020 1:49:45 PM org.eclipse.jgit.util.FS$FileStoreAttributes saveToConfig
          WARNING: locking FileBasedConfig[/home/jenkins/.config/jgit/config] failed after 5 retries
          Feb 03, 2020 4:41:28 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Terminated
          Feb 03, 2020 4:41:28 PM hudson.Launcher$RemoteLaunchCallable$1 join
          INFO: Failed to synchronize IO streams on the channel hudson.remoting.Channel@5e672cca:JNLP4-connect connection to jenkins-agent/10.15.54.222:50000
          hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@5e672cca:JNLP4-connect connection to jenkins-agent/10.15.54.222:50000": Remote call on JNLP4-connect connection to jenkins-agent/10.15.54.222:50000 failed. The channel is closing down or has closed down
              at hudson.remoting.Channel.call(Channel.java:948)
              at hudson.remoting.Channel.syncIO(Channel.java:1683)
              at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:1331)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:929)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:903)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:855)
              at hudson.remoting.UserRequest.perform(UserRequest.java:211)
              at hudson.remoting.UserRequest.perform(UserRequest.java:54)
              at hudson.remoting.Request$2.run(Request.java:369)
              at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:97)
              at java.lang.Thread.run(Thread.java:748)
          Caused by: java.nio.channels.ClosedChannelException
              at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
              at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
              at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
              ... 4 more
          Feb 03, 2020 4:41:28 PM hudson.remoting.UserRequest perform
          WARNING: LinkageError while performing UserRequest:UserRPCRequest:hudson.Launcher$RemoteProcess.join[](39)
          java.lang.NoClassDefFoundError: hudson/util/ProcessTree
              at hudson.Proc$LocalProc.destroy(Proc.java:385)
              at hudson.Proc$LocalProc.join(Proc.java:358)
              at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:1324)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:929)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:903)
              at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:855)
              at hudson.remoting.UserRequest.perform(UserRequest.java:211)
              at hudson.remoting.UserRequest.perform(UserRequest.java:54)
              at hudson.remoting.Request$2.run(Request.java:369)
              at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:97)
              at java.lang.Thread.run(Thread.java:748)
          Caused by: java.lang.ClassNotFoundException: hudson.util.ProcessTree
              at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
              at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:171)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
              ... 19 more
          Feb 03, 2020 4:41:28 PM hudson.remoting.Request$2 run
          INFO: Failed to send back a reply to the request hudson.remoting.Request$2@731c1874: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@5e672cca:JNLP4-connect connection to jenkins-agent/10.15.54.222:50000": channel is already closed
          Feb 03, 2020 4:41:38 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Performing onReconnect operation.
          Feb 03, 2020 4:41:38 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
          INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@460ff0bc
          Feb 03, 2020 4:41:39 PM hudson.remoting.jnlp.Main createEngine
          INFO: Setting up agent: default-jsnmd
          Feb 03, 2020 4:41:39 PM hudson.remoting.jnlp.Main$CuiListener <init>
          INFO: Jenkins agent is running in headless mode.
          Feb 03, 2020 4:41:40 PM hudson.remoting.Engine startEngine
          INFO: Using Remoting version: 3.35
          Feb 03, 2020 4:41:40 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
          INFO: Using /home/jenkins/agent/remoting as a remoting work directory
          Feb 03, 2020 4:41:40 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
          INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Locating server among [http://jenkins:8080]
          Feb 03, 2020 4:41:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
          INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
          Feb 03, 2020 4:41:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
          INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Agent discovery successful
            Agent address: jenkins-agent
            Agent port:    50000
            Identity:      88:82:81:a2:3b:c3:24:07:e6:1f:6a:e3:5b:27:e1:26
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Handshaking
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Connecting to jenkins-agent:50000
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Trying protocol: JNLP4-connect
          Feb 03, 2020 4:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Remote identity confirmed: 88:82:81:a2:3b:c3:24:07:e6:1f:6a:e3:5b:27:e1:26
          Feb 03, 2020 4:41:41 PM hudson.remoting.jnlp.Main$CuiListener status
          INFO: Connected


          cliff wildman added a comment -

          onlymoreso 

          Hit this today and my issue was that I did not have my kubernetes environment for kops explicitly set. Once I did that, this stopped.

           


          Philip Gollucci added a comment -

          ....16 gems installed

          Cannot contact XXXX: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@5cae5b6d:JNLP4-connect connection from XXXXXX": Remote call on JNLP4-connect connection from XXXXXX failed. The channel is closing down or has closed down


          Viktor Mayer added a comment -

          The issue still persists and increasing resources did not help; struggling on Jenkins 2.249.1, Kubernetes plugin v1.27.1.1.

          Does downgrading help?

          Cannot contact mypod-75f00db0-cb38-4210-96f2-a07a8771d126-jxq21-tbv0s: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@73c10744:JNLP4-connect connection from 10.1.44.228/10.1.44.228:46992": Remote call on JNLP4-connect connection from 10.1.44.228/10.1.44.228:46992 failed. The channel is closing down or has closed down
          


          Vishal kumar added a comment -

          Noticing the same exception for several of my jobs:

          Exception: hudson.remoting.RequestAbortedException: java.nio.channels.ClosedChannelException

          Kubernetes plugin: v1.27.1

          Jenkins: 2.249.2

           


          Kai Kretschmann added a comment -

          For me this happens independently of the K8s plugin, so it seems to be a problem inside core Jenkins.


          Vincent Latombe added a comment -

          Answering the original description:

          You're hitting a provisioning timeout. It is set to 100 seconds by default, but if you're running on a new node this may not be enough time to pull the Docker image and launch the container. Increase the value in the pod template configuration.
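
          For illustration, a minimal sketch of raising that timeout in a scripted pipeline. The slaveConnectTimeout parameter name and its unit of seconds are assumptions based on the plugin's pod template options, and 300 is just an example value; the container template mirrors the one in the description.

          podTemplate(
              // assumed parameter: how long to wait for the pod/agent to come up (default 100 seconds)
              slaveConnectTimeout: 300,
              containers: [
                  containerTemplate(name: 'build', image: 'my_builder:latest', command: 'cat', ttyEnabled: true)
              ]
          )
          {
              node(POD_LABEL) {
                  // stages as before
              }
          }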

           


            Assignee: Unassigned
            Reporter: Eddie Mashayev (eddiem21)
            Votes: 5
            Watchers: 16