-
Bug
-
Resolution: Not A Defect
-
Blocker
-
None
-
Jenkins master version: 2.190.1
Kubernetes Plugin: 1.19.3
It also happened before the upgrade, with:
Jenkins: 2.176.3
K8S plugin: 1.19.0
It happens frequently but not consistently, which makes it very hard to debug.
This is my podTemplate:
podTemplate(
    containers: [
        containerTemplate(
            name: 'build',
            image: 'my_builder:latest',
            command: 'cat',
            ttyEnabled: true,
            workingDir: '/mnt/jenkins'
        )
    ],
    volumes: [
        hostPathVolume(mountPath: '/var/run/docker.sock', hostPath: '/var/run/docker.sock'),
        hostPathVolume(mountPath: '/mnt/jenkins', hostPath: '/mnt/jenkins')
    ],
    yaml: """
spec:
  containers:
  - name: build
    resources:
      requests:
        cpu: "10"
        memory: "10Gi"
  securityContext:
    fsGroup: 995
"""
) {
    node(POD_LABEL) {
        stage("Checkout") {
        }
        // more stages
    }
}
This is the log from the pod:
Inbound agent connected from IP/IP
Waiting for agent to connect (0/100): my_branch
Remoting version: 3.35
This is a Unix agent
Waiting for agent to connect (1/100): my_branch
Agent successfully connected and online
ERROR: Connection terminated
java.nio.channels.ClosedChannelException
    at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
    at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:142)
    at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Logs from Jenkins "cat /var/log/jenkins/jenkins.log":
2019-10-08 14:40:48.171+0000 [id=287] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: branch_name, template=PodTemplate{, name='pod_name', namespace='default', label='label_name', nodeUsageMode=EXCLUSIVE, volumes=[HostPathVolume [mountPath=/var/run/docker.sock, hostPath=/var/run/docker.sock], HostPathVolume [mountPath=/mnt/jenkins, hostPath=/mnt/jenkins]], containers=[ContainerTemplate{name='build', image='my_builder', workingDir='/mnt/jenkins', command='cat', ttyEnabled=true, envVars=[KeyValueEnvVar [getValue()=deploy/.dazelrc, getKey()=RC_FILE]]}], annotations=[org.csanchez.jenkins.plugins.kubernetes.PodAnnotation@aab9c821]}
io.fabric8.kubernetes.client.KubernetesClientTimeoutException: Timed out waiting for [100000] milliseconds for [Pod] with name:[branch_name] in namespace [default].
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:130)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:134)
    at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:297)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[JENKINS-59705] hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from IP/IP:58344 failed. The channel is closing down or has closed dow
Resolution | New: Fixed [ 1 ] | |
Status | Original: Open [ 1 ] | New: Fixed but Unreleased [ 10203 ] |
Resolution | Original: Fixed [ 1 ] | |
Status | Original: Fixed but Unreleased [ 10203 ] | New: Reopened [ 4 ] |
I think I have found the issue. I'm using EKS with SPOT instances to run my CI. With spot instances this issue happens frequently; with on-demand instances the build passes every time.
The reason is that Jenkins is getting the wrong instance IP for the connection to the Jenkins master.
Example:
kubectl get pods -o wide --all-namespaces
NAMESPACE   NAME                        READY   STATUS              RESTARTS   AGE   IP       NODE                           NOMINATED NODE
default     some-job-5-c328g-kd-k2pfz   0/2     ContainerCreating   0          2s    <none>   ip-172-26-18-44.ec2.internal   <none>
As you can see, the job runs on instance "ip-172-26-18-44.ec2.internal".
Instance is in ready state in K8S:
This is the log from Jenkins console:
It tries to connect to the Jenkins master with "ip-172-26-30-207.ec2.internal", and this instance doesn't exist.
This seems like a bug in the K8S plugin, in how it communicates to get the correct IP of the SPOT instance.
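For anyone trying to reproduce the check above, here is a minimal sketch of how one might confirm whether a host name seen in the Jenkins console is a node the cluster currently knows about. The helper name `node_exists` is hypothetical and not part of Jenkins or the Kubernetes plugin; it only assumes the `node/<name>` line format printed by `kubectl get nodes -o name`.

```shell
#!/usr/bin/env bash
# node_exists NAME
# Reads lines in the format printed by `kubectl get nodes -o name`
# (for example "node/ip-172-26-18-44.ec2.internal") from stdin and
# succeeds only if NAME is one of the listed nodes.
# Hypothetical helper for illustration; not part of any plugin.
node_exists() {
  local wanted="$1"
  # -F: fixed string (dots are literal), -x: whole line, -q: quiet
  grep -Fxq "node/${wanted}"
}

# Usage against a live cluster (assumes kubectl is configured):
#   kubectl get nodes -o name | node_exists ip-172-26-30-207.ec2.internal \
#     || echo "host is not a node in this cluster"
#
# The node a pod was actually scheduled on can be read with:
#   kubectl get pod some-job-5-c328g-kd-k2pfz -n default -o jsonpath='{.spec.nodeName}'
```

If the host name from the console log is missing from the node list while the pod's `.spec.nodeName` points to a different, existing node, that would match the mismatch described in this comment. On spot capacity a node may also simply have been reclaimed in the meantime, so the check is only a hint, not proof of a plugin bug.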