Bug
Resolution: Unresolved
Critical
Jenkins 2.303.3 running on GKE 1.20
Kubernetes plugin >= 1.30.5
Issue started right after upgrading from 1.30.4 to any later version (1.30.5 to latest 1.30.10).
Log shows:
2021-11-20 13:42:17.978+0000 [id=108] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes jenkins/meta-mbdn9-9j8d3
2021-11-20 13:44:08.228+0000 [id=108] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: meta-mbdn9-9j8d3, template=PodTemplate{id='a54377cc-f22f-4767-9571-f9abf713d15f', name='meta-mbdn9', namespace='jenkins', idleMinutes=5, label='meta', serviceAccount='jenkins', nodeSelector='node_pool=build-pool', containers=[ContainerTemplate{name='gcloud', image='gcr.io/google.com/cloudsdktool/cloud-sdk:debian_component_based', command='cat', ttyEnabled=true}], annotations=[PodAnnotation{key='buildUrl', value='http://jenkins-ui:8080/job/k8s/job/reclaim-volumes/2675/'}, PodAnnotation{key='runUrl', value='job/k8s/job/reclaim-volumes/2675/'}]}
io.fabric8.kubernetes.client.KubernetesClientTimeoutException: Timed out waiting for [1000000] milliseconds for [Pod] with name:[meta-mbdn9-9j8d3] in namespace [jenkins].
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:96)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:169)
    at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:293)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2021-11-20 13:44:08.230+0000 [id=108] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent meta-mbdn9-9j8d3
Terminated Kubernetes instance for agent jenkins/meta-mbdn9-9j8d3
This happens for every pod whose containers do not start within about 110 seconds: the log shows the plugin terminating the pod roughly 110 seconds after creating it (13:42:17 to 13:44:08). The error message is also wrong. It says "Timed out waiting for [1000000] milliseconds", but the plugin did not wait the 1000 seconds it should have.
I believe the issue comes from this commit: https://github.com/jenkinsci/kubernetes-plugin/commit/f95a604462fd7723ba8246b748c83dc90d65a9e3
I think these changed lines don't do the right thing:
- return periodicAwait(10, System.currentTimeMillis(), Math.max(remaining / 10, 1000L), remaining);
+ // Retry with 10% of the remaining time, with a min of 1s and a max of 10s
+ return periodicAwait(10, System.currentTimeMillis(), Math.min(10000L, Math.max(remaining / 10, 1000L)), remaining);
I've reverted that line, and now my pods behave as they did in 1.30.4: agents are provisioned correctly.
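To illustrate why the 10s cap could explain the ~110 second cutoff, here is a minimal sketch of the arithmetic. It is not the plugin's code. It assumes that the first argument to periodicAwait (10) is a retry count, giving 11 attempts in total, and that each attempt blocks for at most one interval; I have not verified either assumption against the plugin source. The class and method names (RetryBudget, intervalBefore, intervalAfter, maxTotalWait) are made up for this example.

```java
// Hypothetical sketch of the retry arithmetic; not taken from the plugin.
final class RetryBudget {
    // Assumption: 10 retries plus the initial attempt.
    static final int ATTEMPTS = 11;

    // Interval expression before the commit: 10% of the remaining time, min 1s.
    static long intervalBefore(long remainingMs) {
        return Math.max(remainingMs / 10, 1000L);
    }

    // Interval expression after the commit: same, but capped at 10s.
    static long intervalAfter(long remainingMs) {
        return Math.min(10000L, Math.max(remainingMs / 10, 1000L));
    }

    // Upper bound on the total wait if every attempt blocks for one full
    // interval, never exceeding the configured timeout.
    static long maxTotalWait(long intervalMs, long timeoutMs) {
        return Math.min(timeoutMs, ATTEMPTS * intervalMs);
    }

    public static void main(String[] args) {
        long timeoutMs = 1_000_000L; // the value shown in the error message
        System.out.println("before: " + maxTotalWait(intervalBefore(timeoutMs), timeoutMs) + " ms");
        System.out.println("after:  " + maxTotalWait(intervalAfter(timeoutMs), timeoutMs) + " ms");
    }
}
```

Under these assumptions the old expression allows the full 1000000 ms, while the capped one allows at most 11 x 10000 ms = 110000 ms, which is close to the 110 seconds seen in the log.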
Having same issue.