- Bug
- Resolution: Unresolved
- Minor
- None
Jenkins Version: 2.444
Kubernetes Plugin Version: 4306.vc91e951ea_eb_d
We encountered an issue where the Kubernetes plugin believed seven agents were running when none actually were:
println("global count:" + org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().getGlobalCount("kubernetes"))
global count:7
Checking directly via kubectl I can see that no agents are running:
kubectl get pods -n jenkins-build-pod
NAME                                   READY   STATUS    RESTARTS        AGE
container-builder-79c7b74d5-7sngb      1/1     Running   1 (6h14m ago)   24h
jenkins-build-pod-app-0                2/2     Running   0               12d
secret-server-proxy-85f69bb9d9-vzqjd   1/1     Running   1 (11d ago)     11d
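For comparison with the kubectl output, the agents that Jenkins itself still has registered can be listed from the script console. This is a minimal sketch, not a confirmed diagnostic: it assumes the plugin's `KubernetesSlave` class exposes `getCloudName()` and `getTemplateId()` in this plugin version, which should be checked against the installed release.

```groovy
import jenkins.model.Jenkins
import org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave

// Collect every node that Jenkins has registered as a Kubernetes agent.
def k8sAgents = Jenkins.get().nodes.findAll { it instanceof KubernetesSlave }

println("Kubernetes agents known to Jenkins: " + k8sAgents.size())
k8sAgents.each { agent ->
    // cloudName and templateId are assumed accessors on KubernetesSlave.
    println("${agent.nodeName} cloud=${agent.cloudName} template=${agent.templateId}")
}
```

If this list is empty while the global count is non-zero, the counter and the registered nodes disagree, which narrows the problem to the provisioning limits bookkeeping rather than to leftover nodes.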
Next, we can query Jenkins for each of our pod templates to see how many pods it thinks are running per template:
println("global count:" + org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().getGlobalCount("kubernetes"))
println("template 1 count: "+ org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().getPodTemplateCount("13a60cc3d5d58f05368f87e250b6fc6514cc13ec064877202c791f08bd50e9f5"))
println("template 2 count: "+ org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().getPodTemplateCount("d38f0c96-c2e7-4514-885a-13dfedec826e"))
println("template 3 count: "+ org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().getPodTemplateCount("5c7a3b9b-6199-41cf-bc40-19ad94768b14"))
global count:7
template 1 count: 0
template 2 count: 0
template 3 count: 0
As you can see, Jenkins thinks that no pods are running for any of these templates, even though the global count is still 7.
Finally, we can print the per-template counts to see which templates are counted as having running pods:
print(org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits.get().podTemplateCounts)
[d38f0c96-c2e7-4514-885a-13dfedec826e:0, 13a60cc3d5d58f05368f87e250b6fc6514cc13ec064877202c791f08bd50e9f5:0, fbf5b0f4-2727-4347-9df5-b171bfccca37:7, 5c7a3b9b-6199-41cf-bc40-19ad94768b14:0]
Here we can see an ID, holding all seven counts, that does not match any of our existing templates:
fbf5b0f4-2727-4347-9df5-b171bfccca37
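The manual comparison above can be automated from the script console. The sketch below assumes `KubernetesCloud.getTemplates()` returns only the templates configured on the cloud, and it reads the internal `podTemplateCounts` map the same way as the snippet above. An ID missing from the configured list is not necessarily invalid: it may belong to a template defined dynamically in a pipeline, so treat the output as a hint rather than proof.

```groovy
import jenkins.model.Jenkins
import org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
import org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits

// IDs of every pod template currently configured on any Kubernetes cloud.
def configuredIds = Jenkins.get().clouds
    .findAll { it instanceof KubernetesCloud }
    .collectMany { it.templates*.id } as Set

// Internal per-template counter map, accessed as in the report above.
def counts = KubernetesProvisioningLimits.get().podTemplateCounts

// Report every counted template ID that no configured template owns.
counts.each { id, count ->
    if (!(id in configuredIds)) {
        println("unconfigured template ${id} is still counted: ${count}")
    }
}
```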
When we try to navigate to this template in the UI, Jenkins reports that it does not exist:
https://jenkins.${domain}/manage/cloud/kubernetes/template/fbf5b0f4-2727-4347-9df5-b171bfccca37/
Restarting the controller resolved the issue. Is there a way to find out why or how these seven agents were orphaned and not cleaned up? Is there also a way to clear the orphaned agents without restarting the controller?