Type: Bug
Resolution: Fixed
Priority: Major
Component/s: kubernetes-plugin
We recently upgraded our Jenkins instances from kubernetes-plugin 1.27.8 to 1.30.1. We have a concurrency limit of 2 set on our kubernetes cloud. As expected, two agents can spawn concurrently after Jenkins starts up, and this correct behavior continues for some time. But after a random number of days, only a single agent is spawned at a time. Concurrent jobs wait in the queue until that single agent is torn down. It is as if the concurrency limit were set to 1. Increasing the limit to 3 lets the plugin spawn 2 agents concurrently.
I've created a FINEST logger on org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits and noticed that when the issue happens, the kubernetes global limit never goes back to 0/2. It stays at 1/2 even when no job is running (or at 1/3 if I increase the concurrency limit to 3).
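For reference, on a Jenkins controller this logger is normally added through Manage Jenkins > System Log. The plain java.util.logging equivalent looks roughly like the sketch below. The class name LimitsLogging is hypothetical and only there to make the snippet self-contained. Note that raising the logger level alone is not enough if the attached handlers filter out FINEST records.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class, not part of Jenkins or the kubernetes-plugin.
public class LimitsLogging {
    // Logger name taken from the report above.
    public static final String LOGGER_NAME =
            "org.csanchez.jenkins.plugins.kubernetes.KubernetesProvisioningLimits";

    // Keep a strong reference so the configured logger is not garbage collected.
    private static final Logger LIMITS_LOGGER = Logger.getLogger(LOGGER_NAME);

    // Raise the logger to FINEST so the limit bookkeeping messages are emitted.
    public static Logger enableFinest() {
        LIMITS_LOGGER.setLevel(Level.FINEST);
        return LIMITS_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enableFinest();
        System.out.println(logger.getName() + " level: " + logger.getLevel());
    }
}
```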
My guess is that https://github.com/jenkinsci/kubernetes-plugin/pull/939 has a race condition and that, with specific timing, the global count https://github.com/jenkinsci/kubernetes-plugin/pull/939/files#diff-4877a6b83daf403574dc28dca505926a6b3ad326b84f891f278a4424a68f4b84R103 is not properly decremented.
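To illustrate the kind of leak I have in mind, here is a minimal sketch. It is not the plugin's code: the class LeakyLimitSketch and its methods are hypothetical, and the unlucky ordering is replayed deterministically instead of relying on real thread timing. It only shows how a counter guarded by a limit can get stuck above zero when an unregister is processed before the matching register.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical limiter, written only to illustrate the suspected symptom.
public class LeakyLimitSketch {
    private final int limit;
    private int count;
    private final Set<String> registered = new HashSet<>();

    public LeakyLimitSketch(int limit) {
        this.limit = limit;
    }

    // Take one slot if the limit allows it.
    public synchronized boolean register(String pod) {
        if (count >= limit) {
            return false;
        }
        count++;
        registered.add(pod);
        return true;
    }

    // Release the slot, but only if this pod is known to be registered.
    public synchronized void unregister(String pod) {
        if (registered.remove(pod)) {
            count--;
        }
    }

    public synchronized int count() {
        return count;
    }

    public static void main(String[] args) {
        LeakyLimitSketch limits = new LeakyLimitSketch(2);

        // Unlucky ordering: the termination of agent-a is handled before
        // its registration has been recorded, so the decrement is skipped.
        limits.unregister("agent-a");
        limits.register("agent-a");

        // No agent is running any more, yet one slot is still held.
        System.out.println(limits.count() + "/2"); // prints "1/2"

        // From now on only one more agent fits under the limit.
        System.out.println(limits.register("agent-b")); // prints "true"
        System.out.println(limits.register("agent-c")); // prints "false"
    }
}
```

With a limit of 2 the sketch ends up behaving like a limit of 1, and raising the limit to 3 would leave room for two agents, which matches what we observe.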
I'll try to write a test case that reproduces the issue.