Bug
Resolution: Unresolved
Major
None
Jenkins 2.390 and (currently latest) K8s plugin 3896.v19b_160fd9589
We intend to use one-shot agents (K8s pods) but are observing overprovisioning of nearly double the required agents.
This minimal pipeline example has a parallel step with a fanout of 100:

    def branches = [:]
    for (int i = 0; i < 100; i++) {
        branches["branch${i}"] = {
            node('base-tools') {
                sh "echo Hello from a parallel branch"
            }
        }
    }
    parallel branches
When this pipeline runs, we see 199 agents (pods) created.
These are the logs from the NoDelayProvisionerStrategy:

    Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Available capacity=0, currentDemand=1
    Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Planned 1 new nodes
    Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply After provisioning, available capacity=1, currentDemand=1
    Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Suggesting NodeProvisioner review
    Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Provisioning completed
    Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Available capacity=1, currentDemand=100
    Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Planned 98 new nodes
    Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply After provisioning, available capacity=99, currentDemand=100
    Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Suggesting NodeProvisioner review
    Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Provisioning not complete, consulting remaining strategies
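To make the arithmetic in these logs explicit, here is a small Groovy sketch that tallies the "Planned N new nodes" entries. It only restates numbers already quoted in this report (the two "Planned" log lines, the 100 branches, and the 199 observed pods); the variable names are illustrative. The tally suggests that NoDelayProvisionerStrategy itself planned 99 nodes, so roughly 100 of the 199 pods are not explained by these log lines. Where those come from is not shown in the logs; the final "consulting remaining strategies" entry is one possible lead, but that is an assumption.

```groovy
// The two "Planned" entries from the FINE log quoted above.
def log = '''
Mar 21, 2023 3:00:28 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Planned 1 new nodes
Mar 21, 2023 3:00:29 AM FINE io.jenkins.plugins.kubernetes.NoDelayProvisionerStrategy apply Planned 98 new nodes
'''

// Extract every "Planned N new nodes" count and add them up.
def planned = (log =~ /Planned (\d+) new nodes/).collect { it[1].toInteger() }
def plannedByNoDelay = planned.sum()

def branchCount = 100    // fanout of the parallel step
def observedPods = 199   // pods reported for this pipeline run

println "Planned by NoDelayProvisionerStrategy: ${plannedByNoDelay}"
println "Pods not accounted for by these log lines: ${observedPods - plannedByNoDelay}"
println "Pods beyond the ${branchCount} required: ${observedPods - branchCount}"
```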
The overprovisioned pods are eventually cleaned up, but they stress the controller and occupy valuable resources in the K8s cluster while sitting idle.
I feel that the NodeProvisioner shouldn't be involved here at all. However, for reference, these are system properties currently set on the controller:
-Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.8
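When comparing against default values, it can help to confirm which of these properties the controller JVM actually received. A minimal sketch for the Jenkins script console, assuming script console access is available: it uses only `System.getProperty` and the three property names listed above. A `null` value means the flag was not passed and Jenkins' built-in default applies. Note that this shows what the JVM was started with, not necessarily the value NodeProvisioner is using at runtime.

```groovy
// Property suffixes taken from the JVM flags listed above.
def names = ['initialDelay', 'MARGIN', 'MARGIN0']

// Map each full property name to the value the JVM was started with (null if unset).
def settings = names.collectEntries { name ->
    def key = "hudson.slaves.NodeProvisioner.${name}".toString()
    [(key): System.getProperty(key)]
}

settings.each { key, value ->
    println "${key} = ${value ?: '(not set, built-in default applies)'}"
}
```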
This is a critical issue for us, and we would like to get to an actual "one-shot" node allocation behavior without overprovisioning. Thanks!
What happens if you use default values? These are aggressive values and should not be tweaked lightly.