• Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: kubernetes-plugin
    • Labels: None
    • Jenkins ver. 2.176.3
      Kubernetes plugin: 1.19.3
    • kubernetes 1.27.1

      I create a lot of jobs, around 300 of them, using Job DSL.

      I have set the Kubernetes plugin Concurrency Limit to 20.

      The Kubernetes plugin then spins up a new node/pod for almost every job. All but a few end up in a pending state due to resource limits in my Kubernetes cluster; the pending pods are removed after a while and then recreated.

      Sometimes the Concurrency Limit is respected, and I see a lot of the following in my Jenkins log, but it should never get to 184 running or pending:

      INFO: Maximum number of concurrently running agent pods (20) reached for Kubernetes Cloud kubernetes, not provisioning: 184 running or pending in namespace jenkins with Kubernetes labels {jenkins=slave}
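
      As a cross-check, the pod count the plugin reports can be compared with what the cluster actually shows. The namespace and label below are taken from the log message above; this is only an illustrative command, so adjust it to your configuration:

      kubectl get pods -n jenkins -l jenkins=slave --no-headers | wc -l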
      
      

       

          [JENKINS-59959] The Concurrency Limit is not always respected.

          Sergio Merino added a comment -

          I have a similar problem. The plugin executes as many pods as possible whenever it is under the limit, and stops once it is over the limit, but it does not start just the right number of pods to avoid exceeding the limit. Example:

          If the status is this:

            Pods Limit: 30

            Running: 30

            Queued: 50

          As soon as one running pod finishes, Jenkins executes the 50 queued ones at the same time, so you end up with:

            Pods Limit: 30

            Running: 30 - 1 (finished) + 50 = 79

            Queued: 0

          So the limit is not respected at all.

          I was expecting the plugin to manage its internal queue, even running the pods in the correct order.
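
          A minimal sketch of the arithmetic above, assuming a simplified provisioning loop (names and numbers are illustrative only, not the plugin's actual code): the naive strategy launches everything queued as soon as the count drops below the cap, while a capped strategy only launches the remaining headroom.

            // Illustrative sketch, not kubernetes-plugin source code.
            public class PodCapSketch {
                public static void main(String[] args) {
                    int podLimit = 30;
                    int running = 29;   // one of the 30 running pods just finished
                    int queued = 50;

                    // Naive: provision every queued item because we are "under the limit".
                    int naive = running + queued;                                  // 29 + 50 = 79, cap ignored

                    // Capped: provision only the remaining headroom.
                    int capped = running + Math.min(queued, podLimit - running);   // 29 + 1 = 30, cap respected

                    System.out.println("naive=" + naive + ", capped=" + capped);
                }
            }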

           


          Valentin Delaye added a comment -

          Hi!

          Same issue here

          Plugin : 1.24.1
          Core : 2.204.3

          2020-02-29 22:57:35.179+0000 [id=30] INFO o.c.j.p.k.KubernetesCloud#addProvisionedSlave: Maximum number of concurrently running agent pods (10) reached for Kubernetes Cloud kubernetes, not provisioning: 23 running or pending in namespace jenkins with Kubernetes labels {jenkins=slave}
          2020-02-29 22:57:35.179+0000 [id=30] INFO o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 1
          

          Thanks


          Allan BURDAJEWICZ added a comment -

          I have noticed that the limit is better respected when not using the NoDelayProvisioningStrategy (i.e. setting the system property -Dio.jenkins.plugins.kubernetes.disableNoDelayProvisioning=true).
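
          For anyone who wants to try this, the property needs to be set on the Jenkins controller JVM at startup. A minimal sketch, assuming an installation where extra JVM options are passed through JAVA_OPTS (adjust to however your controller is launched):

            JAVA_OPTS="$JAVA_OPTS -Dio.jenkins.plugins.kubernetes.disableNoDelayProvisioning=true"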


          Rene Schönlein added a comment - edited

          We are also frequently running into the same issue.

          I agree that in theory the container cap is checked, but as soon as one spot opens up, all jobs waiting in the queue trigger a "node provision" action. Depending on the actual queue size, this leads to massive over-provisioning of agents/pods. As a side effect of the resulting system load, most of the provisioned pods end up in some kind of network timeout and are never removed from Jenkins or the cloud environment. Afterwards we are forced to clean up the mess by hand.

          Changing the provisioning strategy to NoDelayProvisioningStrategy has no meaningful effect for us. For now, we are forced to deactivate all automatic triggering of our ~1.5k jobs.

          Plugin : 1.26.2
          Core : 2.235.1
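
          For the manual cleanup mentioned above, one option is to delete the stuck agent pods in bulk by label. A hedged example, using the namespace and label that appear in the log lines earlier in this issue (adjust to your configuration); abandoned agent entries may still need to be removed from Jenkins itself:

            kubectl delete pods -n jenkins -l jenkins=slave --field-selector=status.phase=Pending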


          David Schott added a comment -

          I took a stab at fixing this and filed https://github.com/jenkinsci/kubernetes-plugin/pull/824 for review.

          Special thanks to vlatombe for the productive working session today.


          Carl Dorbeus added a comment -

          Awesome folks, big thanks to everybody, really appreciate the help. 


          David Schott added a comment -

          Reopened because https://github.com/jenkinsci/kubernetes-plugin/pull/824 caused a major side effect in 1.27.1: https://issues.jenkins-ci.org/browse/JENKINS-63705, and efforts are being made to revert the changes.

           


          David Schott added a comment -

          Reassigned to vlatombe because https://github.com/jenkinsci/kubernetes-plugin/pull/835 does a much more thorough job of fixing this.


          Denis Zakharov added a comment -

          Any news?


          Lars Berntzon added a comment -

          Tested in kubernetes:1.30.0 and it still does not work.


            Assignee: vlatombe Vincent Latombe
            Reporter: dorbeus Carl Dorbeus
            Votes: 10
            Watchers: 17