• Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • kubernetes-plugin
    • None

      The Kubernetes plugin (https://plugins.jenkins.io/kubernetes/) supports defining multiple clouds (Kubernetes clusters) simultaneously.

      However, there doesn't appear to be any logic or mechanism for distributing load among the clouds.

      Assuming that all defined clouds do not use the "Restrict pipeline support to authorized folder" option, all clouds are basically equivalent (when no specific "cloud" is targeted by name in the pipeline). The only way to discriminate between the clouds is by setting them in a specific order in the Jenkins UI.

      In practice I am seeing all loads directed to the first/top cloud as defined in ./manage/configureClouds/

      This seems logical at first, but it means that the second and subsequent clouds are never touched. Some mechanism to distribute agents among the clouds (actively balanced or random, for instance) would be nice.

      But what is far more pressing (and the reason I'm submitting this ticket) is that the Kubernetes plugin will not switch to the next defined cloud when the first/primary cloud has reached its resource quota.

      It will saturate the entire first cloud and then start spamming:

      ERROR: Failed to launch project-task-os-test-1-wd2ww-ks1tv-p3nfg
      io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://my.corp.internal:6443/api/v1/namespaces/my-project/pods. Message: pods "project-task-os-test-1-r01cm-x525n-fppjr" is forbidden: exceeded quota: 

      Some help text reads:

      The Kubernetes cloud to use to schedule the pod.
      If unset, the first available Kubernetes cloud will be used.

      The key word here being "available". The Kubernetes plugin seems to think that since the cloud API is available, the cloud itself is available for work, which might not be the case when resource limits are hit.

      I suppose one way around this would be to set a "Concurrency Limit" per cloud, but this is a rather ham-fisted approach, since it doesn't take into account the different resource profiles of different types of Jenkins agents. I have some agents with 1 CPU + 2 GB of resources, and some with 4 CPU + 12 GB. Setting an arbitrary concurrency limit could leave valuable resources unused and introduce (unnecessary) queuing, depending on how conservatively the limit is set. And it simply seems inefficient when resource quotas exist and feedback on their status is readily available through the API.

      Long story short: 2 requests:

      1. major: When the resource quota of cloud 1 is reached, try cloud 2, 3, 4, etc.
      2. minor: Allow some intelligent distribution among equivalent (non-restricted) clouds, such as "random", "balanced", etc.
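
As a pipeline-side stopgap for request 2 only, a scripted pipeline can choose the cloud itself instead of relying on the order on the configureClouds page. The following is a rough sketch, not a plugin feature: it assumes two equivalent clouds with the hypothetical names `cloud-a` and `cloud-b`, and rotates between them by build number. The container template and shell command are illustrative. It does not inspect quotas, so it does nothing for request 1.

```groovy
// Hypothetical cloud names; replace them with the names configured in Jenkins.
def clouds = ['cloud-a', 'cloud-b']

// Crude per-job round-robin: consecutive builds alternate between the clouds.
def chosenCloud = clouds[currentBuild.number % clouds.size()]

podTemplate(
    cloud: chosenCloud,
    containers: [
        containerTemplate(name: 'maven', image: 'maven:3.8.1-jdk-8', command: 'sleep', args: '99d')
    ]
) {
    node(POD_LABEL) {
        container('maven') {
            sh 'mvn -version'   // illustrative build step
        }
    }
}
```

Because the choice is made per build, the spread is only approximate, and every job has to carry this logic, which is part of why a plugin-level mechanism is being requested.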

          [JENKINS-71632] Multi-cloud load distribution

          Mark Waite added a comment -

          The ci.jenkins.io server is using multiple clouds successfully with a kubernetes cluster on each cloud. We have a cluster that is dedicated to evaluating the Jenkins plugin bill of materials. When it runs out of capacity, the jobs evaluating the Jenkins plugin bill of materials block until capacity is available. You might connect with the Jenkins infra team to get more details on the techniques that they are using to manage multiple clouds providing capacity to a single Jenkins controller.


          Pay Bas added a comment -

          markewaite thanks for the suggestion and the quick reply.

          The way you describe the setup, it does sound like their different clouds are used for specific different tasks/workloads, whereas I'm trying to distribute a "generic" workload amongst different clouds.

          But I'll be sure to contact them and see what's what.


          Mark R added a comment -

          Here's a use case for load balancing: distributing across multiple kubernetes clusters. In our case we have a different kubernetes (actually openshift) cluster per datacenter. I want to be able to roughly spread the load of build agents across datacenters.


          Mohan Krishan Joshi added a comment -

          Hi Pay Bas,
          Any update from the infra team, or any findings on how to achieve load balancing across multiple clusters using a single Jenkins controller?

          Mohan Krishan Joshi added a comment -

          markewaite, paybas, could you please help with the above query? Also, is it possible to select a specific cloud (assuming we have multiple clouds, such as AKS and GKE, configured using the Jenkins Kubernetes plugin) in the agent definition in a Jenkins declarative pipeline?

          pipeline {
            agent {
              kubernetes {
                yaml '''
                  apiVersion: v1
                  kind: Pod
                  metadata:
                    name: build-pod
                  spec:
                    containers:
                    - name: build-docker

          Mark Waite added a comment - - edited

          mkjkec2005 no, I can't help with the above query.

          is it possible to select a specific cloud ( assuming we have multiple clouds - AKS, GKE configured using the Jenkins Kubernetes plugin) in the agent definition in a Jenkins declarative pipeline?

          If you label your agent with the cloud provider, then the job could specify that it must run on the label matching a specific cloud provider.

          The ci.jenkins.io implementation intentionally avoids declaring any more agent details in the Pipeline definition than are absolutely necessary. It does not provide all the details of the Kubernetes agent in the Pipeline definition, but instead uses a few simple labels and relies on the various plugins to allocate an agent with those labels. To request a Linux agent with Java 17 and Apache Maven, I use the label maven-17. To request a Windows agent with Java 21 and Apache Maven, I use the label maven-21-windows. That allows the ci.jenkins.io administrators to use multiple clouds to provide the maven-17 label.

          The "check agent availability" job on ci.jenkins.io may provide some hints of how it is done. The source code (https://github.com/jenkins-infra/acceptance-tests/blob/check-agent-availability/Jenkinsfile) of that job definition shows the use of labels to choose agents.
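
The label-based approach described in this comment might look roughly like the sketch below in a declarative pipeline. The label `maven-17` is taken from the comment. The stage name and shell command are illustrative assumptions, not a copy of the ci.jenkins.io job.

```groovy
pipeline {
    // Request any agent that carries the label. Which cloud ends up
    // providing it is left to the Jenkins configuration.
    agent { label 'maven-17' }
    stages {
        stage('Show toolchain') {       // illustrative stage name
            steps {
                sh 'mvn -version'       // illustrative command
            }
        }
    }
}
```

The Jenkinsfile then stays free of pod details, and administrators can attach the same label to pod templates in more than one cloud.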


          Mohan Krishan Joshi added a comment -

          Thank you. So there are no plans to implement the below requirement (#1) in the Kubernetes-plugin.

          Long story short: 2 requests:

          1. major: When the resource quota of cloud 1 is reached, try cloud 2, 3, 4, etc.
          2. minor: Allow some intelligent distribution among equivalent (non-restricted) clouds, such as "random", "balanced", etc.

          Also, does the below pipeline snippet to select a configured cloud not work?

          pipeline {
            agent {
              kubernetes {
                //cloud 'kubernetes'  ---> Can't we specify the other cloud name here?
                containerTemplate {
                  name 'maven'
                  image 'maven:3.8.1-jdk-8'
                  command 'sleep'
                  args '99d'
                }
              }
            }
            stages {
              …
            }
          }
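
For reference on the snippet in this comment: the declarative `kubernetes` agent does accept a `cloud` option, and the help text quoted in the issue description ("The Kubernetes cloud to use to schedule the pod") appears to describe it. Below is a minimal sketch, assuming a second cloud has been configured under the hypothetical name `my-second-cloud`. The pod spec reuses the Maven container from the snippet, and the stage is illustrative.

```groovy
pipeline {
    agent {
        kubernetes {
            // Hypothetical name; it must match a cloud defined in the Jenkins configuration.
            cloud 'my-second-cloud'
            defaultContainer 'maven'
            yaml '''
                apiVersion: v1
                kind: Pod
                spec:
                  containers:
                  - name: maven
                    image: maven:3.8.1-jdk-8
                    command: ['sleep']
                    args: ['99d']
            '''
        }
    }
    stages {
        stage('Show toolchain') {       // illustrative stage name
            steps {
                sh 'mvn -version'       // illustrative command
            }
        }
    }
}
```

Naming a cloud pins the job to that cloud, so this does not give failover when the cloud's quota is exhausted.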

          Mark Waite added a comment -

          So there are no plans to implement the below requirement (#1) in the Kubernetes-plugin.

          I'm not a maintainer of the Kubernetes plugin. I don't know the plans of the maintainers of the Kubernetes plugin. Jenkins is an open source project. If you want a specific feature to implement a requirement, you're encouraged to submit a pull request that implements the feature.

          does the below pipeline snippet to select a configured cloud not work?

          I don't know.


            Assignee: Unassigned
            Reporter: Pay Bas (paybas)