[JENKINS-14923] Configurable build slave balancing modes

Type: Improvement
Resolution: Won't Fix
Priority: Minor
Component/s: remoting
Labels:
- balancing
- slave


From what I can tell with two build slaves, it seems like Jenkins will consume all slots on slave 1 before sending jobs to slave 2. It would be nice if this behavior were configurable. For example:

Round-robin: alternate between free slots from 1 and 2
Resource-aware: use slot from slave with most available bandwidth (or fastest response time?)
Standard: current behavior
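
The difference between the "Standard" and "Round-robin" modes above can be illustrated with a small standalone sketch. It does not use any Jenkins API: the class `BalancingModesSketch`, its `fillFirst` and `roundRobin` methods, and the modelling of agents as an array of free-slot counts are all assumptions made for illustration, and "Standard" is modelled only as the fill-first behavior described in this report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Jenkins code.
class BalancingModesSketch {

    // Fill-first ("Standard" as described in this report):
    // always take the lowest-numbered agent that still has a free slot.
    static List<Integer> fillFirst(int[] freeSlots, int jobs) {
        int[] free = freeSlots.clone();
        List<Integer> assignment = new ArrayList<>();
        for (int j = 0; j < jobs; j++) {
            int chosen = -1;
            for (int a = 0; a < free.length; a++) {
                if (free[a] > 0) { chosen = a; break; }
            }
            if (chosen < 0) break; // no free slot anywhere; remaining jobs stay queued
            free[chosen]--;
            assignment.add(chosen);
        }
        return assignment;
    }

    // Round-robin: rotate through the agents, skipping any with no free slot.
    static List<Integer> roundRobin(int[] freeSlots, int jobs) {
        int[] free = freeSlots.clone();
        List<Integer> assignment = new ArrayList<>();
        int next = 0; // agent to try first for the next job
        for (int j = 0; j < jobs; j++) {
            int chosen = -1;
            for (int step = 0; step < free.length; step++) {
                int a = (next + step) % free.length;
                if (free[a] > 0) { chosen = a; break; }
            }
            if (chosen < 0) break; // no free slot anywhere
            free[chosen]--;
            assignment.add(chosen);
            next = (chosen + 1) % free.length;
        }
        return assignment;
    }

    public static void main(String[] args) {
        int[] slots = {2, 2}; // two agents with two free slots each
        System.out.println(fillFirst(slots, 3));  // [0, 0, 1]
        System.out.println(roundRobin(slots, 3)); // [0, 1, 0]
    }
}
```

With two agents of two slots each and three jobs, fill-first puts the first two jobs on agent 0, while round-robin alternates between the agents.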

Damien Nozay added a comment - 2012-10-16 21:21 - edited 2012-10-16 21:24

I've run into this issue as well: we have 10 nodes with 10 executors each, and the same node gets filled up before builds are sent anywhere else. In our case the contents of the job can kill the agent (OOM), which makes greedily filling the same node very bothersome.

my coworker looked up the code:
https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/LoadBalancer.java

LoadBalancer.java

        @Override
        public Mapping map(Task task, MappingWorksheet ws) {
            // build consistent hash for each work chunk
            List<ConsistentHash<ExecutorChunk>> hashes = new ArrayList<ConsistentHash<ExecutorChunk>>(ws.works.size());
            for (int i=0; i<ws.works.size(); i++) {
                ConsistentHash<ExecutorChunk> hash = new ConsistentHash<ExecutorChunk>(new Hash<ExecutorChunk>() {
                    public String hash(ExecutorChunk node) {
                        return node.getName();
                    }
                });
                for (ExecutorChunk ec : ws.works(i).applicableExecutorChunks())
                    hash.add(ec,ec.size()*100);

                hashes.add(hash);
            }

            // do a greedy assignment
            Mapping m = ws.new Mapping();
            assert m.size()==ws.works.size();   // just so that you the reader of the source code don't get confused with the for loop index

            if (assignGreedily(m,task,hashes,0)) {
                assert m.isCompletelyValid();
                return m;
            } else
                return null;
        }

        private boolean assignGreedily(Mapping m, Task task, List<ConsistentHash<ExecutorChunk>> hashes, int i) {
            if (i==hashes.size())   return true;    // fully assigned

            String key = task.getFullDisplayName() + (i>0 ? String.valueOf(i) : "");

            for (ExecutorChunk ec : hashes.get(i).list(key)) {
                // let's attempt this assignment
                m.assign(i,ec);

                if (m.isPartiallyValid() && assignGreedily(m,task,hashes,i+1))
                    return true;    // successful greedily allocation

                // otherwise 'ec' wasn't a good fit for us. try next.
            }

            // every attempt failed
            m.assign(i,null);
            return false;
        }
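
Reading the quoted code, the lookup key is derived from `task.getFullDisplayName()`, so every build of a given task walks the candidate nodes in the same order and takes the first one that is valid. That would explain builds piling onto one node while it still has free executors. The following is a minimal sketch of that stickiness, assuming a much-simplified hash ring: `StickyRingSketch` is a hypothetical class, it uses `String.hashCode()` for placement, and it is not the real `hudson.util.ConsistentHash`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, simplified hash ring; NOT hudson.util.ConsistentHash.
class StickyRingSketch {
    // Ring position -> node name, kept sorted by position.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    // Place `replicas` points for the node on the ring
    // (the quoted code uses ec.size()*100 replicas per executor chunk).
    void add(String node, int replicas) {
        for (int r = 0; r < replicas; r++) {
            ring.put((node + "#" + r).hashCode(), node);
        }
    }

    // Walk the ring from the key's position, wrapping around,
    // and list each node once in the order it is first encountered.
    List<String> list(String key) {
        int start = key.hashCode();
        Set<String> ordered = new LinkedHashSet<>();
        ordered.addAll(ring.tailMap(start, true).values());
        ordered.addAll(ring.headMap(start, false).values());
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        StickyRingSketch ring = new StickyRingSketch();
        ring.add("agent-1", 100);
        ring.add("agent-2", 100);
        ring.add("agent-3", 100);
        // The same key always yields the same preference order.
        System.out.println(ring.list("my-job"));
        System.out.println(ring.list("my-job"));
    }
}
```

Because the order depends only on the key and the set of nodes, a greedy walk over it picks the same first node for every build of the same job until that node has no valid executor left.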


Damien Nozay added a comment - 2012-10-16 21:26

Is there an extension point we could leverage to override this and change assignGreedily to an assignRoundRobin?

I'm not much of a Java / Jenkins plugin expert, though.


Jesse Glick added a comment - 2012-10-17 16:09

Currently I think you need to use Queue.setLoadBalancer, though it would be nicer if this were an ExtensionPoint.

The CloudBees Even Scheduler Plugin (part of Jenkins Enterprise) offers a variant balancer which sends jobs to idle slaves when possible: http://jenkins-enterprise.cloudbees.com/docs/user-guide-bundle/even-scheduler.html


SCM/JIRA link daemon added a comment - 2012-10-17 16:11

Code changed in jenkins
User: Jesse Glick
Path:
core/src/main/java/jenkins/model/Jenkins.java
http://jenkins-ci.org/commit/jenkins/8bb7e572f756f080186d73defb2ff757b3379830
Log:
JENKINS-14923 Removed CONSISTENT_HASH flag to simplify code.


dogfood added a comment - 2012-10-17 16:59

Integrated in jenkins_main_trunk #2011
JENKINS-14923 Removed CONSISTENT_HASH flag to simplify code. (Revision 8bb7e572f756f080186d73defb2ff757b3379830)

Result = SUCCESS
Jesse Glick : 8bb7e572f756f080186d73defb2ff757b3379830
Files :

core/src/main/java/jenkins/model/Jenkins.java


Damien Nozay added a comment - 2012-10-17 17:14

Thanks for your help, Jesse.
I'll try to see if it works as is on the latest and greatest or LTS.

I don't imagine a lot of proprietary IP goes into a round-robin algorithm or an 'even scheduler plugin'. Maybe we can contact the CloudBees team and see if this plugin can go into the community version. Otherwise I will see if someone more knowledgeable can help cook up a simple plugin to contribute.


Dhawal Bhanushali added a comment - 2012-11-23 15:30 - edited 2012-11-23 15:32

Damien,

dogfood's changes do not fix the issue. The behaviour is the same. He just made the behaviour explicit and easier to detect.


Brendan Nolan added a comment - 2013-06-11 14:48

I was facing the same issue and have developed a plugin to override the default Load Balancing behavior to one that uses the least loaded node - https://wiki.jenkins-ci.org/display/JENKINS/Least+Load+Plugin
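
As a rough illustration of what "least loaded" selection can mean, here is a minimal standalone sketch. It is not the Least Load Plugin's implementation and uses no Jenkins API: the `LeastLoadSketch` class, its `Node` type, and the choice of busy/total executors as the load measure are all assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not the Least Load Plugin's code.
class LeastLoadSketch {

    static final class Node {
        final String name;
        final int executors; // total executors on the node
        final int busy;      // executors currently running a build

        Node(String name, int executors, int busy) {
            this.name = name;
            this.executors = executors;
            this.busy = busy;
        }

        boolean hasIdleExecutor() { return busy < executors; }

        // Assumed load measure: fraction of executors in use.
        double load() { return executors == 0 ? 1.0 : (double) busy / executors; }
    }

    // Pick the node with the lowest load among those that can still take a build.
    static Optional<Node> pick(List<Node> nodes) {
        return nodes.stream()
                .filter(Node::hasIdleExecutor)
                .min(Comparator.comparingDouble(Node::load));
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("agent-1", 10, 9),
                new Node("agent-2", 10, 2),
                new Node("agent-3", 10, 10));
        System.out.println(pick(nodes).map(n -> n.name).orElse("none")); // agent-2
    }
}
```

Unlike the fill-first behavior reported above, this sends the next build to the node with the most spare capacity, which spreads memory-hungry jobs across agents.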


Jesse Glick added a comment - 2013-06-11 15:53

Closing since this is not intended to be supplied by core.


Assignee: Unassigned
Reporter: Devon Crouse
Votes: 2
Watchers: 9
Created: 2012-08-24 17:35
Updated: 2013-06-11 15:53
Resolved: 2013-06-11 15:53
