
When there are more jobs assigned to a cloud slave (once provisioned) than the computer can handle, then the cloud should be able to provision a new slave(s) to handle the remaining jobs needing execution

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Trivial
    • Component: vsphere-cloud-plugin

      SEE COMMENTS (Looking for a CauseOfBlockage that will cause Jenkins to automatically ask the cloud for a new machine or another slave to accept a job because the currently assigned slave will not be available for that job.)

      I am enhancing the vsphere-cloud-plugin to work with slave templates so that the vSphereCloud can provision new slaves.

      In my first release, a newly provisioned slave will run jobs until its run limit is reached (currently 1 job execution), and then it will be unprovisioned.

      With only 1 job this works perfectly. When I introduce a 2nd job, a slave is provisioned, job 1 of 2 is executed, the slave is unprovisioned, another slave is provisioned, job 2 of 2 is executed, and that slave is unprovisioned.

      What I expect to see is that when job 2 comes in, 1) the computer (or slave) tells Jenkins it cannot handle the job (I tried canTake and returned all types of CauseOfBlockage) and 2) the cloud tells Jenkins it can provision, so that a second slave is provisioned for job 2. I would have to implement a method to control this, but I either can't find it or it doesn't exist.

      I am wondering what I can do here. I would like the ability to notify Jenkins that a slave/computer can handle a newly launched job request (even among the jobs already waiting to be run).

          [JENKINS-30203] When there are more jobs assigned to a cloud slave (once provisioned) than the computer can handle, then the cloud should be able to provision a new slave(s) to handle the remaining jobs needing execution

          R. Tyler Croy added a comment -

          This isn't an INFRA issue, moving to JENKINS


          Kevin Smith added a comment -

          So it seems that, given enough time, if the slave returns a CauseOfBlockage from canTake once it has accepted the x jobs it is designed to handle, the provisioning task will execute.

          Working Scenario:
          Job 1 set to run 10 minutes is requested to be built.
          Jenkins provisions Slave 1
          Slave 1 comes online
          Slave 1's slave.jar (channel) connects
          Slave 1 accepts and runs Job 1
          Job 2 set to run 1 minute is requested to be built.
          Jenkins sets Job 2 to be blocked on slave 1.
          For X iterations, Jenkins asks Slave 1 whether it canTake Job 2, and Slave 1 returns a blockage message (I now respond with CauseOfBlockage.BecauseNodeIsOffline(this) for every BuildableItem the slave isn't looking to take).
          JOB 1 IS STILL RUNNING
          Jenkins provisions Slave 2
          Slave 2 comes online
          Slave 2's slave.jar (channel) connects
          Slave 2 accepts and runs Job 2
          Job 2 finishes
          Slave 2 unprovisions
          Job 1 finishes
          Slave 1 unprovisions.

          So what I am looking for is a way to tell Jenkins that the slave isn't going to accept that job in the near future (or at all), so that provisioning can happen sooner.
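
          The behaviour described in this comment can be modelled roughly as follows. This is only a sketch in plain Java with no Jenkins dependency: Blockage, LimitedRunSlave and ProvisioningCheck are hypothetical names invented for illustration, not classes from Jenkins or the vsphere-cloud-plugin. The one convention borrowed from Jenkins is that a canTake-style check returns null when the job may run and a reason object when it may not.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a blockage reason; NOT the Jenkins CauseOfBlockage class. */
final class Blockage {
    private final String reason;
    Blockage(String reason) { this.reason = reason; }
    String getReason() { return reason; }
}

/** Hypothetical model of a cloud slave that accepts only a fixed number of jobs. */
final class LimitedRunSlave {
    private final String name;
    private final int runLimit; // e.g. 1 job execution, as in the issue description
    private int accepted;       // jobs accepted so far

    LimitedRunSlave(String name, int runLimit) {
        if (runLimit < 1) {
            throw new IllegalArgumentException("runLimit must be >= 1");
        }
        this.name = name;
        this.runLimit = runLimit;
    }

    /** Returns null if the job may run here, otherwise the reason it is blocked. */
    Blockage canTake(String jobName) {
        if (accepted >= runLimit) {
            return new Blockage(name + " has reached its run limit of " + runLimit);
        }
        return null;
    }

    /** Records that the job was accepted; fails if the slave is already at its limit. */
    void accept(String jobName) {
        Blockage blockage = canTake(jobName);
        if (blockage != null) {
            throw new IllegalStateException(blockage.getReason());
        }
        accepted++;
    }
}

/** Hypothetical helper: a new slave is needed when every existing slave blocks the job. */
final class ProvisioningCheck {
    static boolean needsNewSlave(List<LimitedRunSlave> slaves, String jobName) {
        for (LimitedRunSlave slave : slaves) {
            if (slave.canTake(jobName) == null) {
                return false; // at least one slave can still take the job
            }
        }
        return true; // also true when there are no slaves at all
    }
}

final class RunLimitDemo {
    public static void main(String[] args) {
        LimitedRunSlave slave1 = new LimitedRunSlave("slave-1", 1);
        slave1.accept("job-1");
        // slave-1 is now full, so job-2 should trigger provisioning of another slave.
        System.out.println(ProvisioningCheck.needsNewSlave(Arrays.asList(slave1), "job-2"));
    }
}
```

          In this model the decision to provision is immediate once every slave reports a blockage. In the scenario above, Jenkins only provisioned Slave 2 after several rounds of asking, which is the delay the comment is asking how to avoid.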


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Peter Darton
          Path:
          src/main/java/org/jenkinsci/plugins/vSphereCloud.java
          src/main/java/org/jenkinsci/plugins/vSphereCloudProvisionedSlave.java
          src/main/java/org/jenkinsci/plugins/vSphereCloudSlaveTemplate.java
          src/main/java/org/jenkinsci/plugins/vsphere/tools/CloudProvisioningAlgorithm.java
          src/main/java/org/jenkinsci/plugins/vsphere/tools/CloudProvisioningRecord.java
          src/main/java/org/jenkinsci/plugins/vsphere/tools/CloudProvisioningState.java
          src/test/java/org/jenkinsci/plugins/vsphere/tools/CloudProvisioningAlgorithmTest.java
          src/test/java/org/jenkinsci/plugins/vsphere/tools/CloudProvisioningStateTest.java
          http://jenkins-ci.org/commit/vsphere-cloud-plugin/4310c6cfa453fe59882406c821f0bb0c61173d26
          Log:
          Bugfix: JENKINS-36878: vSphere now respects per-slave instance cap.
          Bugfix: JENKINS-32112: NPE bug in vSphere.getTemplate().
          Enhancement: vSphere.java now distributes load over all matching
          templates. This satisfies JENKINS-30203 if Jenkins is configured with a
          template.
          Correction: Jenkins UI no longer offers facility to manually create a
          Cloud-provisioned slave (the cloud provisions those itself). Normal
          vSphere slaves are still manually provisionable.
          Cleaned up logging in vSphere.java.
          Typo in vSphereCloudSlaveTemplate: getNumberOfExceutors ->
          getNumberOfExecutors.
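
          To illustrate what "distributes load over all matching templates" while respecting an instance cap could look like, here is a minimal sketch. It is not the plugin's CloudProvisioningAlgorithm: TemplateRecord and TemplatePicker are hypothetical names, and the rule used (pick the template with the fewest provisioned slaves among those still under their cap, with ties going to the first one listed) is an assumption chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical per-template bookkeeping; not a class from the vsphere-cloud-plugin. */
final class TemplateRecord {
    final String name;
    final int instanceCap; // maximum number of slaves allowed from this template
    int provisioned;       // slaves currently provisioned from this template

    TemplateRecord(String name, int instanceCap) {
        this.name = name;
        this.instanceCap = instanceCap;
    }

    int spareCapacity() {
        return instanceCap - provisioned;
    }
}

/** Hypothetical selection rule: least-loaded template that is still under its cap. */
final class TemplatePicker {
    static Optional<TemplateRecord> pick(List<TemplateRecord> matchingTemplates) {
        TemplateRecord best = null;
        for (TemplateRecord candidate : matchingTemplates) {
            if (candidate.spareCapacity() <= 0) {
                continue; // cap reached, never provision from this template
            }
            if (best == null || candidate.provisioned < best.provisioned) {
                best = candidate; // strictly fewer slaves wins; ties keep the earlier template
            }
        }
        return Optional.ofNullable(best);
    }
}

final class TemplatePickerDemo {
    public static void main(String[] args) {
        List<TemplateRecord> templates = Arrays.asList(
                new TemplateRecord("template-a", 2),
                new TemplateRecord("template-b", 1));
        // Keep provisioning until every template has reached its cap.
        Optional<TemplateRecord> choice = TemplatePicker.pick(templates);
        while (choice.isPresent()) {
            choice.get().provisioned++;
            System.out.println("provisioned from " + choice.get().name);
            choice = TemplatePicker.pick(templates);
        }
    }
}
```

          Returning an empty Optional when every template is at its cap leaves the caller to decide whether the job should simply stay queued.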

          pjdarton added a comment -

          ksmith1874 I think that the plugin now meets this requirement.
          The 2.13 plugin had quite a bit of code that went some way towards doing this, and I've done some enhancements such that "it works for me" now. I'm reasonably pleased with how it's gone.
          I'd suggest that you grab a build of the bleeding-edge code, or wait for the 2.14 release, and try that - you might like what you see...

          ...and if it does everything this issue was asking for, I'd ask that you close it


          Kevin Smith added a comment - edited

          pjdarton

          Thanks for the update. In fact those changes were done by me as there was no one else looking at them. I've been using them in-house for about 1 year now with excellent success.

          Regards,
          Kevin J. Smith


          Kevin Smith added a comment -

          This defect was resolved once I correctly implemented the Cloud API.


            Assignee: pjdarton
            Reporter: Kevin Smith (ksmith1874)
            Votes: 0
            Watchers: 4
