
jClouds deletes nodes that are running flyweight builds

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: jclouds-plugin
    • Environment: Jenkins 1.599, jClouds-plugin 2.9-SNAPSHOT (private-3bd37c37-jenkins)

      When a matrix build begins, Jenkins creates a "flyweight" build that checks out the code and decides which (if any) of the project axes need to be rebuilt. The "flyweight" build starts the subordinate jobs and waits until they are complete before it finishes. However, Jenkins doesn't allocate an executor to the "flyweight" builds or consider them when reporting whether a node is "idle". Because JCloudsRetentionStrategy.check() only uses Computer.isIdle() to determine if the node is busy, it kills off nodes running "flyweight" builds.
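      The idle check described above can be modeled in a few lines of plain Java. This is only a sketch under stated assumptions: SimNode, its fields, and shouldTerminate are hypothetical stand-ins invented for illustration, not the Jenkins Computer class or the plugin's JCloudsRetentionStrategy, and the minute values are made up.

```java
public class FlyweightIdleDemo {

    /** Hypothetical stand-in for a cloud node; this is not the Jenkins Computer class. */
    public static class SimNode {
        public final String name;
        public int busyExecutors = 0;    // regular builds, each holding an executor
        public int flyweightBuilds = 0;  // matrix parent builds, holding no executor
        public long idleSinceMinute = 0; // minute at which the last executor became free

        public SimNode(String name) {
            this.name = name;
        }

        /** Models the reported behavior: only executors are consulted. */
        public boolean isIdle() {
            return busyExecutors == 0;
        }
    }

    /** Retention check that trusts isIdle() alone, as the report describes. */
    public static boolean shouldTerminate(SimNode node, long nowMinute, long retentionMinutes) {
        return node.isIdle() && nowMinute - node.idleSinceMinute >= retentionMinutes;
    }

    public static void main(String[] args) {
        long retentionMinutes = 30;
        SimNode node = new SimNode("node-1");

        // Minute 0: a matrix parent (flyweight) and one child build start on node-1.
        node.flyweightBuilds = 1;
        node.busyExecutors = 1;

        // Minute 10: the child finishes; the flyweight keeps waiting for other children.
        node.busyExecutors = 0;
        node.idleSinceMinute = 10;

        // Minute 45: the flyweight is still running, yet the node looks idle.
        System.out.println("flyweight builds on node-1: " + node.flyweightBuilds);
        System.out.println("terminate node-1? " + shouldTerminate(node, 45, retentionMinutes));
    }
}
```

      In this model the check answers "terminate" at minute 45 even though one flyweight build is still attached to the node, which is the failure mode reported here.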

      Our cloud is set to create instances with one executor each. In the scenario where a matrix build starts two subordinate builds, the "flyweight" will run alongside one build while the other goes to a second node. If the job on the second node takes more than the retention time to complete, the first node will be considered "idle" and jClouds-plugin will terminate it. That stops the "flyweight" build with an error, which in turn causes the entire build to fail.

      "Flyweight" builds don't occupy executors, so the only way I've found to detect them is by enumerating every job in the system, testing to see if it's building, then checking the name of the node it's building on. (If there's a better way, I would very much like to know about it!) Some Groovy code:
      import hudson.model.AbstractProject
      import jenkins.model.Jenkins

      // Count the builds running on each node, including flyweight builds that hold no executor.
      for (node in Jenkins.getInstance().getComputers()) {
          def num_jobs = 0
          // Restrict to AbstractProject so every item has isBuilding() and getLastBuild().
          for (job in Jenkins.getInstance().getAllItems(AbstractProject.class)) {
              // getBuiltOnStr() is the name of the node the build is running on.
              if (job.isBuilding() && job.getLastBuild()?.getBuiltOnStr() == node.getName()) {
                  num_jobs++
              }
          }
          if (num_jobs > 0) {
              println("Node " + node.getName() + " is running " + num_jobs + " job(s) - don't kill it!")
          }
      }

      (I know this is really a bug in upstream Jenkins, but the developers there seem determined to treat "flyweight" builds as undetectable non-entities, so I think changing jClouds-plugin's behavior will be a much easier fix.)
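      One way a retention check could use the enumeration idea is to refuse termination whenever any in-progress build names the node. The following is a self-contained sketch of that guard, not the change that was later committed to the plugin: RunningBuild, countByNode, and mayTerminate are hypothetical names, and the list of running builds is assumed to come from an enumeration like the Groovy loop in this report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RunningBuildGuard {

    /** Hypothetical record of a build in progress; builtOn is the node name. */
    public static class RunningBuild {
        public final String jobName;
        public final String builtOn;

        public RunningBuild(String jobName, String builtOn) {
            this.jobName = jobName;
            this.builtOn = builtOn;
        }
    }

    /** Counts in-progress builds per node name, whether or not they hold an executor. */
    public static Map<String, Integer> countByNode(List<RunningBuild> builds) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (RunningBuild build : builds) {
            counts.merge(build.builtOn, 1, Integer::sum);
        }
        return counts;
    }

    /** A node may be retired only if its executors are idle and no running build names it. */
    public static boolean mayTerminate(String nodeName, boolean executorsIdle,
                                       Map<String, Integer> counts) {
        return executorsIdle && counts.getOrDefault(nodeName, 0) == 0;
    }

    public static void main(String[] args) {
        List<RunningBuild> builds = new ArrayList<>();
        builds.add(new RunningBuild("matrix-parent", "node-1"));        // flyweight build
        builds.add(new RunningBuild("matrix-parent/axis=b", "node-2")); // regular child build

        Map<String, Integer> counts = countByNode(builds);
        // node-1 has idle executors but still hosts the flyweight build.
        System.out.println("node-1 may terminate: " + mayTerminate("node-1", true, counts));
        // node-3 has idle executors and no builds at all.
        System.out.println("node-3 may terminate: " + mayTerminate("node-3", true, counts));
    }
}
```

      Counting by node name keeps the guard independent of executors, at the cost of walking every job on each check, which is the same trade-off as the Groovy loop.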

      Thanks for all your great work!

          [JENKINS-27471] jClouds deletes nodes that are running flyweight builds

          Code changed in jenkins
          User: Fritz Elfert
          Path:
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsComputer.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsOneOffSlave.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsRetentionStrategy.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsSlave.java
          http://jenkins-ci.org/commit/jclouds-plugin/f141bc7633987e691856e4c20af6dd262e27f9c1
          Log:
          Fixing JENKINS-28403 and JENKINS-27471 (WIP)


          Code changed in jenkins
          User: Fritz Elfert
          Path:
          jclouds-plugin/pom.xml
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/InstancesToRun.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsBuildWrapper.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsCloud.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsOneOffSlave.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/JCloudsStartupHandler.java
          jclouds-plugin/src/main/java/jenkins/plugins/jclouds/compute/internal/TerminateNodes.java
          jclouds-plugin/src/main/resources/jenkins/plugins/jclouds/compute/InstancesToRun/config.jelly
          jclouds-plugin/src/main/resources/jenkins/plugins/jclouds/compute/JCloudsBuildWrapper/help-instancesToRun.html
          jclouds-plugin/src/main/resources/jenkins/plugins/jclouds/compute/JCloudsBuildWrapper/help.html
          jclouds-plugin/src/main/resources/jenkins/plugins/jclouds/compute/JCloudsOneOffSlave/help.html
          jclouds-plugin/src/test/java/jenkins/plugins/jclouds/compute/internal/TerminateNodesTest.java
          http://jenkins-ci.org/commit/jclouds-plugin/3bc1200cba16f54c9bcad01cec6402bf29b2e6c6
          Log:
          Fixing JENKINS-28403 and JENKINS-27471 (WIP)


          Fritz Elfert added a comment -

          Fixed in git now.


            Assignee: Fritz Elfert (felfert)
            Reporter: Sam Clippinger (samsomething)