Google Compute Engine Plugin uncontrolled creation of instances

      Case 1)
      jenkins-master test:
      cloud:
      Name: gce
      Project: <project>
      Instance Cap: (1|3|8)
      Service Account Credentials: <account>
      Templates:
      Name Prefix: template-test
      Description: <some description>
      Launch Timeout: 300
      Node Retention Time: 6
      Usage: Only build jobs with label expressions matching this node
      Labels: gcp-tmp-test
      Run as user: jenkins
      Region: europe-west1
      Zone: europe-west1-b
      Machine Type: (n1-standard-32|any)
      Number of Executors: 4
      Preemptible: no
      Startup script: <no>
      GPUs: no
      Networking:
      Network: default
      Subnetwork: default
      Network tags: jenkins slave
      External IP?: yes
      Boot Disk:
      Image project: <project>
      Image name: <name>
      Disk Type: pd-standard
      Size: 150
      Delete on termination?: yes
      IAM: <bot_1>@<project>.iam.gserviceaccount.com
      Name Prefix: slave-gce
      Description: <some description>
      Launch Timeout: 300
      Node Retention Time: 6
      Usage: Only build jobs with label expressions matching this node
      Labels: gcp-tmp-test-2
      Run as user: jenkins
      Region: europe-west1
      Zone: europe-west1-b
      Machine Type: (n1-standard-32|any)
      Number of Executors: 4
      Preemptible: no
      Startup script: <no>
      GPUs: no
      Networking:
      Network: default
      Subnetwork: default
      Network tags: jenkins slave
      External IP?: yes
      Boot Disk:
      Image project: <project>
      Image name: <name>
      Disk Type: pd-standard
      Size: 150
      Delete on termination?: yes
      IAM: <bot_1>@<project>.iam.gserviceaccount.com

      Jobs:
      1) Freestyle project
      Name: it-gce-test
      Restrict where this project can be run: gcp-tmp-test
      build shell:
      """
      uname -a
      ps -ef
      df -h
      docker ps
      docker pull ubuntu
      docker images
      ls -la /tmp
      cat /etc/passwd
      """
      2) Freestyle project
      Name: it-gce-test-double
      Restrict where this project can be run: gcp-tmp-test-2
      build shell:
      """
      uname -a
      ps -ef
      df -h
      docker ps
      docker pull ubuntu
      docker images
      ls -la /tmp
      cat /etc/passwd
      """

      Steps:

      • Run Job it-gce-test
      • Created jenkins slave template-test-******
      • Created one instance in GCE template-test-******
      • Build result: success.
      • slave template-test-****** still up
      • Run Job it-gce-test-double
      • Nothing changed for 3 minutes; 1 job stayed in the queue
      • slave template-test-****** removed
      • created jenkins slave slave-gce-******
      • build result: success.
      • Over 50 running instances in GCE with name: slave-gce-******
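      One way to check the outcome of these steps is to group the GCE instance names by template name prefix and compare each count with the configured instance cap. The sketch below is only an illustration: the helper names `count_by_prefix` and `over_cap` and the sample instance names are hypothetical, and the list of names is assumed to have been exported from GCE beforehand.

      ```python
      from collections import Counter


      def count_by_prefix(instance_names, prefixes):
          """Count instance names per template name prefix (hypothetical helper)."""
          counts = Counter({prefix: 0 for prefix in prefixes})
          for name in instance_names:
              for prefix in prefixes:
                  # Instances are assumed to be named "<prefix>-<suffix>".
                  if name.startswith(prefix + "-"):
                      counts[prefix] += 1
                      break
          return dict(counts)


      def over_cap(counts, instance_cap):
          """Return the prefixes whose instance count exceeds the instance cap."""
          return sorted(p for p, n in counts.items() if n > instance_cap)


      # Made-up names shaped like the result described in Case 1.
      names = ["template-test-a1b2c3"] + [f"slave-gce-{i:06d}" for i in range(50)]
      counts = count_by_prefix(names, ["template-test", "slave-gce"])
      print(counts)
      print(over_cap(counts, instance_cap=8))
      ```

      With a cap of 8, only the `slave-gce` prefix would be reported in this made-up data set.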

      Case 2)
      The same configuration as in Case 1, but I created 2 clouds with 1 template each.
      Nothing changed. The plugin creates only 1 Jenkins slave, and the second one cannot be created until the first Jenkins slave is removed. Meanwhile, instances are created in GCE without control.

      Case 3) I created 1 cloud in the production Jenkins and 1 cloud in the test Jenkins, with one instance template for each cloud.
      Production jenkins master.
      Production cloud:
      cloud:
      Name: gce
      Project: <project>
      Instance Cap: (1|3|8)
      Service Account Credentials: <account>
      Templates:
      Name Prefix: template-prod
      Description: <some description>
      Launch Timeout: 300
      Node Retention Time: 6
      Usage: Only build jobs with label expressions matching this node
      Labels: gcp-tmp-test
      Run as user: jenkins
      Region: europe-west1
      Zone: europe-west1-b
      Machine Type: (n1-standard-32|any)
      Number of Executors: 4
      Preemptible: no
      Startup script: <no>
      GPUs: no
      Networking:
      Network: default
      Subnetwork: default
      Network tags: jenkins slave
      External IP?: yes
      Boot Disk:
      Image project: <project>
      Image name: <name>
      Disk Type: pd-standard
      Size: 150
      Delete on termination?: yes
      IAM: <bot_1>@<project>.iam.gserviceaccount.com

      Test jenkins master
      Test cloud:
      cloud:
      Name: gce
      Project: <project>
      Instance Cap: (1|3|8)
      Service Account Credentials: <account>
      Templates:
      Name Prefix: template-test
      Description: <some description>
      Launch Timeout: 300
      Node Retention Time: 6
      Usage: Only build jobs with label expressions matching this node
      Labels: gcp-tmp-test
      Run as user: jenkins
      Region: europe-west1
      Zone: europe-west1-b
      Machine Type: (n1-standard-32|any)
      Number of Executors: 4
      Preemptible: no
      Startup script: <no>
      GPUs: no
      Networking:
      Network: default
      Subnetwork: default
      Network tags: jenkins slave
      External IP?: yes
      Boot Disk:
      Image project: <project>
      Image name: <name>
      Disk Type: pd-standard
      Size: 150
      Delete on termination?: yes
      IAM: <bot_1>@<project>.iam.gserviceaccount.com

      I used one Google project and one service account for both the test and production clouds.

      Jobs:
      production:
      Freestyle project
      Name: it-gce-test
      Restrict where this project can be run: gcp-tmp-test
      build shell:
      """
      uname -a
      ps -ef
      df -h
      docker ps
      docker pull ubuntu
      docker images
      ls -la /tmp
      cat /etc/passwd
      """

      Test:
      Freestyle project
      Name: it-gce-test
      Restrict where this project can be run: gcp-tmp-test
      build shell:
      """
      uname -a
      ps -ef
      df -h
      docker ps
      docker pull ubuntu
      docker images
      ls -la /tmp
      cat /etc/passwd
      """

      Steps:

      • Run Job it-gce-test on production and test jenkins
      • Created jenkins slave template-test-****** on test jenkins
      • Created jenkins slave template-prod-****** on production jenkins
      • Created one instance in GCE template-test-****** in GCE
      • Created one instance in GCE template-prod-****** in GCE
      • Build results: success.
      • slave template-test-****** stayed up for 6 minutes and was then destroyed together with its GCE instance
      • slave template-prod-****** stayed up for 6 minutes and was then destroyed together with its GCE instance

      I think the problem is in how the Jenkins slaves are handled. While one slave created by the plugin is up, a second one cannot be created, yet the second template keeps creating instances in GCE without limit.
      At the moment we can use only 1 template per Jenkins master. Please fix this problem.

          [JENKINS-50656] Google Compute Engine Plugin uncontrolled creation of instances

          Evan Brown added a comment -

          Thanks for this report, ramzol. Based on your attached log, I believe this is related to https://issues.jenkins-ci.org/browse/JENKINS-50566. I'll have more time to dig into this tomorrow and will keep the issue updated.

          dmitrii dudin added a comment -

          Hello Evan.
          Thank you for the fast reply.
          I don't think it's related to https://issues.jenkins-ci.org/browse/JENKINS-50566

          I can try to reproduce this problem on Jenkins master version 2.89.4 tomorrow.

          Evan Brown added a comment - - edited

          Hi Dmitrii,

          I found a few issues thanks to your report.

          The first is a case where the plugin incorrectly calculates the available capacity by not using the correct instance status to determine if a VM is running.

          The second occurs when instances provisioned by two different clouds are counted together when determining available capacity.

          I just released 1.0.2. This uses proper instance statuses to count running VMs, and also buckets those VMs by cloud plugin so they obey the max capacity config for each plugin config.

          If you can confirm this fixes your issue, that'd be much appreciated. Thanks!
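          The two fixes described above (counting only instances whose status shows they occupy capacity, and bucketing instances per cloud) can be illustrated with a small sketch. This is not the plugin's actual code: the `Instance` class, the `cloud_id` field, the `available_capacity` function, and the choice of which Compute Engine statuses count as active are all assumptions made for the example.

          ```python
          from dataclasses import dataclass

          # Statuses treated as occupying capacity. The status names are real
          # Compute Engine instance statuses; which ones to count is an assumption.
          ACTIVE_STATUSES = {"PROVISIONING", "STAGING", "RUNNING"}


          @dataclass
          class Instance:
              name: str
              status: str    # Compute Engine instance status string
              cloud_id: str  # hypothetical marker for the owning cloud config


          def available_capacity(instances, cloud_id, instance_cap):
              """Remaining capacity for one cloud, ignoring other clouds' VMs."""
              used = sum(
                  1
                  for inst in instances
                  if inst.cloud_id == cloud_id and inst.status in ACTIVE_STATUSES
              )
              return max(instance_cap - used, 0)


          # Made-up instances spread over two clouds in the same project.
          instances = [
              Instance("template-prod-aaa", "RUNNING", "prod"),
              Instance("template-prod-bbb", "TERMINATED", "prod"),
              Instance("template-test-ccc", "RUNNING", "test"),
              Instance("template-test-ddd", "RUNNING", "test"),
              Instance("template-test-eee", "PROVISIONING", "test"),
          ]
          print(available_capacity(instances, "prod", instance_cap=3))
          print(available_capacity(instances, "test", instance_cap=3))
          ```

          In this sketch a terminated VM does not consume capacity, and VMs that belong to the other cloud are never counted against a cloud's cap.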

          Rachel Yen added a comment -

          Closing as we haven't heard back. If there is a need to re-open, please open an issue in:

          https://github.com/jenkinsci/google-compute-engine-plugin/issues

          as we are discontinuing the use of JIRA.

            evanbrown Evan Brown
            ramzol dmitrii dudin