Kubernetes plugin creates massive amounts of Prometheus "kubernetes_cloud_###_provision_request_total" counters

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • kubernetes-plugin
    • None
    • 4288.v1719f9d0c854

      There appears to be an unhappy interaction between the Kubernetes plugin, the Metrics plugin, and the Prometheus plugin.

      Our https://jenkins.big.corp/prometheus/ Prometheus scrape endpoint is absolutely flooded with metrics like:

      # HELP kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total Generated from Dropwizard metric import (metric=kubernetes.cloud.myproj_master_7116-zvf9t.provision.request, type=com.codahale.metrics.Meter)
      # TYPE kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total counter
      kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total 1.0

      Around 50k lines in total, which slows scraping to a crawl.
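
      To make the naming concrete, here is a minimal sketch of how the Dropwizard name in the HELP line seems to map to the exposed counter name. The replacement rule and the "_total" suffix are assumptions inferred only from the sample lines above; this is not the Prometheus plugin's actual code, and ExportedNameSketch and toExportedCounterName are hypothetical names.

```java
// Sketch only: mapping inferred from the sample scrape output, not taken from any plugin source.
class ExportedNameSketch {

    // Assumption: every character outside [a-zA-Z0-9_] becomes '_' and meters get a "_total" suffix.
    static String toExportedCounterName(String dropwizardName) {
        return dropwizardName.replaceAll("[^a-zA-Z0-9_]", "_") + "_total";
    }

    public static void main(String[] args) {
        String dropwizardName = "kubernetes.cloud.myproj_master_7116-zvf9t.provision.request";
        // Prints the exposed name that corresponds to the sample metric above.
        System.out.println(toExportedCounterName(dropwizardName));
    }
}
```

      Under that assumption, every distinct label text embedded in the Dropwizard name yields its own HELP, TYPE and value lines in the scrape output.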

      But more importantly: I have a strong feeling that it is actually slowing Jenkins down as well, since the data is passed to the Metrics plugin, which in turn feeds the Prometheus plugin. All this data is processed every generation interval.

      The culprit appears to be https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java#L27

      Which is triggered by: https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java#L602

      I feel this should simply be removed unless there is a clear reason to keep it. It has the potential to balloon the active Metrics to astronomical proportions, which could lead to stability issues.
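
      As a rough illustration of why this balloons, the sketch below reuses the naming scheme of metricNameForLabel against a stand-in registry. The registry here is a plain map with get-or-create behaviour, which is an assumption made to keep the example self-contained; it is not the Metrics plugin API. The per-build label pattern in main is also an assumed scenario, modelled on the label seen in the sample output, and ProvisionMeterSketch is a hypothetical name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch only: a stand-in for a get-or-create metric registry, not the real Metrics plugin.
class ProvisionMeterSketch {
    static final String PREFIX = "kubernetes.cloud";

    // Hypothetical registry: one entry per distinct metric name, and nothing is ever evicted.
    private final Map<String, LongAdder> meters = new ConcurrentHashMap<>();

    // Same naming scheme as MetricNames.metricNameForLabel, taking the label text directly.
    static String metricNameForLabel(String labelText) {
        String text = (labelText == null) ? "nolabel" : labelText;
        return String.format("%s.%s.provision.request", PREFIX, text);
    }

    // Get-or-create, then mark: every previously unseen label adds a new registry entry.
    void markProvisionRequest(String labelText, int excessWorkload) {
        meters.computeIfAbsent(metricNameForLabel(labelText), k -> new LongAdder())
                .add(excessWorkload);
    }

    int registeredMeters() {
        return meters.size();
    }

    public static void main(String[] args) {
        ProvisionMeterSketch registry = new ProvisionMeterSketch();
        // Assumed scenario: each build provisions an agent under its own generated label.
        for (int build = 1; build <= 1000; build++) {
            registry.markProvisionRequest("myproj_master_" + build + "-abcde", 1);
        }
        System.out.println(registry.registeredMeters() + " meters registered");
    }
}
```

      In this sketch the number of registered meters equals the number of distinct labels ever seen, so it only grows over the lifetime of the controller.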

      The other metrics I have no problem with:

      kubernetes_cloud_pods_created 13475.0
      kubernetes_cloud_pods_creation_failed 3430.0
      kubernetes_cloud_pods_launched 9742.0
      kubernetes_cloud_pods_terminated 16909.0
      kubernetes_cloud_provision_nodes 16905.0 

      But those kubernetes_cloud_###_provision_request_total entries are killing us.

      There doesn't appear to be a mechanism to prevent the Kubernetes plugin from generating these metrics.

       

          Pay Bas added a comment - edited

          I'm happy to submit a PR which removes the problematic functionality, since it's trivial to do.

          But I'd like a go-ahead before doing so.

          Proposed patch:

          Subject: [PATCH] JENKINS-73788 Reduce metrics bloat relating to
           provision.request

          ---
           .../plugins/kubernetes/KubernetesCloud.java     |  2 --
           .../jenkins/plugins/kubernetes/MetricNames.java |  6 ------
           .../plugins/kubernetes/MetricNamesTest.java     | 17 -----------------
           .../pipeline/KubernetesPipelineTest.java        |  4 ----
           4 files changed, 29 deletions(-)

          diff --git a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          index 10d30fcb..81f5091d 100644
          --- a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          +++ b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          @@ -2,7 +2,6 @@ package org.csanchez.jenkins.plugins.kubernetes;
           
           import static java.nio.charset.StandardCharsets.UTF_8;
           import static org.apache.commons.lang.StringUtils.isEmpty;
          -import static org.csanchez.jenkins.plugins.kubernetes.MetricNames.metricNameForLabel;
           
           import com.cloudbees.plugins.credentials.CredentialsMatchers;
           import com.cloudbees.plugins.credentials.common.StandardCredentials;
          @@ -599,7 +598,6 @@ public class KubernetesCloud extends Cloud implements PodTemplateGroup {
                       @NonNull final Cloud.CloudState state, final int excessWorkload) {
                   var limitRegistrationResults = new LimitRegistrationResults(this);
                   try {
          -            Metrics.metricRegistry().meter(metricNameForLabel(state.getLabel())).mark(excessWorkload);
                       Label label = state.getLabel();
                       // Planned nodes, will be launched on the next round of NodeProvisioner
                       int plannedCapacity = state.getAdditionalPlannedCapacity();
          diff --git a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          index ee586771..23a48bc4 100644
          --- a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          +++ b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          @@ -1,6 +1,5 @@
           package org.csanchez.jenkins.plugins.kubernetes;
           
          -import hudson.model.Label;
           import java.util.Locale;
           
           public class MetricNames {
          @@ -21,9 +20,4 @@ public class MetricNames {
                   String formattedStatus = status == null ? "null" : status.toLowerCase(Locale.getDefault());
                   return PREFIX + ".pods.launch.status." + formattedStatus;
               }
          -
          -    public static String metricNameForLabel(Label label) {
          -        String labelText = (label == null) ? "nolabel" : label.getDisplayName();
          -        return String.format("%s.%s.provision.request", PREFIX, labelText);
          -    }
           }
          diff --git a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          index 892125a1..68560458 100644
          --- a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          +++ b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          @@ -1,6 +1,5 @@
           package org.csanchez.jenkins.plugins.kubernetes;
           
          -import hudson.model.labels.LabelAtom;
           import org.junit.Assert;
           import org.junit.Test;
           
          @@ -29,20 +28,4 @@ public class MetricNamesTest {
           
                   Assert.assertEquals(expected, actual);
               }
          -
          -    @Test
          -    public void metricNameForLabelAddsNoLabelIfLabelIsNull() {
          -        String expected = "kubernetes.cloud.nolabel.provision.request";
          -        String actual = MetricNames.metricNameForLabel(null);
          -
          -        Assert.assertEquals(expected, actual);
          -    }
          -
          -    @Test
          -    public void metricNameForLabelAddsLabelValue() {
          -        String expected = "kubernetes.cloud.java.provision.request";
          -        String actual = MetricNames.metricNameForLabel(new LabelAtom("java"));
          -
          -        Assert.assertEquals(expected, actual);
          -    }
           }
          diff --git a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          index b5b23ac1..a7b43a91 100644
          --- a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          +++ b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          @@ -252,10 +252,6 @@ public class KubernetesPipelineTest extends AbstractKubernetesPipelineTest {
                           emptyIterable());
           
                   assertTrue(Metrics.metricRegistry().counter(MetricNames.PODS_LAUNCHED).getCount() > 0);
          -        assertTrue(Metrics.metricRegistry()
          -                        .meter(MetricNames.metricNameForLabel(Label.parseExpression("runInPod")))
          -                        .getCount()
          -                > 0);
               }
           
               @Test
          -- 
          2.46.1 

          Mark Waite added a comment -

          paybas I think that the best "go-ahead" will come from a review of the pull request. I'm not sure if the maintainers of the Kubernetes plugin are regularly reviewing Jira issues. I believe they are more likely to be regularly reviewing pull requests than Jira issues.

          Pay Bas added a comment -

          That's fine markewaite

          Pull-request: https://github.com/jenkinsci/kubernetes-plugin/pull/1604

            paybas Pay Bas