  Jenkins / JENKINS-73788

Kubernetes plugin creates massive amounts of Prometheus "kubernetes_cloud_###_provision_request_total" counters

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • kubernetes-plugin
    • None
    • 4288.v1719f9d0c854

      There appears to be an unhappy interaction between:

      • https://plugins.jenkins.io/metrics/
      • https://plugins.jenkins.io/prometheus/
      • https://plugins.jenkins.io/kubernetes/

      Our https://jenkins.big.corp/prometheus/ Prometheus scrape endpoint is absolutely flooded with metrics like:

      # HELP kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total Generated from Dropwizard metric import (metric=kubernetes.cloud.myproj_master_7116-zvf9t.provision.request, type=com.codahale.metrics.Meter)
      # TYPE kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total counter
      kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total 1.0

      Around 50k lines in total, which brings scraping to a crawl.

      But more importantly: I have a strong feeling that it is actually slowing Jenkins down as well, since the data is passed to the Metrics plugin, which in turn feeds the Prometheus plugin. All this data is processed every generation interval.
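      For context, a rough sketch of that chain (simplified, and assuming the standard Dropwizard-to-Prometheus bridge, which the "Generated from Dropwizard metric import" HELP text suggests is in use):

      // Sketch only - not the plugins' actual wiring.
      // The kubernetes plugin writes Meters into the shared Dropwizard registry;
      // the Prometheus plugin re-exports that whole registry on every scrape.
      import com.codahale.metrics.MetricRegistry;
      import io.prometheus.client.CollectorRegistry;
      import io.prometheus.client.dropwizard.DropwizardExports;

      public class ExportChainSketch {
          public static void main(String[] args) {
              MetricRegistry dropwizard = new MetricRegistry(); // stands in for Metrics.metricRegistry()
              dropwizard.meter("kubernetes.cloud.myproj_master_7116-zvf9t.provision.request").mark();

              CollectorRegistry prometheus = new CollectorRegistry();
              new DropwizardExports(dropwizard).register(prometheus); // every Meter becomes a *_total counter

              // Each scrape walks the entire registry, so tens of thousands of one-shot,
              // never-removed meters get re-serialized every collection interval.
          }
      }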

      The culprit appears to be https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java#L27

      Which is triggered by: https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java#L602
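      For reference, the code in question (as it appears in the removal patch in the comments below):

      // MetricNames.java - builds one metric name per label expression
      public static String metricNameForLabel(Label label) {
          String labelText = (label == null) ? "nolabel" : label.getDisplayName();
          return String.format("%s.%s.provision.request", PREFIX, labelText);
      }

      // KubernetesCloud.provision(...) - marks a Meter under that name on every provisioning request
      Metrics.metricRegistry().meter(metricNameForLabel(state.getLabel())).mark(excessWorkload);

      Since ephemeral agent labels like myproj_master_7116-zvf9t are unique per pod, each provisioning request registers a brand-new Meter that is never removed from the registry.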

      I feel this should simply be removed unless there is a clear reason to keep it. It has the potential to balloon the number of active metrics to astronomical proportions, which could lead to stability issues.

      The other metrics I have no problem with:

      kubernetes_cloud_pods_created 13475.0
      kubernetes_cloud_pods_creation_failed 3430.0
      kubernetes_cloud_pods_launched 9742.0
      kubernetes_cloud_pods_terminated 16909.0
      kubernetes_cloud_provision_nodes 16905.0 

      But those kubernetes_cloud_###_provision_request_total entries are killing us.

      There doesn't appear to be a mechanism to prevent the Kubernetes plugin from generating these metrics.
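      For what it's worth, a hypothetical opt-in guard could look like the following (the system property name is invented purely for illustration; no such flag exists in the plugin today):

      // HYPOTHETICAL sketch - not part of the plugin. A guard like this around the call in
      // KubernetesCloud.provision(...) would keep the per-label meters available on demand
      // while sparing everyone else the cardinality explosion.
      private static final boolean PER_LABEL_PROVISION_METRICS =
              Boolean.getBoolean("org.csanchez.jenkins.plugins.kubernetes.perLabelProvisionMetrics");

      // ...inside provision(CloudState state, int excessWorkload):
      if (PER_LABEL_PROVISION_METRICS) {
          Metrics.metricRegistry().meter(metricNameForLabel(state.getLabel())).mark(excessWorkload);
      }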

       

          [JENKINS-73788] Kubernetes plugin creates massive amounts of Prometheus "kubernetes_cloud_###_provision_request_total" counters

          Pay Bas created issue -
          Pay Bas made changes -
          Environment Original: Jenkins: 2.462.1
          OS: Linux - 6.10.6-200.fc40.x86_64
          Java: 11.0.24 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
          ---
          active-directory:2.36
          allure-jenkins-plugin:2.30.3
          ansicolor:1.0.4
          antisamy-markup-formatter:162.v0e6ec0fcfcf6
          apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
          asm-api:9.7-33.v4d23ef79fcc8
          audit-trail:361.v82cde86c784e
          authentication-tokens:1.119.v50285141b_7e1
          blueocean:1.27.14
          blueocean-autofavorite:1.2.5
          blueocean-bitbucket-pipeline:1.27.14
          blueocean-commons:1.27.14
          blueocean-config:1.27.14
          blueocean-core-js:1.27.14
          blueocean-dashboard:1.27.14
          blueocean-display-url:2.4.3
          blueocean-events:1.27.14
          blueocean-git-pipeline:1.27.14
          blueocean-github-pipeline:1.27.14
          blueocean-i18n:1.27.14
          blueocean-jira:1.27.14
          blueocean-jwt:1.27.14
          blueocean-personalization:1.27.14
          blueocean-pipeline-api-impl:1.27.14
          blueocean-pipeline-editor:1.27.14
          blueocean-pipeline-scm-api:1.27.14
          blueocean-rest:1.27.14
          blueocean-rest-impl:1.27.14
          blueocean-web:1.27.14
          bootstrap5-api:5.3.3-1
          bouncycastle-api:2.30.1.78.1-248.ve27176eb_46cb_
          branch-api:2.1178.v969d9eb_c728e
          build-failure-analyzer:2.5.2
          build-monitor-plugin:1.14-908.vd91a_186b_9121
          caffeine-api:3.1.8-133.v17b_1ff2e0599
          checks-api:2.2.0
          cloudbees-bitbucket-branch-source:888.v8e6d479a_1730
          cloudbees-disk-usage-simple:203.v3f46a_7462b_1a_
          cloudbees-folder:6.942.vb_43318a_156b_2
          command-launcher:115.vd8b_301cc15d0
          commons-compress-api:1.26.1-2
          commons-lang3-api:3.16.0-82.ve2b_07d659d95
          commons-text-api:1.12.0-129.v99a_50df237f7
          credentials:1371.vfee6b_095f0a_3
          credentials-binding:681.vf91669a_32e45
          custom-folder-icon:2.13
          dark-theme:479.v661b_1b_911c01
          data-tables-api:2.1.4-1
          dependency-check-jenkins-plugin:5.5.1
          display-url-api:2.204.vf6fddd8a_8b_e9
          docker-commons:443.v921729d5611d
          docker-workflow:580.vc0c340686b_54
          durable-task:568.v8fb_5c57e8417
          echarts-api:5.5.1-1
          eddsa-api:0.3.0-4.v84c6f0f4969e
          email-ext:1814.v404722f34263
          extended-read-permission:53.v6499940139e5
          favorite:2.221.v19ca_666b_62f5
          font-awesome-api:6.6.0-1
          git:5.4.1
          git-client:5.0.0
          git-server:126.v0d945d8d2b_39
          github:1.40.0
          github-api:1.321-468.v6a_9f5f2d5a_7e
          github-branch-source:1797.v86fdb_4d57d43
          groovy:457.v99900cb_85593
          gson-api:2.11.0-41.v019fcf6125dc
          handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
          htmlpublisher:1.36
          http_request:1.19
          instance-identity:185.v303dc7c645f9
          ionicons-api:74.v93d5eb_813d5f
          jackson2-api:2.17.0-379.v02de8ec9f64c
          jakarta-activation-api:2.1.3-1
          jakarta-mail-api:2.1.3-1
          javadoc:280.v050b_5c849f69
          javax-activation-api:1.2.0-7
          javax-mail-api:1.6.2-10
          jaxb:2.3.9-1
          jdk-tool:80.v8a_dee33ed6f0
          jenkins-design-language:1.27.14
          jersey2-api:2.44-151.v6df377fff741
          jira:3.13
          jjwt-api:0.11.5-112.ve82dfb_224b_a_d
          jobConfigHistory:1241.v07634fa_18896
          joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
          jquery3-api:3.7.1-2
          jsch:0.2.16-86.v42e010d9484b_
          json-api:20240303-41.v94e11e6de726
          json-path-api:2.9.0-58.v62e3e85b_a_655
          junit:1296.vb_f538b_c88630
          kubernetes:4285.v50ed5f624918
          kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
          kubernetes-credentials:189.v90a_488b_d1d65
          lockable-resources:1255.vf48745da_35d0
          logstash:2.5.0218.v0a_ff8fefc12b_
          mailer:472.vf7c289a_4b_420
          matrix-auth:3.2.2
          matrix-project:832.va_66e270d2946
          metrics:4.2.21-451.vd51df8df52ec
          mina-sshd-api-common:2.13.2-125.v200281b_61d59
          mina-sshd-api-core:2.13.2-125.v200281b_61d59
          nexus-jenkins-plugin:3.19.5-01
          okhttp-api:4.11.0-172.vda_da_1feeb_c6e
          openshift-client:1.1.0.424.v829cb_ccf8798
          pipeline-build-step:540.vb_e8849e1a_b_d8
          pipeline-graph-analysis:216.vfd8b_ece330ca_
          pipeline-groovy-lib:730.ve57b_34648c63
          pipeline-input-step:495.ve9c153f6067b_
          pipeline-milestone-step:119.vdfdc43fc3b_9a_
          pipeline-model-api:2.2214.vb_b_34b_2ea_9b_83
          pipeline-model-definition:2.2214.vb_b_34b_2ea_9b_83
          pipeline-model-extensions:2.2214.vb_b_34b_2ea_9b_83
          pipeline-rest-api:2.34
          pipeline-stage-step:312.v8cd10304c27a_
          pipeline-stage-tags-metadata:2.2214.vb_b_34b_2ea_9b_83
          pipeline-stage-view:2.34
          pipeline-utility-steps:2.17.0
          plain-credentials:183.va_de8f1dd5a_2b_
          plugin-util-api:4.1.0
          prometheus:784.vea_eca_f6592eb_
          prometheus-lockable-resources:1.0.0
          pubsub-light:1.18
          resource-disposer:0.23
          scm-api:696.v778d637b_a_762
          script-security:1354.va_70a_fe478c7f
          simple-theme-plugin:196.v96d9592f4efa_
          snakeyaml-api:2.2-121.v5a_68b_9300b_d4
          sonar:2.17.2
          sse-gateway:1.27
          ssh-agent:376.v8933585c69d3
          ssh-credentials:343.v884f71d78167
          ssh-steps:2.0.68.va_d21a_12a_6476
          sshd:3.330.vc866a_8389b_58
          structs:338.v848422169819
          testcafe:1.0
          theme-manager:262.vc57ee4a_eda_5d
          timestamper:1.27
          token-macro:400.v35420b_922dcb_
          trilead-api:2.147.vb_73cc728a_32e
          variant:60.v7290fc0eb_b_cd
          webhook-step:342.v620877effe14
          workflow-aggregator:600.vb_57cdd26fdd7
          workflow-api:1336.vee415d95c521
          workflow-basic-steps:1058.vcb_fc1e3a_21a_9
          workflow-cps:3953.v19f11da_8d2fa_
          workflow-durable-task-step:1371.vb_7cec8f3b_95e
          workflow-job:1436.vfa_244484591f
          workflow-multibranch:795.ve0cb_1f45ca_9a_
          workflow-scm-step:427.v4ca_6512e7df1
          workflow-step-api:678.v3ee58b_469476
          workflow-support:920.v59f71ce16f04
          ws-cleanup:0.46
          New: Jenkins: 2.462.1
          OS: Linux - 6.10.6-200.fc40.x86_64
          Java: 11.0.24 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
          ---
          build-failure-analyzer:2.5.2
          kubernetes:4285.v50ed5f624918
          kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
          kubernetes-credentials:189.v90a_488b_d1d65
          lockable-resources:1255.vf48745da_35d0
          logstash:2.5.0218.v0a_ff8fefc12b_
          metrics:4.2.21-451.vd51df8df52ec
          prometheus:784.vea_eca_f6592eb_
          prometheus-lockable-resources:1.0.0

          Pay Bas added a comment - edited

          I'm happy to submit a PR which removes the problematic functionality, since it's trivial to do.

          But I'd like a go-ahead before doing so.

          Proposed patch:

          Subject: [PATCH] JENKINS-73788 Reduce metrics bloat relating to provision.request
          ---
           .../plugins/kubernetes/KubernetesCloud.java     |  2 --
           .../jenkins/plugins/kubernetes/MetricNames.java |  6 ------
           .../plugins/kubernetes/MetricNamesTest.java     | 17 -----------------
           .../pipeline/KubernetesPipelineTest.java        |  4 ----
           4 files changed, 29 deletions(-)

          diff --git a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          index 10d30fcb..81f5091d 100644
          --- a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          +++ b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java
          @@ -2,7 +2,6 @@ package org.csanchez.jenkins.plugins.kubernetes;
           
           import static java.nio.charset.StandardCharsets.UTF_8;
           import static org.apache.commons.lang.StringUtils.isEmpty;
          -import static org.csanchez.jenkins.plugins.kubernetes.MetricNames.metricNameForLabel;
           
           import com.cloudbees.plugins.credentials.CredentialsMatchers;
           import com.cloudbees.plugins.credentials.common.StandardCredentials;
          @@ -599,7 +598,6 @@ public class KubernetesCloud extends Cloud implements PodTemplateGroup {
                       @NonNull final Cloud.CloudState state, final int excessWorkload) {
                   var limitRegistrationResults = new LimitRegistrationResults(this);
                   try {
          -            Metrics.metricRegistry().meter(metricNameForLabel(state.getLabel())).mark(excessWorkload);
                       Label label = state.getLabel();
                       // Planned nodes, will be launched on the next round of NodeProvisioner
                       int plannedCapacity = state.getAdditionalPlannedCapacity();
          diff --git a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          index ee586771..23a48bc4 100644
          --- a/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          +++ b/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java
          @@ -1,6 +1,5 @@
           package org.csanchez.jenkins.plugins.kubernetes;
           
          -import hudson.model.Label;
           import java.util.Locale;
           
           public class MetricNames {
          @@ -21,9 +20,4 @@ public class MetricNames {
                   String formattedStatus = status == null ? "null" : status.toLowerCase(Locale.getDefault());
                   return PREFIX + ".pods.launch.status." + formattedStatus;
               }
          -
          -    public static String metricNameForLabel(Label label) {
          -        String labelText = (label == null) ? "nolabel" : label.getDisplayName();
          -        return String.format("%s.%s.provision.request", PREFIX, labelText);
          -    }
           }
          diff --git a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          index 892125a1..68560458 100644
          --- a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          +++ b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/MetricNamesTest.java
          @@ -1,6 +1,5 @@
           package org.csanchez.jenkins.plugins.kubernetes;
           
          -import hudson.model.labels.LabelAtom;
           import org.junit.Assert;
           import org.junit.Test;
           
          @@ -29,20 +28,4 @@ public class MetricNamesTest {
           
                   Assert.assertEquals(expected, actual);
               }
          -
          -    @Test
          -    public void metricNameForLabelAddsNoLabelIfLabelIsNull() {
          -        String expected = "kubernetes.cloud.nolabel.provision.request";
          -        String actual = MetricNames.metricNameForLabel(null);
          -
          -        Assert.assertEquals(expected, actual);
          -    }
          -
          -    @Test
          -    public void metricNameForLabelAddsLabelValue() {
          -        String expected = "kubernetes.cloud.java.provision.request";
          -        String actual = MetricNames.metricNameForLabel(new LabelAtom("java"));
          -
          -        Assert.assertEquals(expected, actual);
          -    }
           }
          diff --git a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          index b5b23ac1..a7b43a91 100644
          --- a/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          +++ b/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/KubernetesPipelineTest.java
          @@ -252,10 +252,6 @@ public class KubernetesPipelineTest extends AbstractKubernetesPipelineTest {
                           emptyIterable());
           
                   assertTrue(Metrics.metricRegistry().counter(MetricNames.PODS_LAUNCHED).getCount() > 0);
          -        assertTrue(Metrics.metricRegistry()
          -                        .meter(MetricNames.metricNameForLabel(Label.parseExpression("runInPod")))
          -                        .getCount()
          -                > 0);
               }
           
               @Test
          -- 
          2.46.1 


          Mark Waite added a comment -

          paybas I think that the best "go-ahead" will come from a review of the pull request. I'm not sure if the maintainers of the Kubernetes plugin are regularly reviewing Jira issues. I believe they are more likely to be regularly reviewing pull requests than Jira issues.


            Assignee: Pay Bas (paybas)
            Reporter: Pay Bas (paybas)
            Votes: 0
            Watchers: 2
