-
Bug
-
Resolution: Fixed
-
Major
-
None
-
Jenkins: 2.462.1
OS: Linux - 6.10.6-200.fc40.x86_64
Java: 11.0.24 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
---
build-failure-analyzer:2.5.2
kubernetes:4285.v50ed5f624918
kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
kubernetes-credentials:189.v90a_488b_d1d65
lockable-resources:1255.vf48745da_35d0
logstash:2.5.0218.v0a_ff8fefc12b_
metrics:4.2.21-451.vd51df8df52ec
prometheus:784.vea_eca_f6592eb_
prometheus-lockable-resources:1.0.0
-
-
4288.v1719f9d0c854
There appears to be an unhappy interaction between:
- https://plugins.jenkins.io/metrics/
- https://plugins.jenkins.io/prometheus/
- https://plugins.jenkins.io/kubernetes/
Our https://jenkins.big.corp/prometheus/ Prometheus scrape endpoint is absolutely flooded with metrics like:
# HELP kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total Generated from Dropwizard metric import (metric=kubernetes.cloud.myproj_master_7116-zvf9t.provision.request, type=com.codahale.metrics.Meter)
# TYPE kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total counter
kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total 1.0
Around 50k lines in total, which brings scraping to a crawl.
But more importantly: I strongly suspect it is actually slowing Jenkins down as well, since the data is passed to the Metrics plugin, which in turn feeds the Prometheus plugin, and all of it is reprocessed every generation interval.
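For illustration, here is a minimal, standalone Dropwizard sketch of the cardinality problem (this is not the plugin's actual code; the class name and the loop are made up, and the metric name pattern is taken from the scrape output above). Each generated pod name fed into the registry produces a brand-new Meter, and the Prometheus plugin later exports every registered Meter as its own *_total counter, so the series count grows with the number of pods ever provisioned:
{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

/**
 * Standalone illustration (hypothetical class, not plugin code): one Meter per
 * generated pod name accumulates in the Dropwizard registry, and each Meter is
 * later exposed as a separate Prometheus counter on every scrape.
 */
public class ProvisionMeterExplosion {
    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();

        // Simulate provisioning many uniquely named pods over time,
        // e.g. "myproj_master_7116-zvf9t" as seen in the scrape output.
        for (int i = 0; i < 50_000; i++) {
            String podName = "myproj_master_" + i + "-zvf9t";
            // Name pattern analogous to kubernetes.cloud.<pod>.provision.request
            Meter provisionRequests = registry.meter(
                    MetricRegistry.name("kubernetes", "cloud", podName, "provision", "request"));
            provisionRequests.mark();
        }

        // One Meter per pod name -> one Prometheus counter per pod name.
        System.out.println("Registered meters: " + registry.getMeters().size()); // 50000
    }
}
{code}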
The culprit appears to be https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java#L27
which is triggered by https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java#L602
I feel this should simply be removed unless there is a clear reason to keep it: it can balloon the number of active metrics to astronomical proportions, which could lead to stability issues.
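If the counter is worth keeping at all, one possible shape of a fix is sketched below (a sketch only; the class, field, and metric names here are illustrative and not taken from the plugin): drop the per-pod component from the metric name so all provision requests share a single, fixed series.
{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

/**
 * Hypothetical alternative (not the actual plugin change): count provision
 * requests under one fixed metric name instead of embedding the generated pod
 * name, so the number of series stays constant regardless of pod churn.
 */
public class AggregatedProvisionMeter {
    // Fixed name -> exactly one exported counter,
    // e.g. kubernetes_cloud_provision_request_total.
    private static final String PROVISION_REQUEST = "kubernetes.cloud.provision.request";

    private final Meter provisionRequests;

    public AggregatedProvisionMeter(MetricRegistry registry) {
        this.provisionRequests = registry.meter(PROVISION_REQUEST);
    }

    /** Called once per provisioning attempt, regardless of which pod it is for. */
    public void onProvisionRequested() {
        provisionRequests.mark();
    }
}
{code}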
The other metrics I have no problem with:
kubernetes_cloud_pods_created 13475.0
kubernetes_cloud_pods_creation_failed 3430.0
kubernetes_cloud_pods_launched 9742.0
kubernetes_cloud_pods_terminated 16909.0
kubernetes_cloud_provision_nodes 16905.0
But those kubernetes_cloud_###_provision_request_total entries are killing us.
There doesn't appear to be a mechanism to prevent the Kubernetes plugin from generating these metrics.
[JENKINS-73788] Kubernetes plugin creates massive amounts of Prometheus "kubernetes_cloud_###_provision_request_total" counters
I'm happy to submit a PR that removes the problematic functionality, since it is trivial to do, but I'd like a go-ahead before doing so.
Proposed patch: