-
Bug
-
Resolution: Fixed
-
Major
-
None
-
Jenkins: 2.462.1
OS: Linux - 6.10.6-200.fc40.x86_64
Java: 11.0.24 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
---
build-failure-analyzer:2.5.2
kubernetes:4285.v50ed5f624918
kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
kubernetes-credentials:189.v90a_488b_d1d65
lockable-resources:1255.vf48745da_35d0
logstash:2.5.0218.v0a_ff8fefc12b_
metrics:4.2.21-451.vd51df8df52ec
prometheus:784.vea_eca_f6592eb_
prometheus-lockable-resources:1.0.0
-
-
4288.v1719f9d0c854
There appears to be an unhappy interaction between:
- https://plugins.jenkins.io/metrics/
- https://plugins.jenkins.io/prometheus/
- https://plugins.jenkins.io/kubernetes/
Our https://jenkins.big.corp/prometheus/ Prometheus scrape endpoint is absolutely flooded with metrics like:
# HELP kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total Generated from Dropwizard metric import (metric=kubernetes.cloud.myproj_master_7116-zvf9t.provision.request, type=com.codahale.metrics.Meter)
# TYPE kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total counter
kubernetes_cloud_myproj_master_7116_zvf9t_provision_request_total 1.0
Around 50k lines in total, which brings scraping to a crawl.
But more importantly: I strongly suspect it is actually slowing Jenkins down as well, since the data is passed to the Metrics plugin, which in turn feeds the Prometheus plugin, and all of it is reprocessed every generation interval.
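For illustration, here is a minimal, standalone Dropwizard sketch of the cardinality problem (this is not the plugin's actual code; the class name and the loop are made up, and the metric name pattern is taken from the scrape output above). Each generated pod name fed into the registry produces a brand-new Meter, and the Prometheus plugin later exports every registered Meter as its own *_total counter, so the series count grows with the number of pods ever provisioned:
{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

/**
 * Standalone illustration (hypothetical class, not plugin code): one Meter per
 * generated pod name accumulates in the Dropwizard registry, and each Meter is
 * later exposed as a separate Prometheus counter on every scrape.
 */
public class ProvisionMeterExplosion {
    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();

        // Simulate provisioning many uniquely named pods over time,
        // e.g. "myproj_master_7116-zvf9t" as seen in the scrape output.
        for (int i = 0; i < 50_000; i++) {
            String podName = "myproj_master_" + i + "-zvf9t";
            // Name pattern analogous to kubernetes.cloud.<pod>.provision.request
            Meter provisionRequests = registry.meter(
                    MetricRegistry.name("kubernetes", "cloud", podName, "provision", "request"));
            provisionRequests.mark();
        }

        // One Meter per pod name -> one Prometheus counter per pod name.
        System.out.println("Registered meters: " + registry.getMeters().size()); // 50000
    }
}
{code}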
The culprit appears to be https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/MetricNames.java#L27
which is triggered by https://github.com/jenkinsci/kubernetes-plugin/blob/4287.v73451380b_576/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesCloud.java#L602
I feel this should simply be removed unless there is a clear reason to keep it: it can balloon the number of active metrics to astronomical proportions, which could lead to stability issues.
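If the counter is worth keeping at all, one possible shape of a fix is sketched below (a sketch only; the class, field, and metric names here are illustrative and not taken from the plugin): drop the per-pod component from the metric name so all provision requests share a single, fixed series.
{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

/**
 * Hypothetical alternative (not the actual plugin change): count provision
 * requests under one fixed metric name instead of embedding the generated pod
 * name, so the number of series stays constant regardless of pod churn.
 */
public class AggregatedProvisionMeter {
    // Fixed name -> exactly one exported counter,
    // e.g. kubernetes_cloud_provision_request_total.
    private static final String PROVISION_REQUEST = "kubernetes.cloud.provision.request";

    private final Meter provisionRequests;

    public AggregatedProvisionMeter(MetricRegistry registry) {
        this.provisionRequests = registry.meter(PROVISION_REQUEST);
    }

    /** Called once per provisioning attempt, regardless of which pod it is for. */
    public void onProvisionRequested() {
        provisionRequests.mark();
    }
}
{code}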
The other metrics I have no problem with:
kubernetes_cloud_pods_created 13475.0
kubernetes_cloud_pods_creation_failed 3430.0
kubernetes_cloud_pods_launched 9742.0
kubernetes_cloud_pods_terminated 16909.0
kubernetes_cloud_provision_nodes 16905.0
But those kubernetes_cloud_###_provision_request_total entries are killing us.
There doesn't appear to be a mechanism to prevent the Kubernetes plugin from generating these metrics.
[JENKINS-73788] Kubernetes plugin creates massive amounts of Prometheus "kubernetes_cloud_###_provision_request_total" counters
I'm happy to submit a PR that removes the problematic functionality, since it is trivial to do, but I'd like a go-ahead before doing so.
Proposed patch: