[JENKINS-70388] Jenkins OOM when agent nodes always keep running - Jenkins Jira

Type: Bug
Resolution: Unresolved
Priority: Major
Component/s: workflow-api-plugin
Labels:
None
Environment:

Jenkins version: 2.319.1
Kubernetes version: v1.22.4

Installed Plugins:
jdk-tool:1.0
command-launcher:1.2
ace-editor:1.1
apache-httpcomponents-client-4-api:4.5.13-1.0
authentication-tokens:1.4
authorize-project:1.3.0
bootstrap4-api:4.6.0-3
bouncycastle-api:2.20
branch-api:2.6.2
caffeine-api:2.9.1-23.v51c4e2c879c8
checks-api:1.7.0
cloudbees-folder:6.15
configuration-as-code:1.50
credentials-binding:1.24
credentials:2.4.1
display-url-api:2.3.4
durable-task:1.36
echarts-api:5.1.0-2
font-awesome-api:5.15.3-2
git-client:3.6.0
git-server:1.9
git:3.11.0
handlebars:3.0.8
jackson2-api:2.12.3
jira:3.1.3
jquery3-api:3.6.0-1
jsch:0.1.55.2
junit:1.49
kubernetes-client-api:4.13.3-1
kubernetes-credentials:0.8.0
kubernetes:1.27.2
lockable-resources:2.10
mailer:1.34
matrix-auth:2.6.4
matrix-project:1.18
momentjs:1.1.1
pipeline-build-step:2.13
pipeline-graph-analysis:1.10
pipeline-input-step:2.12
pipeline-milestone-step:1.3.2
pipeline-model-api:1.8.4
pipeline-model-definition:1.7.2
pipeline-model-extensions:1.8.4
pipeline-rest-api:2.19
pipeline-stage-step:2.5
pipeline-stage-tags-metadata:1.8.4
pipeline-stage-view:2.19
plain-credentials:1.7
plugin-util-api:2.2.0
popper-api:1.16.1-2
scm-api:2.6.4
script-security:1.77
snakeyaml-api:1.27.0
ssh-credentials:1.18.1
structs:1.23
trilead-api:1.0.13
variant:1.4
workflow-aggregator:2.6
workflow-api:2.42
workflow-basic-steps:2.22
workflow-cps-global-lib:2.19
workflow-cps:2.90
workflow-durable-task-step:2.39
workflow-job:2.40
workflow-multibranch:2.24
workflow-scm-step:2.12
workflow-step-api:2.23
workflow-support:3.8


We are using the Jenkins master-slave architecture in our project to run 80 tasks every 15 minutes. Jenkins runs in a Kubernetes cluster and uses the Kubernetes plugin to launch dynamic agents.

We want the agent nodes to keep running so that they can process these tasks, so we set idleMinutes to 15. However, this seems to make Jenkins run out of memory. Below is the memory information for Jenkins after running for 4 days. The young generation grows very fast. With no tasks running, it grows about 1-2 MB per second. With 80 tasks running at the same time, it grows about 6 GB in one minute and triggers 2 young GCs. A full GC happens about every 40-50 minutes, but each full GC releases very little memory. It’s clear that Jenkins is leaking memory.
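For anyone trying to reproduce these numbers, here is a minimal sketch of how heap growth and GC counts can be sampled from inside a JVM using the standard `java.lang.management` API. The class name `HeapSampler` and the one-second sampling interval are illustrative assumptions; the report does not say which tool produced the figures above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative sampler (hypothetical name); not part of Jenkins. */
final class HeapSampler {

    /** Heap bytes currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Total collections across all collectors since JVM start. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long previous = usedHeapBytes();
        // Assumption: five samples, one second apart, is enough for a quick look.
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1000);
            long current = usedHeapBytes();
            System.out.printf("used=%d MB, delta=%d KB, gcCount=%d%n",
                    current >> 20, (current - previous) >> 10, totalGcCount());
            previous = current;
        }
    }
}
```

This only measures the JVM it runs in, so for a live Jenkins controller the same MXBean calls would have to run inside that process, or an external tool such as `jstat -gcutil <pid>` could be used instead.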

We took a heap dump of Jenkins. Below are the memory leak suspects. DelayBufferedOutputStream objects take more than 700 MB, and CpsFlowExecution objects take 100 MB. Both are referenced from a java.util.HashMap$Node[] instance. We suspect that each entry represents one slave node, and that if the slave node is not destroyed, the related objects cannot be released.
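The retention pattern suspected here can be sketched in isolation. The example below is purely hypothetical and is not Jenkins code: `NodeBufferRegistry`, its methods, and the use of `ByteArrayOutputStream` as a stand-in for the buffered stream are all assumptions made for illustration. It shows how buffers stored in a `HashMap` keyed by node stay strongly reachable, and so survive every full GC, until the node's entry is removed.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry; illustrates the suspected pattern only, not Jenkins internals. */
final class NodeBufferRegistry {
    // Each entry keeps its buffer strongly reachable for as long as the key stays in the map.
    private final Map<String, ByteArrayOutputStream> buffersByNode = new HashMap<>();

    /** Appends output to the buffer associated with a node, creating it on first use. */
    void append(String nodeName, byte[] output) {
        buffersByNode
                .computeIfAbsent(nodeName, k -> new ByteArrayOutputStream())
                .write(output, 0, output.length);
    }

    /** Drops the node's entry; only after this can the GC reclaim the buffer. */
    void nodeRemoved(String nodeName) {
        buffersByNode.remove(nodeName);
    }

    /** Sum of bytes still held by all registered buffers. */
    long retainedBytes() {
        long total = 0;
        for (ByteArrayOutputStream buffer : buffersByNode.values()) {
            total += buffer.size();
        }
        return total;
    }
}
```

If something like this were happening, a long-lived agent would accumulate data across every build it runs, while a short-lived agent would release it on removal. That would match the difference observed between the two idleMinutes settings, but whether Jenkins actually behaves this way is not established in this report.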

We also tried setting idleMinutes to 5. The agent nodes are then destroyed after completing their tasks, and Jenkins memory usage is normal.

Does keeping Jenkins agent nodes running all the time cause memory leaks? Can anyone help with this? Thanks in advance.


image-2023-01-09-16-41-46-739.png
84 kB
2023-01-09 08:41
image-2023-01-09-16-42-21-216.png
227 kB
2023-01-09 08:42

Relates to: JENKINS-71970 Memory leak due to channel listeners that are never cleared (Closed)
