Improvement
Resolution: Unresolved
Minor
When a pipeline schedules a pod agent that is not immediately schedulable (due to node selectors, available resources, ...), the pod stays pending and the Jenkins node is created and suspended, waiting for the agent to connect.
If the pipeline is then aborted, the Jenkins agent is not terminated right away. It stays around until either the slaveConnectTimeout (Timeout in seconds for Jenkins connection, defaults to 1000 seconds since https://github.com/jenkinsci/kubernetes-plugin/commit/252f8d1e2cf3ed71f3a2c3694eff08b7fa1004c7 / https://github.com/jenkinsci/kubernetes-plugin/releases/tag/kubernetes-1.29.6) or the agent retentionTimeout (Container Cleanup Timeout, defaults to 5 minutes) kicks in.
- If the agent cannot be scheduled, it never connects and never becomes idle, so the retentionTimeout does not apply. The agent is terminated only after the slaveConnectTimeout.
- If the agent is eventually scheduled, it connects and the retentionTimeout eventually terminates it.
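The two cases above can be sketched as a small timing model. This is only an illustration, not the plugin's code: the function name and the constants are hypothetical, the default values are the ones quoted in this issue, and it assumes the retention timeout starts counting from the moment the agent connects (the build is already aborted, so the agent is idle immediately).

```python
from typing import Optional

# Defaults as quoted in this issue (names are illustrative, not plugin identifiers).
SLAVE_CONNECT_TIMEOUT_S = 1000      # "Timeout in seconds for Jenkins connection"
RETENTION_TIMEOUT_S = 5 * 60        # "Container Cleanup Timeout", 5 minutes


def seconds_until_termination(connects_after_s: Optional[float]) -> float:
    """Estimate how long an orphaned agent lingers after pod creation.

    connects_after_s: seconds after pod creation at which the agent
    connects, or None if the pod can never be scheduled.
    Assumption: once connected, the agent is idle right away because
    the requesting build has already been aborted.
    """
    never_connects = (
        connects_after_s is None
        or connects_after_s >= SLAVE_CONNECT_TIMEOUT_S
    )
    if never_connects:
        # Case 1: retention never applies; only the connect timeout ends it.
        return float(SLAVE_CONNECT_TIMEOUT_S)
    # Case 2: the agent connects, then sits idle until retention expires.
    return connects_after_s + RETENTION_TIMEOUT_S


if __name__ == "__main__":
    print(seconds_until_termination(None))   # unschedulable pod
    print(seconds_until_termination(120))    # pod scheduled after 2 minutes
```

Under this simplified model, the agent outlives the aborted build by several minutes in both cases, which is the gap this issue asks to close.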
This issue is about improving this behavior by trying to terminate the agent earlier when the requesting pipeline build is aborted or completed, especially for dynamic pod templates, which hold a reference to the build.
Reproduce
- Run a pipeline like the following:
    // Uses Declarative syntax to run commands inside a container.
    pipeline {
        agent {
            kubernetes {
                cloud 'local'
                yaml '''
                    apiVersion: v1
                    kind: Pod
                    spec:
                      nodeSelector:
                        dedicated: doesnotexist
                '''
            }
        }
        stages {
            stage('Main') {
                steps {
                    sh 'echo "OK"'
                }
            }
        }
    }

- Abort the pipeline after a few seconds.
- Notice that the pod / agent hangs around for a while before being deleted. With the default configuration it takes more than 16 minutes (1000s after being created).