Type: Bug
Resolution: Unresolved
Priority: Major
Labels: None
Environment: Jenkins LTS 2.319.3
Plugin: Durable Task 493.v195aefbb0ff2
A job using docker is running and consuming more and more disk space. As a result, the Jenkins disk partition fills up.
Affected Jenkins setup: no separate build slave; docker (/var/lib/docker) and jenkins (/var/lib/jenkins) share the same partition.
If the job's step had exited and the script had terminated, the next step would have freed disk space.
This is likely NOT a trivial problem to solve, as I guess executing the next step might fail (no disk space to create the temp script for the next command?).
As far as I can tell the process was stopped properly, BUT Jenkins did not acknowledge that the process had failed and instead logged a lot of errors that consumed even more log/disk space.
Improvement: better disk-space-nearly-full management that avoids starting build jobs if less than x GiB of disk space is free.
Docker was called on the command line, so the Jenkins docker plugin does not matter (IMHO).
I'll now change the Jenkins setup and isolate the docker daemon's disk usage. I just thought it might help to report the problem.
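The disk-space-nearly-full guard suggested above could be sketched as a pre-build shell step. This is a minimal sketch, not part of any existing plugin; the mount point argument and the 5 GiB threshold are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical pre-build guard: refuse to start the build if the given
# partition (pass e.g. /var/lib/jenkins) has less than MIN_FREE_GIB GiB free.
MOUNT="${1:-/}"
MIN_FREE_GIB="${MIN_FREE_GIB:-5}"

# df -P prints POSIX-format output; column 4 is available space in 1K blocks.
avail_kib=$(df -P -k "$MOUNT" | awk 'NR==2 {print $4}')
avail_gib=$((avail_kib / 1024 / 1024))

if [ "$avail_gib" -lt "$MIN_FREE_GIB" ]; then
    echo "Only ${avail_gib} GiB free on ${MOUNT}; refusing to build." >&2
    exit 1
fi
echo "Disk check passed: ${avail_gib} GiB free on ${MOUNT}."
```

Running something like this as the first step of each job would make it fail fast instead of filling the partition mid-build.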
Excerpt of the build job's log showing the problem:
{code}
wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
23:52:49 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
23:52:49 Cannot contact : java.io.IOException: No space left on device
23:57:49 wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
23:57:49 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
00:02:49 wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
...
09:57:51 wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
09:57:51 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
{code}
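The JENKINS-48300 hint in the log refers to a JVM system property on the Jenkins controller. As a side note, on a systemd-managed install it could be set via a drop-in override; the drop-in path below is an assumption about the install layout, and 86400 s is simply the value the log message suggests:

{code}
# /etc/systemd/system/jenkins.service.d/override.conf  (path is an assumption)
[Service]
Environment="JAVA_OPTS=-Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400"
{code}

A `systemctl daemon-reload` and a Jenkins restart would be needed afterwards. This only quiets the heartbeat warnings; it does not address the underlying disk-full hang.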
I don't think this is security critical, but it could allow denial of service on Jenkins instances that build public pull requests (I don't assume that is a usual use case or covered in the threat model, i.e. "trust your developers to not be evil").
I can provide more information on request.
[JENKINS-67979] Durable task fails to stop (cleanly) in case of disk full
Summary | Original: Durable task fails to stop in case of disk full | New: Durable task fails to stop (cleanly) in case of disk full |