Jenkins / JENKINS-67979

Durable task fails to stop (cleanly) in case of disk full


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: durable-task-plugin
    • Labels: None
    • Environment: Jenkins LTS 2.319.3
      Plugin: Durable Task 493.v195aefbb0ff2

      A job that uses Docker was running and consuming more and more disk space.
      As a result, the Jenkins disk partition filled up.
      Affected Jenkins setup: no separate build agent; Docker (/var/lib/docker) and Jenkins (/var/lib/jenkins) share the same partition.
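To check whether a host has the same layout as described above, one option is to compare the device IDs of the two directories. This is only a minimal sketch: the helper name `same_filesystem` is made up for illustration, and the paths are the ones from this report, which may differ on other installations.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Paths taken from the report; adjust them for your host.
    docker_dir = "/var/lib/docker"
    jenkins_dir = "/var/lib/jenkins"
    if os.path.exists(docker_dir) and os.path.exists(jenkins_dir):
        shared = same_filesystem(docker_dir, jenkins_dir)
        print(f"{docker_dir} and {jenkins_dir} share a filesystem: {shared}")
    else:
        print("One of the directories does not exist on this host.")
```

If the check prints True, Docker images and build data compete for the same free space, which is the situation that triggered this issue.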

      If the job's step had exited and the script had terminated, the next step would have freed disk space.
      This is likely NOT a trivial problem to solve: I guess that executing the next step might fail as well (no disk space to create the temp script for the next command?).
      As far as I can tell, the process was stopped properly, BUT Jenkins did not acknowledge that the process had failed and instead logged a lot of errors, which consumed even more log/disk space.

      Suggested improvement: better handling of a nearly full disk, e.g. not starting build jobs if less than x GiB of disk space is free.
      Docker was called from the command line, so the Jenkins Docker plugin should not matter (IMHO).
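The suggested guard could look roughly like the sketch below. It is not how Jenkins or the Durable Task plugin works today; `has_enough_space`, `min_free_gib`, and the 5 GiB threshold are hypothetical names and values chosen for illustration. The `usage` parameter exists only so the check can be exercised without a real full disk.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def has_enough_space(path: str, min_free_gib: float, usage=shutil.disk_usage) -> bool:
    """Return True if the filesystem holding `path` has at least min_free_gib GiB free."""
    free_bytes = usage(path).free
    return free_bytes >= min_free_gib * GIB


if __name__ == "__main__":
    # Hypothetical pre-build check; the threshold is an arbitrary example value.
    workspace = "."
    if has_enough_space(workspace, min_free_gib=5):
        print("Enough free space, the build may start.")
    else:
        print("Refusing to start the build: disk is nearly full.")
```

A check like this only helps before a build starts. It would not stop a running job from filling the disk, which is why isolating Docker's storage is still the more robust fix.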

      I'll now change the Jenkins setup, and isolate the docker daemon's disk usage. Just thought it might help to report the problem.

      Excerpt from the build job's log showing the problem:

      wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
      23:52:49 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
      23:52:49 Cannot contact : java.io.IOException: No space left on device
      23:57:49 wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
      23:57:49 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
      00:02:49 wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
       ...
      09:57:51  wrapper script does not seem to be touching the log file in /var/lib/jenkins/jobs/jobname/workspace/source/docker/builder@tmp/durable-e1e330b5
      09:57:51  (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
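The timestamps in the excerpt suggest that the warning repeats about every five minutes for roughly ten hours. A small sketch like the following could confirm the cadence from a saved console log. The function name `warning_intervals` is made up, and the sketch assumes each relevant line starts with an HH:MM:SS timestamp as in the excerpt.

```python
from datetime import datetime, timedelta

MARKER = "wrapper script does not seem to be touching the log file"


def warning_intervals(lines):
    """Return the seconds between consecutive timestamped warning lines."""
    stamps = []
    for line in lines:
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[1].startswith(MARKER):
            try:
                stamps.append(datetime.strptime(parts[0], "%H:%M:%S"))
            except ValueError:
                continue  # first token is not a timestamp, skip the line
    intervals = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = cur - prev
        if delta < timedelta(0):
            # Assumption: a negative gap means the clock passed midnight once.
            delta += timedelta(days=1)
        intervals.append(int(delta.total_seconds()))
    return intervals
```

Lines without a leading timestamp, such as the first warning in the excerpt, are ignored by this sketch.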
      

      I don't think this is security critical, but it could allow denial of service on Jenkins instances that build public pull requests (I don't assume that this is a usual use case or that it is covered by the threat model). In other words, you have to trust your developers not to be evil.

      I can provide more information on request.

            Assignee: Unassigned
            Reporter: jo da (joda)
            Votes: 0
            Watchers: 3
