[JENKINS-37057] Node that runs out of disk space for slave jar cache is never reported as such

      Setting aside the fact that Jenkins does not explicitly monitor the partition holding the slave jar cache, the problem goes undetected even when that partition happens to be monitored. My cache is on the workspace partition, but because there is no disk space left, remoting fails to cache the jar and therefore fails to proceed with essentially any request. This can actually prevent the disk space monitor from running and taking the slave temporarily offline, so the slave appears to be up in Jenkins for weeks even though it is not able to do much of anything.

      Jul 16, 2016 4:02:42 AM hudson.remoting.JarCacheSupport$1 run
      WARNING: Failed to resolve a jar 52667741e0b2a0765f4c585875cef3de
      java.io.IOException: Failed to write to /mnt/hudson_workspace/.slave-jar-cache/52/667741E0B2A0765F4C585875CEF3DE.jar
      	at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:112)
      	at hudson.remoting.JarCacheSupport$1.run(JarCacheSupport.java:64)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: No space left on device
      	at java.io.UnixFileSystem.createFileExclusively(Native Method)
      	at java.io.File.createTempFile(File.java:2001)
      	at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:69)
      	... 5 more
      

      What makes it worse is that, most of the time (on RHEL 7), the attached cause does not point to No space left on device but reports a generic No such file or directory instead. This is likely related to JENKINS-36947: when there is no disk space to create the directory, No such file or directory is reported. If the directory already exists, or there is just enough room for an empty directory, No space left on device is reported when the file is created.
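      To illustrate how the directory-creation failure could be surfaced instead of being masked by a later No such file or directory, here is a minimal sketch. It is not the remoting code: the class CacheDirSketch and the helper createCacheTempFile are hypothetical names, and the sketch assumes that the misleading message comes from a failed directory creation going unnoticed. It relies on java.nio.file.Files.createDirectories, which throws an IOException when the directory cannot be created, whereas File.mkdirs() only returns false.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CacheDirSketch {

    /**
     * Hypothetical helper (not part of remoting): creates the parent
     * directory of the cache entry and then a temporary file inside it.
     * A failure to create the directory is reported on its own, so it
     * cannot show up later as a generic "No such file or directory".
     */
    public static Path createCacheTempFile(Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        try {
            // Throws IOException (with the underlying reason) on failure.
            Files.createDirectories(parent);
        } catch (IOException e) {
            throw new IOException("Failed to create cache directory " + parent, e);
        }
        // Reached only when the directory exists; a full disk would now
        // fail here, at file creation time.
        return Files.createTempFile(parent, target.getFileName().toString(), ".tmp");
    }
}
```

      With this shape, the two failure points described above produce two distinct exceptions, each carrying the original cause.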


          Oliver Gondža added a comment - IMHO the slave jar cache should be able to skip caching when the disk is full (or when some other problem prevents it from caching jars), so that the slave can at least operate in a degraded mode. That would help in the reported situation, as it would allow Jenkins to find out about the problem.
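          The degraded mode suggested in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions, not a patch for remoting: DegradedJarCacheSketch, ResolvedJar and resolve are hypothetical names, the jar content is assumed to be already available as a byte array, and how the caller would load classes from uncached bytes is out of scope.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DegradedJarCacheSketch {
    private static final Logger LOGGER =
            Logger.getLogger(DegradedJarCacheSketch.class.getName());

    /** Hypothetical result type: the jar bytes plus the cache file, if any. */
    public static final class ResolvedJar {
        public final byte[] content;
        public final Path cachedFile; // null when caching was skipped

        ResolvedJar(byte[] content, Path cachedFile) {
            this.content = content;
            this.cachedFile = cachedFile;
        }
    }

    /**
     * Tries to store the jar in the cache directory. Any I/O problem
     * (full disk, unwritable directory, ...) is logged and the jar is
     * returned uncached, so the caller can keep working.
     */
    public static ResolvedJar resolve(Path cacheDir, String checksum, byte[] jarBytes) {
        try {
            Files.createDirectories(cacheDir);
            Path target = cacheDir.resolve(checksum + ".jar");
            Files.write(target, jarBytes);
            return new ResolvedJar(jarBytes, target);
        } catch (IOException e) {
            LOGGER.log(Level.WARNING,
                    "Jar cache unavailable, continuing without caching " + checksum, e);
            return new ResolvedJar(jarBytes, null);
        }
    }
}
```

          A caller can check cachedFile for null to detect the degraded mode; since requests keep being served, the disk space monitor would still get a chance to run and report the full partition.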

            Assignee: Unassigned
            Reporter: Oliver Gondža (olivergondza)