Node that runs out of disk space for slave jar cache is never reported as such

      Setting aside that the partition is not explicitly monitored by Jenkins, the problem would go undetected even if it were. My cache is on the workspace partition, but once there is no disk space left, remoting fails to cache the jar and, essentially, to proceed with any request. This can actually prevent the disk space monitor from running and taking the slave temporarily offline, so the node appears up in Jenkins for weeks even though it cannot do much of anything.

      Jul 16, 2016 4:02:42 AM hudson.remoting.JarCacheSupport$1 run
      WARNING: Failed to resolve a jar 52667741e0b2a0765f4c585875cef3de
      java.io.IOException: Failed to write to /mnt/hudson_workspace/.slave-jar-cache/52/667741E0B2A0765F4C585875CEF3DE.jar
      	at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:112)
      	at hudson.remoting.JarCacheSupport$1.run(JarCacheSupport.java:64)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: No space left on device
      	at java.io.UnixFileSystem.createFileExclusively(Native Method)
      	at java.io.File.createTempFile(File.java:2001)
      	at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:69)
      	... 5 more
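
In the trace above, the disk-full condition is visible only as the nested cause of a generic "Failed to write" exception. As a rough sketch of how a caller might recognize it, the helper below walks the cause chain looking for that message. The class and method names (`DiskFullDetector`, `isDiskFull`) are hypothetical and not part of remoting, and matching on the message text is an assumption: the wording depends on the OS and locale.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Jenkins remoting. */
final class DiskFullDetector {
    // Message text as it appears in the trace above (assumed; OS- and locale-dependent).
    private static final String NO_SPACE = "No space left on device";

    /** Returns true if any exception in the cause chain carries the disk-full message. */
    static boolean isDiskFull(Throwable t) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            String msg = t.getMessage();
            if (msg != null && msg.contains(NO_SPACE)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild the shape of the logged exception: generic wrapper, specific cause.
        IOException cause = new IOException("No space left on device");
        IOException outer = new IOException("Failed to write to /tmp/example.jar", cause);
        System.out.println("disk full: " + isDiskFull(outer));
    }
}
```

A check like this could let the node be flagged instead of only logging a warning, though it would still miss the case where the cause is reported as No such file or directory.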
      

      What makes it worse, the attached cause does not point to No space left on device most of the time (RHEL 7) and reports a generic No such file or directory instead. This is likely related to JENKINS-36947: when there is no disk space to create the directory, No such file or directory is reported. If the directory exists, or there is just enough room for an empty directory, No space left on device is reported when the file is created.
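
Since the reported cause may be the misleading No such file or directory, one way to tell the two cases apart is to ask the filesystem directly how much space is usable on the partition that would hold the cache directory. The sketch below does that with the standard `java.nio.file` API. The class `CacheDirDiagnostics`, its methods, and the threshold are hypothetical illustrations, not remoting code; the path is the one from the log above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical diagnostic helper for illustration; not part of Jenkins remoting. */
final class CacheDirDiagnostics {
    /** Walks up from the given path to the closest ancestor that exists on disk. */
    static Path nearestExistingAncestor(Path path) {
        Path current = path.toAbsolutePath().normalize();
        while (current != null && !Files.exists(current)) {
            current = current.getParent();
        }
        return current; // null only if not even the root exists
    }

    /** Usable bytes on the partition that would hold the given (possibly missing) path. */
    static long usableBytesFor(Path path) throws IOException {
        Path existing = nearestExistingAncestor(path);
        if (existing == null) {
            throw new IOException("No existing ancestor for " + path);
        }
        return Files.getFileStore(existing).getUsableSpace();
    }

    /** Builds a message that names low disk space even when the directory is missing. */
    static String describe(Path cacheDir, long minFreeBytes) throws IOException {
        long usable = usableBytesFor(cacheDir);
        boolean missing = !Files.isDirectory(cacheDir);
        if (usable < minFreeBytes) {
            return (missing ? "Cache directory missing and partition" : "Partition")
                    + " low on space: " + usable + " bytes usable for " + cacheDir;
        }
        return missing ? "Cache directory missing: " + cacheDir : "OK: " + cacheDir;
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the log above; the 10 MiB threshold is an arbitrary example.
        Path cacheDir = Paths.get("/mnt/hudson_workspace/.slave-jar-cache/52");
        System.out.println(describe(cacheDir, 10L * 1024 * 1024));
    }
}
```

Checking the nearest existing ancestor matters here because the failing directory itself may never have been created, which is exactly the situation that produces No such file or directory.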

          [JENKINS-37057] Node that runs out of disk space for slave jar cache is never reported as such

          Oliver Gondža created issue -
          Daniel Beck made changes -
          Link New: This issue is related to JENKINS-36947 [ JENKINS-36947 ]
          Oleg Nenashev made changes -
          Labels New: diagnostics

            Assignee: Unassigned
            Reporter: Oliver Gondža (olivergondza)