cloudbees-disk-usage-simple-plugin doesn't work on read-only JENKINS_HOME

    • 203.v3f46a_7462b_1a_

      When running Jenkins from a read-only JENKINS_HOME, disk usage collection won't work, since it wants to touch the JENKINS_HOME directory.

      This happens when running Jenkins in a containerized environment with a read-only root filesystem, where all the writable paths (e.g. workspace, cache, tmp) are mounted from writable filesystems.

      Would it be possible to opt out of that touch on JENKINS_HOME?

          [JENKINS-66197] cloudbees-disk-usage-simple-plugin doesn't work on read-only JENKINS_HOME

          Tom Wieczorek added a comment - - edited

          hareldev I guess you could simply run the Jenkins Docker image using docker run --read-only=true ... and mount the subdirs that need to be writable via multiple -v /path/to/writable/subdir:/var/lib/jenkins/subdir:rw. That's sort of the setup I've used back in the day when I encountered this problem.

          Edit: Maybe just mounting the whole JENKINS_HOME dir should be enough to trigger this, IIRC. So docker run -v /path/to/local/jenkins/dir:/var/lib/jenkins:rw ... should already be enough to reproduce this. The dir itself will then be read-only, but all the files and folders inside are writable...

          Harel Hadad added a comment -

          I tried running it using these commands:
          https://github.com/hareldev/cmd-utils/blob/main/jenkins/running-jenkins-locally.md

          It seems like /var/jenkins_home is still writable by user jenkins, and "touchable" -

          jenkins@9de1183d81f0:/$ touch /var/jenkins_home/
          jenkins@9de1183d81f0:/$ 

          Tom Wieczorek added a comment - - edited

          The environment where the issue was observed was a Jenkins controller in a Kubernetes pod that had readOnlyRootFilesystem: true and securityContext.fsGroup: 1000, and /var/jenkins_home was a volume mount to a PersistentVolume backed by ceph-csi. That somehow resulted in a setup in which touching /var/jenkins_home wasn't possible, but everything else worked flawlessly. I don't have access to that environment anymore and I cannot reproduce it locally using a hostpath provisioner instead of a real ceph volume. Using the hostpath provisioner yields a writable JENKINS_HOME... ¯_(ツ)_/¯

          Steve Boardwell added a comment -

          pierrebtz would it be possible to touch something like config.xml instead of the root directory? Or possibly another common file? This works without a problem:

          new hudson.FilePath(Jenkins.instance.getRootPath(), 'config.xml').touch(System.currentTimeMillis())
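For readers without a Jenkins script console at hand, the idea of touching a file inside the home directory rather than the directory itself can be sketched in plain Java. This is only an illustration under stated assumptions: it uses java.nio.file as a stand-in for hudson.FilePath.touch, the class name TouchSketch and the temporary directory are hypothetical, and it is not the plugin's code.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical stand-in for the Groovy one-liner above; not the plugin's code.
final class TouchSketch {

    /** Creates the file if it is missing, then sets its last-modified time. */
    static void touch(Path target, long timestampMillis) throws IOException {
        try {
            Files.createFile(target);
        } catch (FileAlreadyExistsException alreadyThere) {
            // The file exists already; only its timestamp needs updating.
        }
        Files.setLastModifiedTime(target, FileTime.fromMillis(timestampMillis));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory plays the role of JENKINS_HOME in this sketch.
        Path home = Files.createTempDirectory("jenkins-home-sketch");
        Path config = home.resolve("config.xml");

        // Touch a file under the root instead of the root directory itself.
        touch(config, System.currentTimeMillis());
        System.out.println("Touched " + config + " at " + Files.getLastModifiedTime(config));
    }
}
```

The point of the sketch is only that the timestamp update targets a file under the root, so the root directory entry itself is never modified.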

          Pierre Beitz added a comment -

          sboardwell AFAIU the code is here to detect as fast as possible if there is an issue writing to the JENKINS_HOME, so I assume any file under the directory is good enough.

          I'd rather create a new file dedicated to this purpose instead of fiddling with a file that this plugin doesn't own though.

          The main blocker for this task has always been that we do not have a reproduction environment to test the fix. Do you happen to have one?

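The "dedicated file" idea from the comment above could look roughly like the following plain-Java sketch. It is an assumption-laden illustration, not the change that was eventually merged: the marker name .disk-usage-write-probe, the class WriteProbeSketch, and the boolean return value are all hypothetical. Whether creating a new file succeeds in the environment described earlier in this thread is likewise not confirmed by the discussion.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical write probe; the names here are illustrative assumptions.
final class WriteProbeSketch {

    // Assumed marker file name, owned by the probe and nothing else.
    static final String PROBE_FILE_NAME = ".disk-usage-write-probe";

    /** Returns true if a dedicated marker file under root can be created and touched. */
    static boolean canWriteUnder(Path root) {
        Path probe = root.resolve(PROBE_FILE_NAME);
        try {
            try {
                Files.createFile(probe);
            } catch (FileAlreadyExistsException alreadyThere) {
                // Reuse the marker left behind by an earlier probe.
            }
            Files.setLastModifiedTime(probe, FileTime.fromMillis(System.currentTimeMillis()));
            return true;
        } catch (IOException | SecurityException writeFailed) {
            // Any failure to create or touch the marker counts as "not writable".
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path home = Files.createTempDirectory("jenkins-home-probe");
        System.out.println("Writable: " + canWriteUnder(home));
    }
}
```

Using a marker that only the probe owns avoids modifying files such as config.xml that belong to Jenkins core, which is the concern raised in the comment.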

          Oliver Lockwood added a comment -

          This issue is a pain point for us since migrating from a fixed instance to using the Jenkins Helm chart in a K8s environment. We've already had the Jenkins controller disk fill up once, which was a hassle to resolve after the fact, so it would be good to be able to use this plugin to proactively monitor things better!

          twz123 how did you apply your remove-fs-freeze-check.patch - did you compile the plugin locally, or is there some superior approach I haven't yet thought of?

          pierrebtz I have a reproduction environment with which I can test the fix, and am happy to do the testing on your behalf, if you supply an appropriate candidate plugin version for me to try.

          Pierre Beitz added a comment -

          oliverlockwood sboardwell I created https://github.com/jenkinsci/cloudbees-disk-usage-simple-plugin/pull/99, feel free to give it a try. I'll wait for feedback before merging since I don't have a reproducer.

          Oliver Lockwood added a comment -

          pierrebtz Merry Christmas to you!

          Thanks for providing the PR. I built and deployed it to our Jenkins manually, and it works a treat - no longer any errors on the Jenkins pod's console log, and I can now usefully analyse the disk usage of Jenkins through the UI of this plugin.

          It would be fantastic if this PR could be merged and a new version of the plugin released to include this change.

          Thanks again.

          Pierre Beitz added a comment -

          oliverlockwood Merry Christmas to you too! Thanks for taking the time to test. Fix was just released as part of version https://github.com/jenkinsci/cloudbees-disk-usage-simple-plugin/releases/tag/203.v3f46a_7462b_1a_. It should be visible in your Update Center in 2 to 3 hours.

          Oliver Lockwood added a comment -

          Confirmed as fixed with the above version installed from the Update Center. Thanks again.

            pierrebtz Pierre Beitz
            twz123 Tom Wieczorek