PluginManager.dynamicLoad on installed plugin fails on NFS with IOException not RestartRequiredException

    • Bug
    • Resolution: Fixed
    • Minor
    • core
    • linux, nfs, jenkins 1.450, any plugin

      This question came up on the IRC channel; it was initially discussed at the Jenkins booth at FOSDEM.

      Updating plugins doesn't work when the Jenkins home directory is located on NFS. The failure looks like this:

      Caused by: java.io.IOException: Unable to delete /var/lib/jenkins/plugins/maven-plugin/WEB-INF/lib/.nfs00000000000e897d00000096

      The .nfs files appear when a file on NFS is deleted while it is still open. In this case the jar files in the lib directory are still open and cannot be deleted.
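
To check whether a plugin directory currently holds such placeholder files, something like the following could be used. This is only a sketch: the class `NfsPlaceholders` and its `find` method are hypothetical helpers, not part of Jenkins, and the check relies purely on the `.nfs` name prefix seen in the error above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Jenkins. */
class NfsPlaceholders {

    /** Returns the entries directly inside dir whose names start with ".nfs". */
    static List<Path> find(Path dir) throws IOException {
        List<Path> found = new ArrayList<>();
        // The glob is matched against the file name; a leading dot is an ordinary character here.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, ".nfs*")) {
            for (Path entry : entries) {
                found.add(entry);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect, e.g. a plugin's WEB-INF/lib directory; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : find(dir)) {
            System.out.println(p);
        }
    }
}
```

If this lists any files, some process (most likely the Jenkins JVM itself) still has the original jars open.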

          [JENKINS-12753] PluginManager.dynamicLoad on installed plugin fails on NFS with IOException not RestartRequiredException

          Alex Lehmann created issue -

          Kohsuke Kawaguchi added a comment - This appears to indicate that we are trying to overwrite a plugin currently in use, and that won't work regardless, NFS or not.

          Michael Prokop added a comment - Being the one who raised this issue on IRC (thanks to Alex for reporting it here): this seems to be a locking issue caused by NFS. I can't reproduce the very same issue without NFS and the open .nfs000000... file handle indicates that there's something going wrong with file deletion.

          Alex Lehmann added a comment - edited

          The way open files are handled differs between local filesystems and NFS, so the issue may be NFS-specific. However, if Kohsuke is right, it doesn't work on local storage either; it just doesn't give an obvious error message in that case.

          If the plugin is in use, the jar files for the plugin are all loaded into the JVM and are open (this is visible with lsof). If the plugin is updated, the files are still open even though they have already been deleted (this is perfectly OK on local filesystems), e.g. like this (this is the output of lsof|grep java.*xunit.*guice as one example file):

          (removed some columns from the lsof output to fit into the page width)

          before:

          mem       REG     8,1   807021 4067923 .../plugins/xunit/WEB-INF/lib/guice-2.0.1.jar
          237r      REG     8,1   807021 4067923 .../plugins/xunit/WEB-INF/lib/guice-2.0.1.jar
          

          after:

          DEL       REG     8,1          4067923 .../plugins/xunit/WEB-INF/lib/guice-2.0.1.jar
          237r      REG     8,1   807021 4067923 .../plugins/xunit/WEB-INF/lib/guice-2.0.1.jar (deleted)
          

          The jars will not be reloaded even if a new version is written to the directory.

          On NFS, open deleted files are not directly possible, so instead each file is renamed to a .nfs file, which causes the exception when trying to delete the directory. Reloading wouldn't work either, even if the files could be deleted (or e.g. were moved to another directory and deleted later).

          When a plugin is updated and the directory is on a local filesystem, the update page still says "xunit plugin is already installed. Jenkins needs to be restarted for the update to take effect" with a yellow ball, so the update process notices that it cannot update the files and suggests a restart. With NFS it never gets to the stage where the message appears, but the problem is the same in both cases.
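
The local-filesystem behaviour described above (a file that is deleted but still readable through an open handle) can be reproduced with a small sketch like this one. It assumes a Linux-style local filesystem and Java 9 or later; the class `OpenDeletedFile` and its method are made up for illustration and have nothing to do with Jenkins code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical demo class, not part of Jenkins. */
class OpenDeletedFile {

    /**
     * Deletes the file while a stream on it is still open, then returns
     * whatever can still be read through that stream.
     */
    static String readAfterDelete(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // On a local POSIX filesystem this only removes the directory entry;
            // the data stays reachable until the last open handle is closed.
            Files.delete(file);
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("open-deleted", ".jar");
        Files.write(file, "old jar contents".getBytes(StandardCharsets.UTF_8));
        String data = readAfterDelete(file);
        System.out.println("still readable after delete: " + data);
        System.out.println("path still exists: " + Files.exists(file));
    }
}
```

On NFS the client has to emulate this by renaming the open file to a .nfs placeholder, which is why the containing directory can no longer be emptied while the JVM holds the jars open.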

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-8550 [ JENKINS-8550 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-22205 [ JENKINS-22205 ]

          Jesse Glick added a comment -

          It seems that the problem has been misstated because the bug itself masks the problem. When you already have a plugin installed, you cannot dynamically load its update, ever. Normally Jenkins stops you from trying to do so by throwing a RestartRequiredException. Unfortunately when PluginManager.dynamicLoad is checking whether the plugin is already installed, it calls ClassicPluginStrategy.createPluginWrapper which actually unpacks the *.hpi right away, and when on a filesystem with locks, that fails—and the IOException then is reported as if this were the real error, when the error is simply a user misunderstanding.
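
One way to read this diagnosis is as an ordering problem: the "already installed" check should run before anything touches the filesystem. The sketch below illustrates that ordering only. It is not the actual Jenkins code or patch; `DynamicLoadSketch`, `RestartRequired`, and `Unpacker` are simplified stand-ins invented here, and it assumes the plugin's short name is known without unpacking the archive.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Set;

/** Hypothetical, simplified model of the check ordering; not Jenkins code. */
class DynamicLoadSketch {

    /** Stand-in for Jenkins' RestartRequiredException. */
    static class RestartRequired extends Exception {
        RestartRequired(String message) {
            super(message);
        }
    }

    /** Stand-in for the unpack step, which can fail with an IOException on NFS. */
    interface Unpacker {
        void explode(String shortName) throws IOException;
    }

    /**
     * Rejects an already installed plugin before unpacking, so a failure
     * in the unpack step cannot hide the real reason.
     */
    static void dynamicLoad(String shortName, Set<String> installed, Unpacker unpacker)
            throws IOException, RestartRequired {
        if (installed.contains(shortName)) {
            throw new RestartRequired(shortName + " is already installed; a restart is required");
        }
        unpacker.explode(shortName);
    }

    public static void main(String[] args) throws IOException {
        Set<String> installed = Collections.singleton("credentials");
        try {
            // The unpacker always fails here to mimic the NFS case.
            dynamicLoad("credentials", installed, name -> {
                throw new IOException("Unable to delete files of " + name);
            });
        } catch (RestartRequired e) {
            System.out.println("restart required: " + e.getMessage());
        }
    }
}
```

With this ordering the user sees the restart message on NFS as well, instead of an IOException from the delete.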

          Jesse Glick made changes -
          Component/s New: core [ 15593 ]
          Component/s Original: plugin [ 15491 ]
          Labels New: lock plugin
          Summary Original: Instant update of plugins doesn't work in nfs directory New: PluginManager.dynamicLoad on installed plugin fails on NFS with IOException not RestartRequiredException

          Jesse Glick added a comment -

          Here is a typical stack trace:

          java.nio.file.FileSystemException: /…/plugins/credentials/WEB-INF/lib/.nfs000000000000…: Device or resource busy 
          	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91) 
          	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
          	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
          	at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244) 
          	at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) 
          	at java.nio.file.Files.delete(Files.java:1077) 
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
          	at java.lang.reflect.Method.invoke(Method.java:606) 
          	at hudson.Util.deleteFile(Util.java:238) 
          	at hudson.Util.deleteRecursive(Util.java:301) 
          	at hudson.Util.deleteContentsRecursive(Util.java:203) 
          	at hudson.Util.deleteRecursive(Util.java:300) 
          	at hudson.Util.deleteContentsRecursive(Util.java:203) 
          	at hudson.Util.deleteRecursive(Util.java:292) 
          	at hudson.Util.deleteContentsRecursive(Util.java:203) 
          	at hudson.Util.deleteRecursive(Util.java:292) 
          	at hudson.ClassicPluginStrategy.explode(ClassicPluginStrategy.java:423) 
          	at hudson.ClassicPluginStrategy.createPluginWrapper(ClassicPluginStrategy.java:128) 
          	at hudson.PluginManager.dynamicLoad(PluginManager.java:412) 
          	at hudson.model.UpdateCenter$InstallationJob._run(UpdateCenter.java:1300)
          

          Jesse Glick made changes -
          Assignee New: Jesse Glick [ jglick ]

            Assignee: Jesse Glick (jglick)
            Reporter: Alex Lehmann (alexlehm)