jgit clean after checkout fails with unmappable chars in path

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major
    • git-client-plugin
    • None
    • Ubuntu 12.04 LTS amd64, using jdk7 from the distribution, jenkins v1.556

      Started by user anonymous
      Building in workspace /var/lib/jenkins/jobs/ext-jgit-3.3.0/workspace
      Cloning the remote Git repository
      Cloning repository https://github.com/eclipse/jgit.git
      ERROR: Failed to clean the workspace
      java.io.IOException: java.lang.reflect.InvocationTargetException
      at hudson.Util.isSymlinkJava7(Util.java:360)
      at hudson.Util.isSymlink(Util.java:325)
      at hudson.Util.deleteRecursive(Util.java:291)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:327)
      at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:845)
      at hudson.plugins.git.GitSCM.checkout(GitSCM.java:878)
      at hudson.model.AbstractProject.checkout(AbstractProject.java:1320)
      at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
      at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
      at hudson.model.Run.execute(Run.java:1688)
      at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:519)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:231)
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at hudson.Util.isSymlinkJava7(Util.java:355)
      ... 20 more
      Caused by: java.nio.file.InvalidPathException: Malformed input or input contains unmappable chacraters: /var/lib/jenkins/jobs/ext-jgit-3.3.0/workspace/org.eclipse.jgit.java7.test/target/tmp_7536346264520663349/??
      at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
      at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
      at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
      at java.io.File.toPath(File.java:2186)
      ... 24 more
      ERROR: Error cloning remote repo 'origin'
      hudson.plugins.git.GitException: Failed to delete workspace
      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:330)
      at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:845)
      at hudson.plugins.git.GitSCM.checkout(GitSCM.java:878)
      at hudson.model.AbstractProject.checkout(AbstractProject.java:1320)
      at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
      at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
      at hudson.model.Run.execute(Run.java:1688)
      at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:519)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:231)
      Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
      at hudson.Util.isSymlinkJava7(Util.java:360)
      at hudson.Util.isSymlink(Util.java:325)
      at hudson.Util.deleteRecursive(Util.java:291)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at hudson.Util.deleteRecursive(Util.java:292)
      at hudson.Util.deleteContentsRecursive(Util.java:203)
      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:327)
      ... 10 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at hudson.Util.isSymlinkJava7(Util.java:355)
      ... 20 more
      Caused by: java.nio.file.InvalidPathException: Malformed input or input contains unmappable chacraters: /var/lib/jenkins/jobs/ext-jgit-3.3.0/workspace/org.eclipse.jgit.java7.test/target/tmp_7536346264520663349/??
      at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
      at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
      at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
      at java.io.File.toPath(File.java:2186)
      ... 24 more
      ERROR: null
      Retrying after 10 seconds

          [JENKINS-22434] jgit clean after checkout fails with unmappable chars in path

          Mr Cinquero added a comment - - edited

          Yeah, I just checked it again, and it seems java.io.File always operates on a java.lang.String, i.e. on a form that is independent of the on-disk representation, and it always uses the environment settings (or the file.encoding system property) to choose the actual filename representation when interacting with the filesystem layer. That is nuts, because it seemingly cannot handle files with badly encoded names. One could think of falling back to a generic charset like ISO-8859-1 to handle such cases, at least when the filename's correct encoding is not an issue...

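The charset argument in this comment can be illustrated without touching the filesystem. The following is a minimal sketch, not code from Jenkins or the git-client-plugin: the class name `FilenameRoundTrip` and the sample bytes are made up for illustration. It checks whether a raw byte sequence, such as a file name as stored on disk, survives a strict decode and re-encode in a given charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class FilenameRoundTrip {

    /** Returns true if rawName decodes strictly in cs and re-encodes to the same bytes. */
    static boolean roundTrips(byte[] rawName, Charset cs) {
        try {
            String decoded = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(rawName))
                    .toString();
            return Arrays.equals(rawName, decoded.getBytes(cs));
        } catch (CharacterCodingException e) {
            return false; // the bytes are not a valid name in this charset
        }
    }

    public static void main(String[] args) {
        // Two bytes that are a valid ISO-8859-1 name but not valid US-ASCII or UTF-8.
        byte[] rawName = {(byte) 0xE4, (byte) 0xF6};
        System.out.println("US-ASCII:   " + roundTrips(rawName, StandardCharsets.US_ASCII));
        System.out.println("ISO-8859-1: " + roundTrips(rawName, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + roundTrips(rawName, StandardCharsets.UTF_8));
    }
}
```

ISO-8859-1 assigns a character to every byte value, which is why it can carry any byte sequence through a String unchanged. Whether that makes it a safe default for a whole Jenkins process is a separate question this sketch does not answer.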

          Mr Cinquero added a comment -

          I have created a test case to verify the issue that is possibly at hand here:

          https://github.com/jjYBdx4IL/filenameenc

          There is also a topic on Stack Overflow, but the people there don't get it yet; hopefully they will with the example on GitHub.

          Mark Waite added a comment -

          Doesn't your test case show that the problem is in the JDK, rather than in Jenkins or in JGit or the git-client-plugin?

          Mr Cinquero added a comment - - edited

          Maybe. Or it may just be an indication that we need a workaround, because otherwise the problem won't go away, possibly never, because the JLS guys may decide it is a feature.

          I propose to use some platform-specific means for removing such badly encoded files, i.e. run "rm -f $strange-filename" on Linux, and the equivalent on Windows. I guess that there are probably already libraries around doing exactly that, though I'm not aware of any.

          Another solution would be to let Jenkins always run with a full 8-bit charset as the default charset, i.e. one that has no invalid characters/holes in the binary representation per se, though I'm not sure what undesirable side effects that would bring about. Thinking about it further, I guess that this was no problem in the old days before Unicode, and it is rather a side effect that came with the introduction of Unicode...

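One way the platform-specific proposal could look on Linux is sketched below. This is an assumption-laden illustration, not what Jenkins does: the class `NativeDelete` and the method `deleteWithRm` are hypothetical names, and the sketch sidesteps the unrepresentable file name by handing `rm` the parent directory, whose path is assumed to be representable in the JVM's charset. It also assumes an `rm` binary on the PATH.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, for illustration only.
final class NativeDelete {

    /**
     * Deletes dir and everything below it by delegating to the native rm command.
     * Returns true if rm exited with 0 and the directory is gone afterwards.
     */
    static boolean deleteWithRm(File dir) {
        try {
            Process p = new ProcessBuilder("rm", "-rf", "--", dir.getAbsolutePath())
                    .inheritIO()   // show rm's own error messages on the console
                    .start();
            return p.waitFor() == 0 && !dir.exists();
        } catch (IOException e) {
            return false;          // rm could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "native-delete-demo");
        new File(dir, "nested/inner").mkdirs();
        System.out.println("deleted: " + deleteWithRm(dir));
    }
}
```

Because `rm` reads the directory entries itself, the child names never pass through a Java String. A Windows equivalent would need a different command and is not shown.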

          Mr Cinquero added a comment - - edited

          OK, I found a possible solution: is the old java.io.* API still being used? If so, upgrade to java.nio.*, especially when reading filenames from disk. The java.io.File.listFiles() method replaces invalid chars with the default unknown-char character of the active charset, whereas java.nio.file.Files.newDirectoryStream() does not do such a thing (at least apparently, see https://github.com/jjYBdx4IL/filenameenc/blob/master/src/main/java/filenameenc/Test.java).

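For readers who want to compare the two listing APIs named in this comment, here is a small sketch that lists the same directory both ways so the returned names can be diffed. It is not taken from the linked test case; the class and method names are hypothetical. It requires Java 7 or later, and it does not by itself confirm the substitution behaviour described above, which depends on the locale the JVM runs under.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ListBothWays {

    /** Child names as reported by the old java.io API, sorted. */
    static List<String> viaJavaIo(File dir) {
        List<String> names = new ArrayList<String>();
        File[] children = dir.listFiles(); // null if dir is not a directory or on I/O error
        if (children != null) {
            for (File f : children) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Child names as reported by java.nio.file, sorted. */
    static List<String> viaJavaNio(File dir) {
        List<String> names = new ArrayList<String>();
        // Note: dir.toPath() itself throws InvalidPathException if dir's own name is unmappable.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir.toPath())) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("java.io : " + viaJavaIo(dir));
        System.out.println("java.nio: " + viaJavaNio(dir));
    }
}
```

Running it once with LANG=C and once with a UTF-8 locale against a directory that contains non-ASCII names would show whether the two APIs diverge on a given JDK.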

          Mark Waite added a comment -

          Unfortunately, I don't think we can use java.nio.file.Files.newDirectoryStream(), since it is only available in JDK 7 and later. Jenkins continues to support JDK 6 so that we don't drop users from platforms they are using, even though those platforms are only supported by open source versions of Java. JDK 5 support was only dropped from Jenkins about a year ago, and there are still many users running Java 6 on their Jenkins servers.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Mark Waite
          Path: src/test/java/org/jenkinsci/plugins/gitclient/GitAPITestCase.java
          http://jenkins-ci.org/commit/git-client-plugin/de4a3b2d46cab70613b8606ff364efeb630279b9
          Log:
          Use interesting UTF-8 characters in the "clean" test

          Attempting to confirm JENKINS-20410 and JENKINS-22434.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Mark Waite
          Path: src/test/java/org/jenkinsci/plugins/gitclient/GitAPITestCase.java
          http://jenkins-ci.org/commit/git-client-plugin/18e4719b593a873fcf3b6d7b7a7c4861b146ec22
          Log:
          Another test of JENKINS-22434, shows failure if LANG=C

          Mark Waite added a comment -

          As far as I can tell from the tests I've written, the only case that shows a failure mode like this is when I run with LANG=C, use a hudson.FilePath object to represent the Unicode-named directory on the disk, and attempt to delete it recursively. You mentioned that "always run with a full 8-bit charset as the default charset" might be a viable workaround, and if I use that workaround (by running with my usual en_US.UTF-8 locale), the failure does not appear.

          Is that sufficient to close this bug report?

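To check which locale settings a Jenkins JVM actually picked up, a small diagnostic along the following lines may help. It is a sketch under assumptions: `LocaleCheck` is a hypothetical name, the sample string stands in for any non-ASCII file name, and `sun.jnu.encoding` is an undocumented, implementation-specific property that may be absent on some JVMs. On newer JDKs the default charset may also no longer follow the locale.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic, for illustration only.
final class LocaleCheck {

    /** Returns true if every character of name can be encoded in cs. */
    static boolean encodable(String name, Charset cs) {
        return cs.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        String name = "\u00e4\u00f6"; // stand-in for a non-ASCII file name
        System.out.println("LANG             = " + System.getenv("LANG"));
        System.out.println("default charset  = " + Charset.defaultCharset());
        // Undocumented property; may print null on some JVMs.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("encodable in default charset: "
                + encodable(name, Charset.defaultCharset()));
        System.out.println("encodable in US-ASCII:        "
                + encodable(name, StandardCharsets.US_ASCII));
    }
}
```

If the default charset cannot encode the names that appear in a workspace, starting Jenkins under a UTF-8 capable locale, as described in this comment, is the workaround to try first.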

          Mark Waite added a comment -

          After a month without a response, I'm closing this as "Won't Fix". The test cases have been added to the plugin, but the workaround (use a UTF-8 capable locale) seems sufficient without attempting other code changes.

            Assignee: Nicolas De Loof (ndeloof)
            Reporter: Mr Cinquero (marc321)