When the summary of changes is displayed, Unicode characters are rendered incorrectly.

      Browser encoding is set to Unicode.

        1. Capture.PNG (13 kB)
        2. Jenkins status.PNG (8 kB)
        3. jenkins summary.PNG (3 kB)
        4. mercurial.hpi (107 kB)

          [JENKINS-17353] problem with summary encoding

          Jesse Glick added a comment -

          Unfortunately Mercurial does not define any encoding (for file contents, filenames, or metadata such as user names and comments), so it is impossible for Jenkins to know what encoding was “meant” by the people committing changes, or indeed if they all agree on what encoding to use. My only advice is to use UTF-8 everywhere. Not sure if this is fixable.

          Justinas Urbanavicius added a comment -

          I commit through the NetBeans IDE and am not sure which encoding it uses. I presume it uses UTF-8, as that would be the sensible choice, and the RhodeCode UI for Mercurial displays these messages correctly. So I think the problem is on the Jenkins side, and you should definitely use UTF-8 by default.

          Also, Mercurial does define an encoding; it defaults to UTF-8:

          UTF-8 strings are used to store most repository metadata. Unlike repository contents, repository metadata is 'owned and managed' by Mercurial and can be made to conform to its rules. In particular, this includes:

          commit messages stored in the changelog
          user names
          tags
          branches

          The following files are stored in UTF-8:

          .hgtags
          .hg/branch
          .hg/branchheads.cache
          .hg/tags.cache
          .hg/bookmarks

          Jesse Glick added a comment -

          The current system seems to date to https://github.com/jenkinsci/mercurial-plugin/commit/a0ff0f7 started by http://jenkins.361315.n4.nabble.com/Mercurial-plugin-MalformedByteSequenceException-while-parsing-changelog-xml-td367631.html but I am not sure what it was intended to do.

          Dealing with Unicode characters in filenames correctly in all cases is probably impossible, but it sounds like it should be possible to handle Unicode in user names and commit messages.

          Justinas Urbanavicius added a comment -

          Sounds good, I will wait for a fix.
          File names are not my main concern, as all file names in my repository use only Latin alphabet characters.

          Jesse Glick added a comment -

          Using NetBeans in (US-English) Windows XP I made a commit using Czech characters. They were not preserved in hg log on Ubuntu or even on the same XP machine. So I doubt I can reproduce your setup. I can only run the hg log command under the assumption that everything it produces is UTF-8. Probably this will cause problems for someone else, but too bad.
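
          As an illustration of what "assume everything is UTF-8" means in practice, here is a minimal Java sketch. It is not the plugin's actual code; the class and method names are hypothetical. It shows that bytes which really are UTF-8 decode cleanly, while bytes written in some other encoding either turn into U+FFFD replacement characters (lenient decoding) or are rejected (strict decoding).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of the Mercurial plugin.
final class Utf8Changelog {

    // Lenient: malformed byte sequences are replaced with U+FFFD instead of failing.
    static String decodeLenient(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Strict: malformed byte sequences raise CharacterCodingException.
    static String decodeStrict(byte[] raw) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw))
                .toString();
    }

    public static void main(String[] args) {
        // A Czech sample word, written with escapes so the source file stays ASCII.
        String message = "\u017Dlu\u0165ou\u010Dk\u00FD";
        byte[] asUtf8 = message.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = "\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(decodeLenient(asUtf8).equals(message));
        System.out.println(decodeLenient(asLatin1).contains("\uFFFD"));
    }
}
```

          The lenient variant never fails, which matches the "too bad" trade-off described above: commits written in another encoding still show up, but with replacement characters in place of the original text.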

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          src/main/java/hudson/plugins/mercurial/MercurialSCM.java
          http://jenkins-ci.org/commit/mercurial-plugin/20116c4e40be73e4146af27931c54c5ef57be4e2
          Log:
          [FIXED JENKINS-17353] Assume UTF-8 encoding for metadata in changelog.xml.

          Justinas Urbanavicius added a comment -

          Hmm, that's strange. As I said, RhodeCode reads the commit messages just fine and displays them correctly, and it runs on Debian Linux.

          Jesse Glick added a comment -

          See if the attached update works for you.

          Justinas Urbanavicius added a comment -

          Works as expected; see the attached screenshots. Thanks.

          Jesse Glick added a comment -

          Good. I tried to make the plugin ask Mercurial to emit the changelog in UTF-8, replacing unknown characters if unsupported; not sure what effect this will have. It is notoriously difficult to test this sort of thing, especially on Windows, because it is hard to tell where and how characters are being converted to bytes or vice versa.
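
          One way to ask Mercurial for UTF-8 output with replacement is through its HGENCODING and HGENCODINGMODE environment variables. The Java sketch below shows that approach. Whether the plugin does exactly this is an assumption, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not the plugin's actual code.
final class HgLogCommand {

    // Builds (but does not start) an "hg log" process whose output should be UTF-8.
    static ProcessBuilder utf8HgLog(String... extraArgs) {
        List<String> command = new ArrayList<>();
        command.add("hg");
        command.add("log");
        command.addAll(Arrays.asList(extraArgs));

        ProcessBuilder builder = new ProcessBuilder(command);
        Map<String, String> env = builder.environment();
        // Ask Mercurial to transcode metadata to UTF-8 on output.
        env.put("HGENCODING", "UTF-8");
        // Substitute a placeholder for characters that cannot be converted,
        // rather than aborting the command.
        env.put("HGENCODINGMODE", "replace");
        return builder;
    }

    public static void main(String[] args) {
        ProcessBuilder builder = utf8HgLog("--limit", "1");
        System.out.println(builder.command());
        System.out.println(builder.environment().get("HGENCODING"));
    }
}
```

          The caller would then read the process output as UTF-8. The sketch only prepares the command, so it can be inspected without Mercurial installed.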

            jglick Jesse Glick
            gameshas Justinas Urbanavicius