
Git. Problem with encoding in the list changes

      Everything works correctly, except that the list of changes is displayed with garbled characters (01.png).
      In the repository description the same text is displayed correctly (02.png).
      In the IDE it is displayed correctly too (03.png).
      I previously used a very old version of the Git plugin (1.5.0), which did not have this problem. It appeared after I upgraded to the latest version.

      Log:

       
      Building in workspace C:\.jenkins\jobs\build_test\workspace
       > C:\Program Files (x86)\Git\bin\git.exe rev-parse --is-inside-work-tree
      Fetching changes from the remote Git repository
       > C:\Program Files (x86)\Git\bin\git.exe config remote.origin.url http://jenkins:jenkinsgit@localhost:8080/gitblit/git/Selenium.git
      Fetching upstream changes from http://jenkins@localhost:8080/gitblit/git/Selenium.git
       > C:\Program Files (x86)\Git\bin\git.exe --version
      using .gitcredentials to set credentials
       > C:\Program Files (x86)\Git\bin\git.exe config --local credential.helper store --file=\"C:\tomcat\temp\git2127178133677507983.credentials\"
       > C:\Program Files (x86)\Git\bin\git.exe fetch --tags --progress http://jenkins@localhost:8080/gitblit/git/Selenium.git +refs/heads/*:refs/remotes/origin/*
       > C:\Program Files (x86)\Git\bin\git.exe config --local --remove-section credential
       > C:\Program Files (x86)\Git\bin\git.exe rev-parse "origin/master^{commit}"
      Checking out Revision 7c81cb7e96e8359d2c1e45bffcef296c0e8a9051 (origin/master)
       > C:\Program Files (x86)\Git\bin\git.exe config core.sparsecheckout
       > C:\Program Files (x86)\Git\bin\git.exe checkout -f 7c81cb7e96e8359d2c1e45bffcef296c0e8a9051
       > C:\Program Files (x86)\Git\bin\git.exe rev-list 7c81cb7e96e8359d2c1e45bffcef296c0e8a9051
       > C:\Program Files (x86)\Git\bin\git.exe tag -a -f -m Jenkins Build #224 jenkins-build_test-224
      Parsing POMs
      

        Attachments:
        1. 00_build_test Changes [Jenkins].htm (36 kB)
        2. 01.png (36 kB)
        3. 02.png (30 kB)
        4. 03.png (31 kB)
        5. 2014-09-01 12-56-12 Скриншот экрана.png (25 kB)
        6. result.htm (29 kB)

          [JENKINS-23091] Git. Problem with encoding in the list changes

          Mark Waite added a comment - - edited

          As far as I can tell from the git commit man page, the contents of the git commit log are assumed to be UTF-8. That assumed character encoding is overridden if i18n.commitencoding is set.

          The value of i18n.logoutputencoding is allowed to alter the encoding coming from the "git log", "git show" and "git blame" commands, allowing an override of the i18n.commitencoding value.

          Unfortunately, I don't think there are currently any tests in the git-client-plugin or the git-plugin which test interesting combinations of those variables. It seems like a good area for a parameterized JUnit test.


          hayarobi Park added a comment - - edited

          My English is not very good, so please excuse me.

          The simplest workaround is to choose JGit (not git.exe) in the Git installations setting.

          CliGitAPIImpl launches the external git.exe executable and reads the log messages from its stdout stream (in CliGitAPIImpl.java, lines 648-660).

          git.exe prints the changelog in UTF-8. It seems that the launcher (or something in MS Windows) assumes the message is encoded in the OS's current encoding (CP949 in my case) and tries to convert it from CP949 to Unicode, which turns normal UTF-8 text into garbled text. CliGitAPIImpl then receives this garbled text.

          I don't know yet exactly which module does the bad conversion, but some tests might give me a few clues.

          In PowerShell, running the command below produces the same garbled text.
          D:\git\cloned> git.exe whatchanged --no-abbrev -M --pretty=raw > Idontlikethis.txt

          Adding the --encoding option makes git.exe print in the given encoding, and running it as below produces the correct message. Test with your own language's encoding.
          D:\git\cloned> git.exe whatchanged --no-abbrev -M --pretty=raw --encoding=cp949 > IwantThisResult.txt
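The mis-decoding described in this comment can be reproduced without git or Jenkins. The following is a minimal Python sketch, not the plugin's Java code. The sample commit message is made up for illustration, and CP949 stands in for whatever the OS code page happens to be.

```python
# Illustration only: plain string handling, not the plugin's Java code.
# The sample message is an assumption made for this sketch.
message = "한글 커밋 메시지"

# What git.exe writes to stdout by default: UTF-8 bytes.
utf8_bytes = message.encode("utf-8")

# What a reader gets when it assumes the OS code page (CP949 here).
# errors="replace" keeps the sketch from raising on invalid sequences.
garbled = utf8_bytes.decode("cp949", errors="replace")

# What the same reader gets when git is asked for CP949 output,
# which is the effect of --encoding=cp949.
cp949_bytes = message.encode("cp949")
matched = cp949_bytes.decode("cp949")

# What a reader gets when it decodes the default output as UTF-8.
explicit = utf8_bytes.decode("utf-8")

print("wrong decoder changes the text:", garbled != message)
print("CP949 on both sides preserves it:", matched == message)
print("UTF-8 on both sides preserves it:", explicit == message)
```

The text survives only when the reader's decoder matches the encoding git actually wrote. The --encoding workaround changes what git writes, while a reader-side fix would change how the bytes are decoded.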


          hayarobi Park added a comment -

          I found a blog post that seems to be related to this issue.

          https://rkeithhill.wordpress.com/2010/05/26/handling-native-exe-output-encoding-in-utf8-with-no-bom/


          Era Tolekov added a comment -

          I guess JENKINS-6203 is the same issue.


          Era Tolekov added a comment -

          Another workaround is to set i18n.commitEncoding and i18n.logOutputEncoding. For example,

          git config --global i18n.commitEncoding cp949
          git config --global i18n.logOutputEncoding cp949


          Gennady Trafimenkov added a comment -

          I might have a working fix for this issue. Here is the fix: https://github.com/gennady/git-client-plugin/commit/aef7fff3ff765e2f8fd2b270d89e3f6b462cc2de

          Please give it a try if you don't mind.

          You can compile the plugin yourself with

          mvn package

          or try the precompiled version: https://github.com/gennady/git-client-plugin/raw/8383bd7c222b52e26b0d1b395b2eb26766f86cf7/compiled-plugin/git-client.hpi

          How to try it:

          • stop Jenkins
          • remove git-client, git-client.hpi, and git-client.jpi from the plugins folder
          • copy the new git-client.hpi to the plugins folder
          • start Jenkins
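The file operations in the "how to try" steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name replace_git_client_plugin is hypothetical, Jenkins must already be stopped before it runs, and the demo uses a temporary directory with placeholder files rather than a real Jenkins plugins folder.

```python
import shutil
import tempfile
from pathlib import Path


def replace_git_client_plugin(plugins_dir, new_hpi):
    """Swap the installed git-client plugin for the given .hpi file.

    Hypothetical helper. Assumes Jenkins has already been stopped.
    """
    plugins = Path(plugins_dir)
    # Remove the unpacked plugin directory and any packaged copies.
    for name in ("git-client", "git-client.hpi", "git-client.jpi"):
        target = plugins / name
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()
    # Copy the new build into place under the expected file name.
    destination = plugins / "git-client.hpi"
    shutil.copyfile(new_hpi, destination)
    return destination


# Demo against a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    plugins = Path(tmp) / "plugins"
    (plugins / "git-client").mkdir(parents=True)
    (plugins / "git-client" / "old.txt").write_text("old")
    (plugins / "git-client.jpi").write_text("old")
    new_build = Path(tmp) / "download.hpi"
    new_build.write_text("new")

    replace_git_client_plugin(plugins, new_build)
    remaining = sorted(p.name for p in plugins.iterdir())
    installed_content = (plugins / "git-client.hpi").read_text()

print(remaining)
```

Starting Jenkins again afterwards is left to whatever service manager or script the installation normally uses.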

          Vasily Pupkin added a comment -

          Gennady, thanks for the fix.
          But I cannot try it, because I now work at a different place. Perhaps someone else will try your fix.


          Jiří Engelthaler added a comment -

          The fix works perfectly for me with Czech commit messages. I hope the plugin will be officially available soon.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Gennady Trafimenkov
          Path:
          src/main/java/org/jenkinsci/plugins/gitclient/CliGitAPIImpl.java
          src/test/java/org/jenkinsci/plugins/gitclient/GitAPITestCase.java
          src/test/resources/unicodeCharsInChangelogRepo.zip
          src/test/resources/unicodeCharsInChangelogRepoCreate.sh
          http://jenkins-ci.org/commit/git-client-plugin/c99c91fcf497e784204398761be5c10f438d0e55
          Log:
          Fixed garbled commit messages on Windows

          On windows changelog commit messages with unicode characters are
          not saved correctly to changelog.xml when CliGitAPI
          implementation is in use.

          That happens because "git whatchanged" gives byte stream of data.
          Commit messages in that stream are encoded in UTF-8. It is
          necessary to explicitly decode bytestream to strings using UTF-8
          encoding, otherwise default system encoding will be used.

          This should fix issues:
          https://issues.jenkins-ci.org/browse/JENKINS-6203
          https://issues.jenkins-ci.org/browse/JENKINS-14798
          https://issues.jenkins-ci.org/browse/JENKINS-23091
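The commit message says that the byte stream from "git whatchanged" has to be decoded explicitly as UTF-8 instead of relying on the system default. A minimal Python sketch of that idea follows. It is not the plugin's Java code: the helper name run_and_decode and the sample message are assumptions made for illustration, and a child Python process stands in for git so the example runs without a repository.

```python
import subprocess
import sys


def run_and_decode(command, encoding="utf-8"):
    """Run a command and decode its stdout with an explicit encoding.

    Hypothetical helper. Reading raw bytes and decoding them here avoids
    any dependence on the platform's default encoding.
    """
    completed = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    return completed.stdout.decode(encoding)


# Made-up sample message with non-ASCII characters.
MESSAGE = "Oprava chyby: příliš žluťoučký kůň"

# Stand-in for git: a child process that writes the message as UTF-8 bytes.
# The bytes are passed as ASCII hex so the command line stays ASCII-only.
child_code = (
    "import sys; sys.stdout.buffer.write(bytes.fromhex('%s'))"
    % MESSAGE.encode("utf-8").hex()
)

decoded = run_and_decode([sys.executable, "-c", child_code])
print("explicit UTF-8 decoding preserves the text:", decoded == MESSAGE)
```

Keeping the encoding as a parameter with a UTF-8 default mirrors the reasoning in the commit message: the producer's encoding is known, so the consumer states it rather than inheriting whatever the operating system is configured with.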

          Mark Waite added a comment -

          Fix included in git client plugin 1.19.3, released 6 Feb 2016


            Assignee: Unassigned
            Reporter: Vasily Pupkin (babyroot)
            Votes: 0
            Watchers: 10