[JENKINS-36637] GIT changelog invalid char 0x1b (escape) in XML

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component: git-plugin
    • Environment: jenkins-2.9-1.1, Git plugin 2.5.2, Oracle Linux

      Our Jenkins server is running jenkins-2.9-1.1. Git plugin 2.5.2 is installed.

      I have a Jenkins job that I use to get the changelog of our GIT repos.

      In the job's Source Code Management section, I selected multiple SCMs. I added 4 GIT repos (horizon, keystone, glance & base). In each GIT section, I added the GIT repo info (repo URL, credentials, branches to build).

      For Additional Behaviors, I added
      Calculate changelog against a specific branch
      Check out to a sub-directory
      Custom SCM name

      E.g.

      • Branch to build refs/heads/next
      • Calculate changelog against a specific branch
        Name of repository: refs
        Name of branch: tags/<tag_name>

      Issue:
      If I ran the job with just 1 GIT repo, it returned the changelog.
      If I included just 3 of the 4 repos (run individually or together in the same job), it returned the changelog for each repo.
      If I included all 4 repos, no changelog was returned.

      So, it looked like 1 of these 4 repos (openstack-base in this case) couldn't get the changelog.

      What I found out:
      In the build dir, I could see the changelog.xml file. It contained the changelogs of all 4 repos:
      <sub-log scm="SCM_NAME_HORIZON">
      <sub-log scm="SCM_NAME_KEYSTONE">
      <sub-log scm="SCM_NAME_GLANCE">
      <sub-log scm="SCM_NAME_BASE">

      The problem is that when I added all 4 repos to the job, it should have returned the changelogs of 3 repos, but it returned no changelog for any of the 4, not even for the 3 repos that are supposed to return something when run individually or together. It looked like the one repo with the changelog issue (openstack-base) caused the other 3 to not report their changelogs either.

      I went to the Jenkins log (/var/log/jenkins/jenkins.log) and it complained about the changelog having an invalid XML character:

      Jul 08, 2016 5:11:41 PM hudson.model.AbstractBuild calcChangeSet
      WARNING: Failed to parse /var/lib/jenkins/jobs/kt_test1/builds/112/changelog.xml
      org.xml.sax.SAXParseException; systemId: file:/var/lib/jenkins/jobs/kt_test1/builds/112/changelog.xml; lineNumber: 7241; columnNumber: 18; An invalid XML character (Unicode: 0x1b) was found in the CDATA section.
          at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
          at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
          at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
          at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
          at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1438)
          at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(XMLDocumentFragmentScannerImpl.java:1691)
          at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3017)
          at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
          at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
          at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
          at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
          at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
          at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
          at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
          at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
          at javax.xml.parsers.SAXParser.parse(SAXParser.java:328)
          at org.jenkinsci.plugins.multiplescms.MultiSCMChangeLogParser.parse(MultiSCMChangeLogParser.java:158)
          at hudson.model.AbstractBuild.calcChangeSet(AbstractBuild.java:911)
          at hudson.model.AbstractBuild.access$600(AbstractBuild.java:105)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:616)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
          at hudson.model.Run.execute(Run.java:1720)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:98)
          at hudson.model.Executor.run(Executor.java:410)
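
The parser failure in the log above can be reproduced in isolation. The following is a minimal sketch, not the Multiple SCMs plugin's code: it uses Python's built-in expat-based parser rather than Xerces, and the `changelog` root element and the message text are made up for illustration (only `sub-log` and its `scm` attribute appear in this report). It suggests why one bad sub-log can hide all four: a single raw 0x1b character makes the whole document ill-formed, so the parser rejects the file as a whole instead of skipping the affected section.

```python
import xml.etree.ElementTree as ET

SCM_NAMES = ("SCM_NAME_HORIZON", "SCM_NAME_KEYSTONE", "SCM_NAME_GLANCE", "SCM_NAME_BASE")


def build_changelog(messages):
    """Build a simplified, hypothetical multi-SCM changelog document."""
    parts = [
        f'<sub-log scm="{scm}"><![CDATA[{message}]]></sub-log>'
        for scm, message in messages.items()
    ]
    return "<changelog>" + "".join(parts) + "</changelog>"


def count_sub_logs(xml_text):
    """Return the number of sub-log elements, or None if the document fails to parse."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return len(root.findall("sub-log"))


# Four clean sub-logs: the document parses and all four are visible.
GOOD = build_changelog({scm: "Commit: example" for scm in SCM_NAMES})

# Same document, but the last sub-log contains a raw escape character (0x1b).
messages = {scm: "Commit: example" for scm in SCM_NAMES}
messages["SCM_NAME_BASE"] = "\x1b[1;35;40mCommit:\x1b[m example"
BAD = build_changelog(messages)

print(count_sub_logs(GOOD))
print(count_sub_logs(BAD))
```

In this sketch the clean document yields four sub-logs, while the document with one escape character yields nothing at all, which matches the behavior described above.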
      

      The line it is complaining about is:

          meta:release:^[[1;35;40m^[[KCommit:^[[m^[[K Davanum Srinivas &lt;davanum@gmail.com&gt;
      

      It looks like characters such as ^[ (the 0x1b escape character named in the parser error) are invalid XML characters. Other lines contain these invalid characters as well.
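
To find every affected line rather than only the first one the parser reports, the text can be scanned against the character range that XML 1.0 allows. This is a sketch under assumptions: the function name is hypothetical, the sample string is the line quoted above with each `^[` replaced by a real escape character, and the column numbers it reports are plain 1-based character offsets that may not match the Xerces `columnNumber`.

```python
import re

# Complement of the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Yield (line, column, code point) for each character XML 1.0 forbids.

    Lines are split on "\n" only, because str.splitlines() would also split
    on some of the control characters this function is meant to report.
    """
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in INVALID_XML_CHARS.finditer(line):
            yield line_no, match.start() + 1, ord(match.group())


# The offending line from the report, with "^[" written as the real ESC byte.
SAMPLE = "meta:release:\x1b[1;35;40m\x1b[KCommit:\x1b[m\x1b[K Davanum Srinivas"

for line_no, column, code_point in find_invalid_xml_chars(SAMPLE):
    print(f"line {line_no}, column {column}: 0x{code_point:02x}")
```

Only the escape characters are reported here. The `&lt;` and `&gt;` entities and the `^` and `[` characters themselves are ordinary text and are allowed in XML.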

      We could write a script to remove those invalid characters after the job is finished, but how would we then display the changelog the same way we see it in the job build?

      What is the best way to get the GIT plugin to clean up, remove, or ignore those invalid characters when calculating the changelog, so that the finished job displays the changelog properly without additional cleanup tasks?
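
The cleanup script mentioned above could look roughly like the following. It is a sketch of the workaround only, not a fix in the Git plugin: the function names are hypothetical, it assumes the file is UTF-8 text, and nothing in this report establishes whether Jenkins would pick up a rewritten changelog.xml for a build that has already finished. It first removes complete ANSI CSI sequences (such as `ESC[1;35;40m` and `ESC[K`), then drops any remaining character that XML 1.0 forbids.

```python
import re

# ANSI CSI sequence: ESC [ , parameter bytes, intermediate bytes, final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

# Any character outside the XML 1.0 "Char" production.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_changelog_text(text):
    """Strip ANSI color sequences, then any leftover XML-invalid characters."""
    text = ANSI_CSI.sub("", text)
    return INVALID_XML_CHARS.sub("", text)


def sanitize_changelog_file(source_path, target_path):
    """Write a cleaned copy of source_path to target_path (assumes UTF-8)."""
    with open(source_path, encoding="utf-8", newline="") as source:
        cleaned = sanitize_changelog_text(source.read())
    with open(target_path, "w", encoding="utf-8", newline="") as target:
        target.write(cleaned)


# The offending line from the report, with "^[" written as the real ESC byte.
SAMPLE = "meta:release:\x1b[1;35;40m\x1b[KCommit:\x1b[m\x1b[K Davanum Srinivas"
print(sanitize_changelog_text(SAMPLE))
```

Removing the whole CSI sequence first matters: stripping only the escape byte would leave fragments such as `[1;35;40m` behind in the commit message.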

          Mark Waite added a comment (edited)

          I think I've seen a similar problem with a single repository using interesting strings in the changelogs. Refer to jenkins-bugs branch JENKINS-36637 for the sample that I created.

          That job definition is not complete enough yet to detect the failure itself, but when I run that job and view the changes, it shows a single row in the list of changes, even though there were hundreds of changes (as constructed from the sample strings repository).

          I don't think there is anything you can do to fix it. It will need a fix in the git plugin or the git client plugin.

          Mark Waite added a comment

          After further (recent) exploration, I've found that the JGit implementation seems to handle my test case better than the command line git implementation.

          If you need to show exotic characters in a change log, you may want to consider the JGit implementation instead of command line git.

            Assignee: Unassigned
            Reporter: Kitty Tam (kittywtam)
            Votes: 0
            Watchers: 4