
Jenkins should create xml 1.1 output in order to support control characters that are illegal in xml 1.0

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Component: core

      The current implementation of XmlFile.java emits an XML 1.0 header, which breaks operations such as Move/Copy/Promote if the user has included any characters that are illegal in XML 1.0 (such as Control-XX, etc.) in their jobs.
      XStream, which is used for serialization/deserialization, deals with XML fragments and has no trouble reading or writing this not-well-formed XML. Changing XmlFile.java so that it creates XML 1.1 will allow Jenkins config files to support these special characters. This also requires updating the underlying XML pull parser used by XStream to one that supports XML 1.1.
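
To make "illegal in XML 1.0" concrete: the XML 1.0 specification's Char production allows only tab, line feed, carriage return, and code points from U+0020 upward (excluding surrogates, U+FFFE and U+FFFF). The sketch below is a hypothetical helper, not part of Jenkins, that reports which characters in a string fall outside that set. The function name is an assumption made for illustration.

```python
import re

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_xml10_illegal_chars(text):
    """Return (index, character) pairs for characters XML 1.0 does not allow."""
    return [(m.start(), m.group()) for m in _XML10_ILLEGAL.finditer(text)]
```

A job description containing, say, a pasted Control-A character would be flagged by this check, while ordinary text, tabs, and newlines would not.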

          [JENKINS-48463] Jenkins should create xml 1.1 output in order to support control characters that are illegal in xml 1.0

          Andrew Lamonica added a comment -

          mikecirioli, yeah. That's the "workaround" mentioned in most places. I was trying to set a good example for some junior employees. But it seems that XML 1.1 is just not widely supported and has even been declared "dead".

          I had a look at the code for this change in Jenkins, and it just hard-codes the version number. So we are probably pretty safe to remedy this "hard-coded quirk" with a "hard-coded undo" in our code.

          Thanks for attempting to help.

          P.S. For people who are not attempting to do what we are doing (backing up configs), there is a JSON API that can be used in place of the XML.
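
A "hard-coded undo" of the kind described above could look roughly like the following. This is a minimal sketch, not the commenter's actual code: the function name is hypothetical, and it assumes the document body contains no characters that are illegal in XML 1.0, so only the version in the declaration needs to change before the text is handed to an XML 1.0-only parser.

```python
import re

# Matches only an XML declaration at the very start of the text
# (optionally preceded by a byte-order mark) whose version is 1.1.
_XML11_DECL = re.compile(r"""\A(\ufeff?<\?xml\s+version\s*=\s*)(["'])1\.1\2""")


def downgrade_xml_declaration(text):
    """Rewrite a leading version 1.1 declaration to 1.0; leave the rest untouched."""
    return _XML11_DECL.sub(r"\g<1>\g<2>1.0\g<2>", text, count=1)
```

Anchoring the pattern to the start of the text means a literal "1.1" elsewhere in the configuration is never altered.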

          Robert Pitt added a comment -

          As another Windows/.NET organisation, the move to XML 1.1 has impacted us as well.
          After some googling, it seems that XML 1.1 is not well supported; moving Jenkins to 1.1 is regrettable.
          Realistically, though, I don't expect this change to be rolled back, so we're going to have to hack our code.


          Baptiste Mathus added a comment -

          Not sure how and why this is impacting anything, but I would recommend you take this as an opportunity to take a step back.

          AFAIK, the XML format of Jenkins has always been considered an internal format, and it could well change again at any time in the near future, since architectural changes are currently happening to evolve Jenkins.

          So rpitt and rlamoni, I think it would be great if you could come to the mailing lists, or explain what you are currently doing that was impacted.

          For instance, if you are generating jobs, then you want to look at the Job DSL plugin. If you want to configure Jenkins, possibly look at the very active https://github.com/jenkinsci/configuration-as-code-plugin

          In any case, again, I would strongly recommend that people avoid generating Jenkins' XML, or come and explain what the use case is so that it can be supported durably, for the benefit of everyone.

          Because, well, in this particular case, updating to XML 1.1, a technology published in 2004, was probably long overdue anyway.

          Andrew Lamonica added a comment -

          batmat, I'm not sure you are helping the XML 1.1 argument when you remind people that XML 1.1 was published in 2004. Any software standard that is not well supported after more than a decade is not likely to suddenly gain widespread acceptance.

          But, as I mentioned before, it looks like Jenkins isn't really doing XML 1.1 anyway. The code is just emitting the header, then doing business as usual for the body. That's weird, but not a deal breaker. No one is going to switch away from a great technology like Jenkins just because they have to do a little text replacing when using one of the APIs.

          I'm not sure what mailing list you think I should join. But I can give a quick explanation of what we are using the XML for.

          My company tracks changes to the configurations of most of our off-the-shelf and open-source software (as well as some hardware) in something we call the "recovery system". This recovery system is a little like an automated backup, except that it can be used to easily view changes made to configurations. The system is often the first place people go when there is a mysterious service disruption, since it aggregates all changes together into a single timeline.

          The code our "recovery system" uses to extract Jenkins' configuration would be more or less agnostic to XML type (since it largely doesn't care about the format of the configuration it is backing up), except that it gets a list of things to track by inspecting the output of other XML APIs (such as jenkins.hostname/api/xml). At the time of writing, I'm not 100% sure which API calls were failing to parse, since the person on my team who fixed this did so in a very general way and it's unlikely all of them failed (given that my example doesn't have a header at all). But maybe I can provide more details after locating the mailing list.

          Cheers.
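
For tools that only need a list of things to track, the JSON API mentioned earlier in the thread sidesteps XML parsing altogether. The following is a rough sketch under stated assumptions: the function names are hypothetical, authentication and the HTTP call itself are left out, and the sample payload is made up for illustration. The `tree` query parameter is used here to limit the response to job names and URLs.

```python
import json
import urllib.parse


def jobs_api_url(base_url):
    """Build the JSON API URL that lists top-level jobs (name and url only)."""
    query = urllib.parse.urlencode({"tree": "jobs[name,url]"})
    return base_url.rstrip("/") + "/api/json?" + query


def job_names(payload):
    """Extract job names from a JSON API response body."""
    return [job["name"] for job in json.loads(payload).get("jobs", [])]


# Made-up response body, shaped like the top-level job listing.
SAMPLE = '{"jobs": [{"name": "app-a", "url": "https://jenkins.example.com/job/app-a/"}]}'
```

Here `job_names(SAMPLE)` yields the single name `app-a`; a real caller would fetch the body from `jobs_api_url(...)` with whatever HTTP client and credentials it already uses.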

          Aaron Jensen added a comment -

          Please roll this back. We automate Jenkins using .NET and it doesn't support XML 1.1. It won't even open/parse any XML 1.1 files.


          Aaron Jensen added a comment -

          If the config.xml file format is considered internal, Jenkins shouldn't have APIs that expose it. That's not internal.


          Aaron Jensen added a comment -

          Here's our use-case: Developers have certain golden paths for creating specific types of applications. We have templates in Jenkins for each type. Developers run a build to create a new application. We grab the config.xml from the template, update it to use values specific to the new job, then post that config.xml to Jenkins to create the job.

          Switching to the two plug-ins mentioned is a big under-taking. We have to learn Groovy and/or export our entire Jenkins configuration as YAML and write automation to apply it. Given enough warning, we could do it. This feature should be rolled back, then announced that it's coming with instructions on how to migrate to some way of managing job configurations without touching config.xml files. Then the config.xml file format can become internal and you can change to 1.1 if you want and this could get refiled as "allow job configuration to include control characters".
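
The template workflow described in this comment (fetch a template's config.xml, fill in job-specific values, post it back to create the job) might be sketched as follows. This is an illustration, not the commenter's implementation: the `@@KEY@@` placeholder convention and both function names are assumptions, and authentication and CSRF crumb handling are omitted. The request is built but not sent.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape


def fill_template(config_xml, values):
    """Replace hypothetical @@KEY@@ placeholders, XML-escaping each value."""
    for key, value in values.items():
        config_xml = config_xml.replace("@@" + key + "@@", escape(value))
    return config_xml


def build_create_job_request(base_url, job_name, config_xml):
    """Build (but do not send) the POST that creates a job from a config.xml."""
    query = urllib.parse.urlencode({"name": job_name})
    return urllib.request.Request(
        base_url.rstrip("/") + "/createItem?" + query,
        data=config_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
```

Because the template is treated as plain text here, the XML version in its declaration does not matter to this step; it only becomes an issue once the file is loaded into an XML 1.0-only parser.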


          Andrew Lamonica added a comment -

          splatteredbits, here is what we ended up doing: https://paste.ofcode.org/3aPEN9YyVDEcCGM2JmX8zw

          Ioannis Moutsatsos added a comment -

          Note that most editors that work fine with XML 1.0 can't parse XML 1.1. After more than a decade of successfully using the free MS utility XML Notepad for all XML tasks, I may have to look for another editor. How much demand was there for this change?

          James Howe added a comment - - edited

          Just to add, as far as I can tell you only need 1.1 if you've got control characters in names (i.e. of tags and attributes).
          Surely if people are putting them in their configuration it's in text content, where 1.0 works just fine with standard escapes.

          Not only does .NET not support xml 1.1 but neither does Firefox, nor any of my editors.


            mikecirioli (mike cirioli)
            Votes: 0
            Watchers: 9