JENKINS-53901

Using readFile does not handle UTF-8 with BOM files

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • None
    • Environment: Jenkins 2.121.2 and Jenkins 2.81 Pipeline Groovy Plugin 2.54

      I'm extracting an XML file (nuspec) from some NuGet packages and trying to parse it. In most cases this works fine, but some of the files were written as UTF-8 with a BOM, and then the parser fails and reports:

      org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
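
      A note on the error position: line 1, column 1 suggests the parser met an unexpected character before the first `<`. The sketch below uses a made-up XML snippet rather than the attached file. It shows that the JDK's UTF-8 decoder keeps the three BOM bytes (EF BB BF) as the character U+FEFF instead of dropping them. Whether readFile decodes the file this way is an assumption; nothing in this issue confirms it.

```groovy
import java.nio.charset.StandardCharsets

// Made-up sample: a tiny XML document preceded by U+FEFF.
byte[] bytes = '\uFEFF<package/>'.getBytes(StandardCharsets.UTF_8)

// The first three bytes are the UTF-8 encoding of the BOM.
String firstBytes = bytes[0..2].collect { String.format('%02X', it & 0xFF) }.join(' ')
println firstBytes

// Decoding with the plain JDK decoder keeps the BOM as a character.
String decoded = new String(bytes, StandardCharsets.UTF_8)
println decoded.startsWith('\uFEFF')
```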
      

      This is how I parse the XML:

      import groovy.xml.XmlUtil

      // Downloads a NuGet package, unpacks it and parses the nuspec file inside.
      @NonCPS
      def parsePackage(packageName, packageVersion) {
          def packageFullName = "${packageName}.${packageVersion}"
          bat "curl -L https://www.nuget.org/api/v2/package/${packageName}/${packageVersion} -o ${packageFullName}.nupkg"
          bat "unzip ${packageFullName}.nupkg -d ${packageFullName}"

          def nuspecPath = "${packageFullName}\\${packageName}.nuspec"
          // readFile returns the file content as a String
          def nuspecContent = readFile file: nuspecPath
          def nuspecXML = new XmlSlurper(false, false).parseText(nuspecContent)
          println nuspecXML.metadata.version

          // Serialize the parsed document back to an XML string
          return XmlUtil.serialize(nuspecXML)
      }
      

      It looks like readFile does not handle UTF-8 with a BOM: it passes the leading BOM character into the returned string.
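
      If the returned string does start with the BOM character, one possible workaround is to drop it before parsing. This is a minimal sketch under that assumption. `stripBom` is a hypothetical helper name, not part of Jenkins or Groovy, and the sample strings are made up.

```groovy
// Hypothetical helper: removes one leading U+FEFF if present, otherwise
// returns the input unchanged (null stays null).
String stripBom(String text) {
    return text?.startsWith('\uFEFF') ? text.substring(1) : text
}

// Made-up sample content standing in for what readFile might return.
String withBom = '\uFEFF<package><metadata><version>1.0.0</version></metadata></package>'
String cleaned = stripBom(withBom)

println cleaned.startsWith('<package>')
```

      In the function above, the result of readFile would be passed through such a helper before it reaches parseText.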

       

      I tried to reproduce it directly in Groovy with:

      def xmldata = new File("Newtonsoft.Json.nuspec").text
      def pkg = new XmlSlurper().parseText(xmldata) 
      println pkg.metadata.version.text()
      

      But here the leading BOM characters are not passed into the xmldata variable.
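
      Another way to sidestep the BOM is to hand the parser the raw bytes rather than an already decoded string, so that the parser does its own encoding detection. The sketch below is an illustration only: it uses the JDK's DocumentBuilder on a made-up document, not XmlSlurper or the attached nuspec.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import java.nio.charset.StandardCharsets

// Made-up sample: XML bytes that start with the UTF-8 BOM (EF BB BF).
byte[] xmlBytes = '\uFEFF<package><metadata><version>1.0.0</version></metadata></package>'
        .getBytes(StandardCharsets.UTF_8)

// Parsing from a byte stream lets the parser recognise and skip the BOM.
def doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new ByteArrayInputStream(xmlBytes))

String version = doc.getElementsByTagName('version').item(0).textContent
println version
```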

       

      An example nuspec with a BOM in it is attached.

       

       


          Jakub Pawlinski created issue -
          Jakub Pawlinski made changes -
          Description Original: The readFile step, when used inside an environment closure, whether top-level or in a stage, causes the following error:
          an exception which occurred:
          in field com.cloudbees.groovy.cps.impl.BlockScopeEnv.locals
          in object com.cloudbees.groovy.cps.impl.LoopBlockScopeEnv@29044815
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@25c9f135
          in field com.cloudbees.groovy.cps.impl.CallEnv.caller
          in object com.cloudbees.groovy.cps.impl.FunctionCallEnv@307ab985
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@5a92c230
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@37a0a42f
          in field com.cloudbees.groovy.cps.impl.CallEnv.caller
          in object com.cloudbees.groovy.cps.impl.ClosureCallEnv@184a6ff5
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@676c6c8d
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@19f01356
          in field com.cloudbees.groovy.cps.impl.CallEnv.caller
          in object com.cloudbees.groovy.cps.impl.ClosureCallEnv@74d1467b
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@4d098490
          in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
          in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@28223d82
          in field com.cloudbees.groovy.cps.impl.CallEnv.caller
          in object com.cloudbees.groovy.cps.impl.FunctionCallEnv@6e27611b
          in field com.cloudbees.groovy.cps.Continuable.e
          in object org.jenkinsci.plugins.workflow.cps.SandboxContinuable@78ff9c41
          in field org.jenkinsci.plugins.workflow.cps.CpsThread.program
          in object org.jenkinsci.plugins.workflow.cps.CpsThread@7841b6fe
          in field org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.threads
          in object org.jenkinsci.plugins.workflow.cps.CpsThreadGroup@4d2d90ce
          in object org.jenkinsci.plugins.workflow.cps.CpsThreadGroup@4d2d90ce
          Caused: java.io.NotSerializableException: java.util.TreeMap$Entry
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:860)
          at org.jboss.marshalling.river.BlockMarshaller.doWriteObject(BlockMarshaller.java:65)
          at org.jboss.marshalling.river.BlockMarshaller.writeObject(BlockMarshaller.java:56)
          at org.jboss.marshalling.MarshallerObjectOutputStream.writeObjectOverride(MarshallerObjectOutputStream.java:50)
          at org.jboss.marshalling.river.RiverObjectOutputStream.writeObjectOverride(RiverObjectOutputStream.java:179)
          at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
          at java.util.HashMap.internalWriteEntries(HashMap.java:1785)
          at java.util.HashMap.writeObject(HashMap.java:1362)
          at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.jboss.marshalling.reflect.SerializableClass.callWriteObject(SerializableClass.java:273)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:976)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.BlockMarshaller.doWriteObject(BlockMarshaller.java:65)
          at org.jboss.marshalling.river.BlockMarshaller.writeObject(BlockMarshaller.java:56)
          at org.jboss.marshalling.MarshallerObjectOutputStream.writeObjectOverride(MarshallerObjectOutputStream.java:50)
          at org.jboss.marshalling.river.RiverObjectOutputStream.writeObjectOverride(RiverObjectOutputStream.java:179)
          at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
          at java.util.TreeMap.writeObject(TreeMap.java:2438)
          at sun.reflect.GeneratedMethodAccessor176.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.jboss.marshalling.reflect.SerializableClass.callWriteObject(SerializableClass.java:273)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:976)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
          at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
          at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverWriter.writeObject(RiverWriter.java:140)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:458)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:434)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgramIfPossible(CpsThreadGroup.java:422)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:362)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:242)
          at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:230)
          at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:748)

          A test repo was created to replicate this.

          https://github.com/sflynn-dell/pipeline-test

          Branches:
               declarative-script - readFile is successful when used inside a script closure.
               declarative-env - readFile fails when used inside an environment enclosure.
          New: the description shown at the top of this issue, in Jira wiki markup, with "Using Jenkins ver. 2.121.2" prefixed to its first sentence.
          Jakub Pawlinski made changes -
          Attachment New: Newtonsoft.Json.nuspec [ 44656 ]
          Jakub Pawlinski made changes -
          Environment Original: Jenkins 2.73.1 and Jenkins 2.81 Pipeline Groovy Plugin 2.40
          Environment New: Jenkins 2.121.2 and Jenkins 2.81 Pipeline Groovy Plugin 2.54
          Jakub Pawlinski made changes -
          Description: removed the leading "Using Jenkins ver. 2.121.2" from the first sentence; the text is otherwise unchanged.
          Andrew Bayer made changes -
          Component/s New: workflow-basic-steps-plugin [ 21712 ]
          Component/s Original: pipeline-model-definition-plugin [ 21706 ]
          Assignee Original: Andrew Bayer [ abayer ]
          Sam Van Oort made changes -
          Resolution New: Not A Defect [ 7 ]
          Status Original: Open [ 1 ]
          Status New: Closed [ 6 ]
          Jakub Pawlinski made changes -
          Resolution Original: Not A Defect [ 7 ]
          Status Original: Closed [ 6 ]
          Status New: Reopened [ 4 ]

            Assignee: Unassigned
            Reporter: Jakub Pawlinski (quas)
            Votes: 1
            Watchers: 5