JENKINS-66103

Windows workers locked to UTF-8 code pages, causing issues with subshells





      Everything that follows takes place within a scripted pipeline.

      Shells created on Windows workers using either the bat or powershell commands are locked down to the 65001 (UTF-8) code page, and using chcp or similar tools to attempt to change to anything else results in an "Invalid code page" error.

      Normally this wouldn't matter in the slightest. In some cases, however, subshells created by programs called within bat or powershell commands incorrectly believe they have successfully changed code pages, resulting in issues that are very difficult to track down. In my case, a program tried to send ASCII text to a server, but because of the unexpected code page, it submitted text that was incorrectly prepended with an extra UTF-8 BOM character.
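
      For illustration only, here is a minimal Python sketch of what that kind of corruption looks like at the byte level. The payload text and the helper name with_bom are hypothetical, not taken from the program involved; the only fact relied on is that the UTF-8 BOM is the three bytes EF BB BF.

```python
import codecs


def with_bom(text: str) -> bytes:
    """Return ASCII text with a UTF-8 BOM prepended (hypothetical helper)."""
    return codecs.BOM_UTF8 + text.encode("ascii")


payload = "hello"  # hypothetical payload, stands in for the real request text
clean = payload.encode("ascii")
corrupted = with_bom(payload)

print(clean)      # b'hello'
print(corrupted)  # b'\xef\xbb\xbfhello'
```

      A receiver that expects plain ASCII sees three unexpected bytes ahead of the real content, even though the text itself is unchanged.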


      Tracking this down took a very long time and a lot of tests, so bear with me if I miss something, but I'm pretty sure we tried every conceivable workaround. Changing the Java options for either the master or worker nodes, or explicitly setting the encoding for the individual bat or powershell commands, hid the erroneous character in the console output but didn't change the behavior internally, so the server we were contacting still received corrupted data. This was all happening within the Kubernetes plugin. If a new interactive shell is created on the container (either the JNLP container or another container in the pod), that shell can change code pages normally, but the shells invoked directly by Jenkins cannot. The only workaround we've found is to make the server explicitly watch for UTF-8 BOM characters when contacted by Jenkins workers and strip them out, which is not great.
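
      The server-side workaround described above might look roughly like the following Python sketch. This is an assumption about the general shape of such a filter, not the code actually running on the server; the function name strip_utf8_bom is hypothetical, and it removes only a single leading BOM.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical request bodies: one corrupted by a BOM, one clean.
print(strip_utf8_bom(b"\xef\xbb\xbfhello"))  # b'hello'
print(strip_utf8_bom(b"hello"))              # b'hello'
```

      Working on raw bytes before decoding keeps the check independent of whatever encoding the server assumes for the rest of the request.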

      A simple test of this whole issue can be performed by invoking a step with the following:

      bat 'chcp 437'

      It appears to us that this was a deliberate decision, presumably to protect people from accidentally sending Unicode while in an ASCII code page, but the non-standard behavior is a significant problem. If this behavior could be manipulated within a job or in Jenkins management, that would be sufficient, but it seems to me that it might be best to just generally fix it to allow code page changes.

      Apologies if any of this was unclear, or if I got anything wrong. Neither Unicode nor Windows is normally my forte, but I'm relatively confident in our conclusions here. Please let me know if you are unable to reproduce and I'll try to come up with more relevant specifics.





            escoem Emilio Escobar
            structurefall Eric