• Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • core
    • None
    • Platform: All, OS: All

      The character 'Â' appears in several places on the Hudson GUI. Most noticeably
      to the left of the "New Job" and "Manage Hudson" links in the sidebar and to
      the right of the search box at the top. It also occasionally appears in the
      top of the third column of the "Build Executor Status" table in the sidebar.

      I am running the latest CVS snapshot of Hudson (later than 167 but before the
      168 release), using Jetty 5.1.10 on Ubuntu 7. This is the first time I've
      tried this particular combination, so I don't know if it was an issue in
      previous builds.

      I guess this is some kind of encoding issue, but I'm not sure exactly what.

      This does not appear to be a browser issue as it is the same in both Safari and
      Opera.

          [JENKINS-1166] Strange characters appear on pages

          dwdyer added a comment -

          Created an attachment (id=158)
          Screenshot (from Safari, Opera is identical)


          Kohsuke Kawaguchi added a comment -

          This happens because Hudson is sending pages in UTF-8 but the browser is
          interpreting the page as iso-8859-1.

          The page Hudson sends often includes the Unicode "non-breaking space" character
          (written as &nbsp; in HTML), which has Unicode code point U+00A0.

          When this character is sent as UTF-8, the encoding turns this single
          character into two bytes, 0xC2 followed by 0xA0.

          If this byte sequence is decoded as iso-8859-1, you'll get "A circumflex"
          followed by "non-breaking space".

          So the thing I'd like you to find out is why the browser is decoding this as
          iso-8859-1. There are several places to check:

          1. If your browser can display HTTP response headers, please check whether they
          contain "Content-Type: text/html; charset=UTF-8". Hudson is supposed to be
          doing this for all its pages.

          2. Make sure you are not forcing browsers to interpret every page in iso-8859-1.

          3. If you have a tool like wget that can capture the HTML response
          byte-by-byte, please use that, zip the result up, and attach it here. Using zip
          makes sure that java.net won't mess up the encoding.

          Perhaps we can also send a <meta http-equiv="..."> tag to further insist that we
          really do mean UTF-8.
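
The byte-level explanation in this comment can be reproduced in a few lines. The following is a minimal Python sketch (Python is used only for illustration; it is not part of Hudson) that encodes U+00A0 as UTF-8 and then decodes the same bytes as iso-8859-1:

```python
# U+00A0 NO-BREAK SPACE, the character behind &nbsp; in HTML.
nbsp = "\u00a0"

# Encoding it as UTF-8 yields two bytes: 0xC2 followed by 0xA0.
utf8_bytes = nbsp.encode("utf-8")

# Decoding those same bytes as iso-8859-1 maps each byte to one character:
# 0xC2 becomes U+00C2 ("A circumflex") and 0xA0 becomes U+00A0 again.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes.hex())                  # c2a0
print([hex(ord(c)) for c in misread])    # ['0xc2', '0xa0']
```

The stray 'Â' is therefore the first byte of a correctly encoded non-breaking space, shown by a client that assumed the wrong charset.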

          dwdyer added a comment -

          The JVM was picking up the default encoding as ANSI_X3.4-1968. I fixed up the
          server's locale and encoding so that it is now en_GB.UTF-8, but the problem
          persists. The server is using Apache 2.2.4 in front of Jetty, connected by
          mod_jk and ajp13.

          I also checked in Firefox to see if it was any different, but it wasn't so
          that's a trio of browsers (all on OS X) that exhibit the problem.

          The Content-Type header returned by the server does not include the charset:

          Content-Type: text/html

          I can't find anywhere in the Hudson source where content-type is set (other
          than in the Japex plugin and a couple of Javascript files).
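
To make the difference between the expected and the observed header concrete, here is a small Python sketch that extracts the declared charset from a Content-Type value using the standard library's `email.message.Message`. The helper name `declared_charset` and the `ASSUMED_FALLBACK` constant are hypothetical, and the fallback to iso-8859-1 is an assumption about how a client might behave when no charset is declared, not a statement about any particular browser.

```python
from email.message import Message

# Assumption for illustration: a client that sees no charset falls back
# to iso-8859-1. Real browsers may apply other defaults or sniff content.
ASSUMED_FALLBACK = "iso-8859-1"


def declared_charset(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


for header in ("text/html; charset=UTF-8", "text/html"):
    charset = declared_charset(header)
    effective = charset or ASSUMED_FALLBACK
    print(f"{header!r}: declared={charset}, effective={effective}")
```

With the header reported above (`text/html` only), no charset is declared, so the decoding is left to whatever default the client chooses.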


          dwdyer added a comment -

          OK, I found where Hudson is setting Content-Type (I was searching for
          "Content-Type" rather than "contentType") - it's in layout.jelly. The expires
          header set in the same place appears to work, but the content type has the
          charset stripped from it.


          dwdyer added a comment -

          The more I investigate this, the more I think it is a bug in Jetty 5.1.10 (I've
          eliminated Apache by using Jetty directly).

          The exact same version of Hudson works in Jetty 6.1.7 via the Maven plugin
          (albeit on a different machine - OS X instead of Ubuntu).

          With no better ideas, I tried using Stapler's header tag instead of the
          contentType tag, and surprisingly that fixes the problem (I've looked at the
          Stapler source and can't understand why this would be - it must be a Jetty bug).

          Are you happy for me to commit this fix, or would you like to change Stapler
          instead (perhaps the contentType tag would work if it explicitly called
          setCharacterEncoding() - though this is just a guess)?


          dwdyer added a comment -

          I've committed the change to layout.jelly (revision 1.40) since it's only a
          single-line change and I don't think it is going to cause problems elsewhere. I
          think it's worthwhile since it makes Hudson work nicely with the latest stable
          version of Jetty.

          If you come up with something better (either in Hudson or Stapler) you can
          revert my change.


          Fabian Mutzbauer added a comment -

          With an upgrade to version 1.598, the 'Â' characters appeared on our system
          (Ubuntu 14.04) next to each nbsp. We use Apache 2.4.7 as a reverse proxy. I
          checked the problem with different browsers (Chromium, Firefox) and the charset
          is always set to UTF-8. How can I investigate this issue further?
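
One way to narrow this down is to look at the raw response bytes rather than the rendered page. If the declared charset really is UTF-8 and 'Â' still shows up, one possibility (an assumption, not something confirmed in this thread) is that the text was encoded twice somewhere along the way. The Python sketch below uses a hypothetical helper, `classify_nbsp`, that tells the two byte patterns apart in a captured response body:

```python
# A correctly encoded non-breaking space in UTF-8: b'\xc2\xa0'.
NBSP_UTF8 = "\u00a0".encode("utf-8")

# The same two bytes misread as iso-8859-1 and encoded as UTF-8 again:
# b'\xc3\x82\xc2\xa0'.
NBSP_DOUBLE = "\u00c2\u00a0".encode("utf-8")


def classify_nbsp(body: bytes) -> str:
    """Classify how non-breaking spaces appear in a raw response body."""
    # Check the longer pattern first, because it ends with NBSP_UTF8.
    if NBSP_DOUBLE in body:
        return "double-encoded"
    if NBSP_UTF8 in body:
        return "utf-8"
    return "none"


print(classify_nbsp(b"<td>\xc2\xa0</td>"))          # utf-8
print(classify_nbsp(b"<td>\xc3\x82\xc2\xa0</td>"))  # double-encoded
```

A page that legitimately contains 'Â' followed by a non-breaking space would also be reported as double-encoded, so the result is a hint for where to look rather than proof.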

            Assignee: Unassigned
            Reporter: dwdyer