Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-7397

HTML email token content special characters are not HTML-escaped

    XMLWordPrintable

Details

    Description

      Extended email format content tokens can break the HTML when the token text contains characters <, >, &.

      For example, if the build log output contains XML-tags, then ${BUILD_LOG} (usually displayed within <pre> -tags) can break the HTML email. Similarly if commit message contains escape chars, then the %m in format-parameter of ${CHANGES} will break the HTML.

      Attachments

        Activity

          Another case where escaping is required: $FAILED_TESTS included a traceback containing this text:
          at java.lang.System.loadLibrary(System.java:1028)
          at org.nexusformat.NexusFile.<clinit>(NexusFile.java:99)
          at gda.data.nexus.extractor.NexusExtractor.runLoop(NexusExtractor.java:408)

          The <clinit> caused problems.

          mwebber Matthew Webber added a comment - Another case where escaping is required: $FAILED_TESTS included a traceback containing this text: at java.lang.System.loadLibrary(System.java:1028) at org.nexusformat.NexusFile.<clinit>(NexusFile.java:99) at gda.data.nexus.extractor.NexusExtractor.runLoop(NexusExtractor.java:408) The <clinit> caused problems.
          ashlux ashlux added a comment - - edited

          I'm thinking it would be sufficient to automatically escape HTML when the content type is HTML for all content types and just remove the escapeHtml argument altogether. Does this seem reasonable? Would this possibly break anyone?

          Edit: After some thinking on this, this would certainly impact the ${JELLY} and could impact some users of $CHANGES}, ${CHANGES_SINCE_LAST_SUCCESSFUL}, and ${CHANGES_SINCE_LAST_UNSTABLE}.

          What bothers me is forcing users to know about and remember to use escapeHtml is not very user friendly; I want a solution for that.

          ashlux ashlux added a comment - - edited I'm thinking it would be sufficient to automatically escape HTML when the content type is HTML for all content types and just remove the escapeHtml argument altogether. Does this seem reasonable? Would this possibly break anyone? Edit: After some thinking on this, this would certainly impact the ${JELLY} and could impact some users of $CHANGES}, ${CHANGES_SINCE_LAST_SUCCESSFUL}, and ${CHANGES_SINCE_LAST_UNSTABLE}. What bothers me is forcing users to know about and remember to use escapeHtml is not very user friendly; I want a solution for that.

          I agree that it is desirable to remove the "escapeHtml" parameter. However, you can't simply escape everything. Consider a specification like this:

          ${CHANGES_SINCE_LAST_UNSTABLE, reverse=true, format="====\n<b>Changes</b> for Build #%n\n", changesFormat="[%r] %d %a %m %p\n"}
          

          Actually, last time I tried that, it didn't work, but anyhow that's a separate issue. The point here is that the <b> in the format should not be escaped.

          I think the following should work:

          • do not escape any output that is explicitly provided by the user (Hudson admin). This means
            BUILD_LOG_REGEX: substText
            CHANGES: format, pathFormat (why doesn't this support changesFormat?)
            CHANGES_SINCE_LAST_SUCCESS: format, changesFormat, pathFormat
            CHANGES_SINCE_LAST_UNSTABLE: format, changesFormat, pathFormat
          • escape everything else. This includes change text, failed test details, environment variables, build log, etc.

          The Jelly templates for html are a little more tricky. Following the same rule, the template itself should not be escaped, but anything inserted into it should be. So, looking at the default Jelly template, when I see a line like this:

          <j:forEach var="cs" items="${changeSet.logs}" varStatus="loop">
          

          Clearly whatever ${changeSet.logs} gets expanded to needs to be escaped, but the <:j does not.

          I hope that makes sense.

          mwebber Matthew Webber added a comment - I agree that it is desirable to remove the "escapeHtml" parameter. However, you can't simply escape everything. Consider a specification like this: ${CHANGES_SINCE_LAST_UNSTABLE, reverse=true, format="====\n<b>Changes</b> for Build #%n\n", changesFormat="[%r] %d %a %m %p\n"} Actually, last time I tried that, it didn't work, but anyhow that's a separate issue. The point here is that the <b> in the format should not be escaped. I think the following should work: do not escape any output that is explicitly provided by the user (Hudson admin). This means BUILD_LOG_REGEX: substText CHANGES: format, pathFormat (why doesn't this support changesFormat?) CHANGES_SINCE_LAST_SUCCESS: format, changesFormat, pathFormat CHANGES_SINCE_LAST_UNSTABLE: format, changesFormat, pathFormat escape everything else. This includes change text, failed test details, environment variables, build log, etc. The Jelly templates for html are a little more tricky. Following the same rule, the template itself should not be escaped, but anything inserted into it should be. So, looking at the default Jelly template, when I see a line like this: <j:forEach var="cs" items="${changeSet.logs}" varStatus="loop"> Clearly whatever ${changeSet.logs} gets expanded to needs to be escaped, but the <:j does not. I hope that makes sense.
          slide_o_mix Alex Earl added a comment -

          Is this still an issue?

          slide_o_mix Alex Earl added a comment - Is this still an issue?
          slide_o_mix Alex Earl added a comment -

          Please reopen if this is still an issue

          slide_o_mix Alex Earl added a comment - Please reopen if this is still an issue

          People

            slide_o_mix Alex Earl
            pendix2 pendix2
            Votes:
            2 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: