Warnings plugin, missing AntJavaParser in the parser selection list

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: warnings-plugin
    • Environment: Windows 7 x64 with Java SE 8u20, CentOS 6.5 with Java SE 7u55

      I created a Jenkins project that builds Java sources with Ant.
      I installed the Warnings plugin and added the post-build action "Scan for compiler warnings".
      I wanted to select a parser for Ant javac output, but there is no such parser in the parser drop-down list
      (screen capture attached).

      There are two javac parsers, "Java Compiler(Eclipse)" and "Java Compiler(javac)".
      I selected the latter, but no warnings are reported for the Ant build.

      The console output sample is as follows:
      [javac] Compiling 2 source files to C:\Users\momo\.jenkins\jobs\GoodMorningMrJenkins\workspace\app\Java8Lambdas\Album\build\classes
      [javac] C:\Users\momo\.jenkins\jobs\GoodMorningMrJenkins\workspace\app\Java8Lambdas\Album\src\music\album\Artist.java:65: warning: [deprecation] getDate() in Date has been deprecated
      [javac] int d = new Date().getDate();
      [javac] ^
      [javac] C:\Users\momo\.jenkins\jobs\GoodMorningMrJenkins\workspace\app\Java8Lambdas\Album\src\music\album\Artist.java:67: warning: [unchecked] unchecked call to add(E) as a member of the raw type List
      [javac] list.add("Warning?");
      [javac] ^
      [javac] where E is a type-variable:
      [javac] E extends Object declared in interface List
      [javac] 2 warnings

      The "Java Compiler(javac)" parser cannot capture these warnings.

      I found an AntJavacParser.java source file in the Warnings plugin sources:
      https://github.com/jenkinsci/warnings-plugin/blob/master/src/main/java/hudson/plugins/warnings/parser/AntJavacParser.java

      This file seems to define a suitable regular expression for Ant javac output.
      It should appear in the parser selection list, for example as "Java Compiler(Ant)".
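For illustration, here is a minimal sketch of the kind of regular expression such a parser needs for the console lines quoted above. The pattern, the class name AntJavacLineSketch and the shortened sample path are assumptions made for this example. This is not the expression used in AntJavacParser.java, and it only recognizes the English "warning:" label.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AntJavacLineSketch {
    // Hypothetical pattern for illustration only; not the plugin's actual expression.
    // Groups: 1 = file path, 2 = line number, 3 = category (optional), 4 = message.
    static final Pattern WARNING = Pattern.compile(
            "^\\s*\\[javac\\]\\s*(.+?):(\\d+):\\s*warning:\\s*(?:\\[(\\w+)\\]\\s*)?(.*)$");

    /** Returns "file|line|category|message" for a warning line, or null for any other line. */
    static String parse(String consoleLine) {
        Matcher m = WARNING.matcher(consoleLine);
        if (!m.matches()) {
            return null;
        }
        String category = m.group(3) == null ? "" : m.group(3);
        return m.group(1) + "|" + m.group(2) + "|" + category + "|" + m.group(4);
    }

    public static void main(String[] args) {
        // Shortened sample path; the lazy group still copes with the drive-letter colon.
        String warning = "[javac] C:\\ws\\Artist.java:65: warning: [deprecation] "
                + "getDate() in Date has been deprecated";
        System.out.println(parse(warning));
        // Summary lines such as this one are not warnings and yield null.
        System.out.println(parse("[javac] 2 warnings"));
    }
}
```

The source-echo and caret lines that javac prints after each warning do not match the pattern, so they are skipped.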

        1. 24611_case1.log
          33 kB
        2. 24611_case2.log
          33 kB
        3. 24611_case3.log
          32 kB
        4. 24611_case4.log
          32 kB
        5. mvn-test.log
          46 kB
        6. mvn-test.log
          46 kB
        7. parserlist.png
          39 kB

          [JENKINS-24611] Warnings plugin, missing AntJavaParser in the parser selection list

          Toru Takahashi added a comment -

          For case 3, I set several encodings under project "Configure" > "Post-build Actions" > "Scan for compiler warnings" > "Advanced" > "Default Encoding".
          The encodings I tried are as follows:

          • blank (default)
          • "windows-31j"
          • "UTF-8"
          • "UTF-16"
          • "euc-jp-linux"

          The results are the same in every case: 0 warnings.

          Note: "windows-31j" and "MS932" are the same encoding; the IANA registered name is "Windows-31J".

          Settings I expected to succeed in my environment (Windows OS, platform encoding Windows-31J):

          • blank
          • "windows-31j"

          Note: a blank value is documented as meaning the platform default encoding.

          I ran Jenkins with a command-line option that allows a debugger to attach, then attached the NetBeans debugger to Jenkins and analyzed the situation.

          • Jenkins command line
            > java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8086,suspend=n -jar jenkins-1.579.war
          • I put a breakpoint at AntJavacParser.isLineInteresting.
            At this point the argument "line" is garbled, whichever encoding is configured, like this:
            "\ufffd\ufffd\ufffd[\ufffdU\ufffd[anonymous\ufffd\ufffd\ufffd\ufffd\ufffds"
            Non-ASCII characters are misconverted.
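The effect seen in the debugger can be reproduced outside Jenkins. The following sketch (class and method names are made up for this example) encodes a string in one charset and decodes the bytes as UTF-8, which is what happens when a Windows-31J log is read through a UTF-8 reader. It assumes the JVM may not ship a windows-31j charset and falls back to ISO-8859-1 in that case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MisdecodeSketch {
    /** Encodes text in the charset the log was really written in, then decodes it as UTF-8. */
    static String readAsUtf8(String text, Charset actual) {
        byte[] logBytes = text.getBytes(actual);
        return new String(logBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // windows-31j is present in typical JDKs but is not guaranteed by the platform.
        boolean hasWindows31j = Charset.isSupported("windows-31j");
        Charset actual = hasWindows31j ? Charset.forName("windows-31j") : StandardCharsets.ISO_8859_1;
        // Sample text: two kanji (written as escapes) or an accented Latin letter.
        String sample = hasWindows31j ? "\u8b66\u544a: [deprecation]" : "caf\u00e9: [deprecation]";

        String decoded = readAsUtf8(sample, actual);

        // Show non-ASCII characters as escapes so the output is readable on any console.
        StringBuilder escaped = new StringBuilder();
        for (char c : decoded.toCharArray()) {
            if (c < 128) {
                escaped.append(c);
            } else {
                escaped.append(String.format("\\u%04x", (int) c));
            }
        }
        System.out.println(escaped);
    }
}
```

Bytes that do not form valid UTF-8 come out as U+FFFD, while second bytes of two-byte characters that happen to fall in the ASCII range survive as stray characters. That probably explains the "[" and "U" mixed into the garbled string above.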

          So, starting from the stack trace at AntJavacParser.isLineInteresting, I looked for where the console output is read and which encoding is specified.

          The place is ParserRegistry.createReader:

          protected Reader createReader(final InputStream inputStream) {
              return new InputStreamReader(new BOMInputStream(inputStream), defaultCharset);
          }

          The encoding used is the instance field "defaultCharset".
          This field is set in the constructor:

          public ParserRegistry(final List<? extends AbstractWarningsParser> parsers, final String defaultEncoding,
          final String includePattern, final String excludePattern) {
          defaultCharset = EncodingValidator.defaultCharset(defaultEncoding);

          This ParserRegistry constructor is called from the WarningsPublisher.parseConsoleLog method:

          Collection<FileAnnotation> warnings = new ParserRegistry(
                  ParserRegistry.getParsers(parserName),
                  CONSOLE_LOG_ENCODING, getIncludePattern(), getExcludePattern()).parse(build.getLogFile());

          In the code above, the encoding comes from the final field CONSOLE_LOG_ENCODING of the WarningsPublisher class:

          private static final String CONSOLE_LOG_ENCODING = "UTF-8";

          This looks like the cause of the missed detections on platforms whose encoding is not UTF-8.

          A possible fix is:

          Collection<FileAnnotation> warnings = new ParserRegistry(
          ParserRegistry.getParsers(parserName),
          getDefaultEncoding(), getIncludePattern(),
          getExcludePattern()).parse(build.getLogFile());

          I changed the second argument from CONSOLE_LOG_ENCODING to getDefaultEncoding().
          With this change the default encoding specified in the configuration is applied, and
          warnings are detected on Windows with the windows-31j (MS932) encoding.
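A small self-contained sketch of why the change matters. It does not use the plugin classes: countWarnings, the sample log lines and the class name are hypothetical, and the Japanese label 警告 (written with Unicode escapes in the code) is assumed to be what a localized javac prints in place of "warning".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingFixSketch {
    // Assumed localized warning label, written with escapes to keep the source ASCII-only.
    static final String LOCALIZED_WARNING = "\u8b66\u544a:";

    /** Decodes the raw log bytes with the given charset and counts lines carrying a warning label. */
    static int countWarnings(byte[] logBytes, Charset charset) {
        int count = 0;
        for (String line : new String(logBytes, charset).split("\n")) {
            if (line.contains("warning:") || line.contains(LOCALIZED_WARNING)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("windows-31j")) {
            System.out.println("windows-31j is not available in this JVM");
            return;
        }
        Charset configured = Charset.forName("windows-31j");
        String log = "[javac] Artist.java:65: " + LOCALIZED_WARNING + " [deprecation] localized message\n"
                + "[javac] Artist.java:67: warning: [unchecked] English message\n";
        byte[] logBytes = log.getBytes(configured); // what ends up in the build log file

        // Before the fix: the console log is always decoded as UTF-8.
        System.out.println("hard-coded UTF-8:    " + countWarnings(logBytes, StandardCharsets.UTF_8));
        // After the fix: the configured (or platform default) encoding is used.
        System.out.println("configured encoding: " + countWarnings(logBytes, configured));
    }
}
```

On a JVM that provides windows-31j, decoding as UTF-8 finds only the English line, while decoding with the configured charset finds both.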


          Ulli Hafner added a comment -

          OK, I see. I thought that the console encoding is always UTF8. I.e., if I change that to defaultEncoding then everything works as expected?


          Toru Takahashi added a comment -

          For example, traditional Unix systems use "EUC-JP" or other encodings rather than "UTF-8" for Japanese environments. More recently, Linux in particular uses "UTF-8" for Japanese environments.

          I have tried the warnings plugin, modified to use defaultEncoding, with the EUC-JP, UTF-8, and Windows-31J encodings. All three cases count the Ant javac warning messages correctly, both English and localized (Japanese).

          Without the fix, on a platform whose encoding is not UTF-8 the parser can only handle ASCII characters, which still work because many encodings contain ASCII as a subset. In addition, the advanced "Default Encoding" setting is ignored.
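The point about ASCII can be checked directly. This sketch (names are hypothetical) compares the bytes a charset produces for a pure-ASCII javac line with the US-ASCII bytes. It only tests charsets the running JVM actually provides, since EUC-JP and windows-31j are not guaranteed to be present.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiSubsetSketch {
    /** True if the charset encodes pure-ASCII text to exactly the US-ASCII bytes. */
    static boolean encodesAsciiUnchanged(String asciiText, Charset charset) {
        return Arrays.equals(asciiText.getBytes(charset),
                asciiText.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        String line = "[javac] Artist.java:65: warning: [deprecation] getDate() in Date has been deprecated";
        for (String name : new String[] {"UTF-8", "EUC-JP", "windows-31j", "UTF-16"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + encodesAsciiUnchanged(line, Charset.forName(name)));
            } else {
                System.out.println(name + ": not available in this JVM");
            }
        }
    }
}
```

Where the bytes are identical, a UTF-8 reader decodes the English lines correctly whatever the platform encoding is, so only localized messages are lost. UTF-16 is included as a counterexample: it is not ASCII-compatible.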

          Daniel Beck added a comment - - edited

          > For example, traditional Unix systems use "EUC-JP" or other encodings rather than "UTF-8" for Japanese environments. More recently, Linux in particular uses "UTF-8" for Japanese environments.

          OS X is UTF-8 for the last few years as well (not claiming it's "traditional Unix", just an extra data point).

          For some perspective, based on anonymous usage stats, use of Unixes other than Linux and OS X is really low. Of the ~105600 nodes with known OS, only 2250, or ~2.5%, are not Windows, Linux, or OS X; and of those, more than half are Solaris (SunOS).


          Ulli Hafner added a comment -

          I'm not talking about the linux or unix console. My question is if the console log file created by Jenkins uses the default locale and not UTF-8? But since you tested it the suggested fix will be ok indeed!


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Ulli Hafner
          Path:
          .idea/libraries/Maven__org_apache_commons_commons_lang3_3_1.xml
          analysis-core
          tasks
          http://jenkins-ci.org/commit/analysis-suite-plugin/db3e86644600ecedef1315cc00291d2003f154b6
          Log:
          [FIXED JENKINS-24611] Use default encoding when reading the console log.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Ulli Hafner
          Path:
          src/main/java/hudson/plugins/warnings/WarningsPublisher.java
          warnings.iml
          http://jenkins-ci.org/commit/warnings-plugin/6f457dce7e88581d733a1bf93da0f1091831d852
          Log:
          [FIXED JENKINS-24611] Use default encoding when reading the console log.

          Ulli Hafner added a comment -

          If you still want to add a good Japanese test case feel free to add a pull request...


          Toru Takahashi added a comment -

          > I'm not talking about the linux or unix console. My question is if the console log file created by Jenkins uses the default locale and not UTF-8?

          Yes, Jenkins writes the console log in the platform default encoding, not UTF-8.
          I analyzed several Jenkins console log files, located at /var/lib/jenkins/jobs/<job name>/builds/<year-month-date_hour_min_sec>/log on Linux or C:\Users\<user name>\.jenkins\jobs\<job name>\builds\<year-month-date_hour_min_sec>/log on Windows.

          The analyzed cases are as follows:
          1) Linux, UTF-8 platform encoding => the log file is UTF-8 encoded.
          These logs are attached as 24611_case1.log and 24611_case2.log.
          2) Linux, EUC-JP platform encoding => the log file is EUC-JP encoded.
          3) Windows, Windows-31J platform encoding => the log file is windows-31j encoded.
          These logs are attached as 24611_case3.log and 24611_case4.log.

          I tried but failed to create a unit test for WarningsPublisher.
          It is very difficult for me to prepare the build and project instances and to control them for testing.
          A mock-based approach may be suitable, but I have not tried it yet.
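The comment above does not say how the encodings of the log files were determined. One simple check, sketched here with hypothetical names, is to decode the file strictly as UTF-8 and see whether any malformed byte sequence is reported. It cannot tell EUC-JP from Windows-31J, and a log that contains only ASCII passes as valid UTF-8, so it is only a first test.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf8CheckSketch {
    /** True if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of inserting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java Utf8CheckSketch <path to build log>");
            return;
        }
        byte[] logBytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(logBytes) ? "valid UTF-8" : "not UTF-8");
    }
}
```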

          Ulli Hafner added a comment - - edited

          For the unit test you just need to replace the file issue24611.txt with your log file and adapt the unit test in https://github.com/jenkinsci/warnings-plugin/blob/51df96d0253d4541d3131bbe1f4927d16b4da3fb/src/test/java/hudson/plugins/warnings/parser/AntJavacParserTest.java#L33.

          You need to change the locale and the number of warnings in the log. (Optionally you can check for the warning attributes, too.)


            Assignee: Ulli Hafner (drulli)
            Reporter: Toru Takahashi (momotaro)
            Votes: 0
            Watchers: 3