Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-67291

Parse MSBuild-compatible diagnostic with (line,col,line,col)

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Minor Minor
    • analysis-model
    • Jenkins 2.319.1
      Warnings Next Generation Plugin 9.9.0
      Analysis Model API Plugin 10.8.0

      Microsoft's C# compiler can report a warning like this via MSBuild:

      C:\Projects\EndLocation\Program.cs(5,24,5,26): warning CS1584: XML comment has syntactically incorrect cref attribute 'No such thing' [C:\Projects\EndLocation\EndLocation.csproj]
      

      The numbers between the parentheses are the starting line, starting column, ending line, and ending column. See MSBuild / Visual Studio aware error messages and message formats. Analysis-core is not yet able to parse this four-number format.

      Demo for how to get a warning in this format

      EndLocation.csproj:

      <Project Sdk="Microsoft.NET.Sdk">
      
        <PropertyGroup>
          <OutputType>Exe</OutputType>
          <TargetFramework>net6.0</TargetFramework>
          <GenerateDocumentationFile>true</GenerateDocumentationFile>
          <ErrorEndLocation>true</ErrorEndLocation>
        </PropertyGroup>
      
      </Project>
      

      Program.cs:

      namespace EndLocation
      {
          class Program
          {
              /// <see cref="No such thing"/>
              static void Main()
              {
              }
          }
      }
      

      Compilation with .NET SDK 6.0.100:

      $ dotnet build
      Microsoft (R) Build Engine version 17.0.0+c9eb9dd64 for .NET
      Copyright (C) Microsoft Corporation. All rights reserved.
      
        Determining projects to restore...
        All projects are up-to-date for restore.
      C:\Projects\EndLocation\Program.cs(5,24,5,26): warning CS1584: XML comment has syntactically incorrect cref attribute 'No such thing' [C:\Projects\EndLocation\EndLocation.csproj]
        EndLocation -> C:\Projects\EndLocation\bin\Debug\net6.0\EndLocation.dll
      
      Build succeeded.
      
      C:\Projects\EndLocation\Program.cs(5,24,5,26): warning CS1584: XML comment has syntactically incorrect cref attribute 'No such thing' [C:\Projects\EndLocation\EndLocation.csproj]
          1 Warning(s)
          0 Error(s)
      
      Time Elapsed 00:00:02.19
      

          [JENKINS-67291] Parse MSBuild-compatible diagnostic with (line,col,line,col)

          I suppose the fix would go in https://github.com/jenkinsci/analysis-model/blob/v10.8.0/src/main/java/edu/hm/hafner/analysis/parser/MsBuildParser.java, but the regex pattern looks rather complex.

          More references for MSBuild diagnostic formats:

          Kalle Niemitalo added a comment - I suppose the fix would go in https://github.com/jenkinsci/analysis-model/blob/v10.8.0/src/main/java/edu/hm/hafner/analysis/parser/MsBuildParser.java , but the regex pattern looks rather complex. More references for MSBuild diagnostic formats: CanonicalError.cs parses a diagnostic of a tool, for processing within MSBuild. EventArgsFormatting.cs formats a diagnostic back to a string. This can also use the (line-line,col) format that was not mentioned in MSBuild / Visual Studio aware error messages and message formats .

          So the whole set of location formats is:

          • (line)
          • (line,column)
          • (line-endLine)
          • (line,column-endColumn)
          • (line-endLine,column)
          • (line,column,endLine,endColumn)

          MsBuildParser is currently able to parse the first two.

          Kalle Niemitalo added a comment - So the whole set of location formats is: (line) (line,column) (line-endLine) (line,column-endColumn) (line-endLine,column) (line,column,endLine,endColumn) MsBuildParser is currently able to parse the first two.

          Ulli Hafner added a comment -

          Hopefully, we sometimes get a volunteer to provide support for Sarif in the analysis-model project then we are not required to use this complex regexp anymore.

          Ulli Hafner added a comment - Hopefully, we sometimes get a volunteer to provide support for Sarif in the analysis-model project then we are not required to use this complex regexp anymore.

          The Roslyn C# compiler can write diagnostics to a file in SARIF formats, but if there are problems with NuGet package dependencies or such, then the diagnostics do not come from the C# compiler and cannot take advantage of that feature. So you'd likely need the regexp anyway, unless the build is using a custom MSBuild logger that writes SARIF.

          Kalle Niemitalo added a comment - The Roslyn C# compiler can write diagnostics to a file in SARIF formats, but if there are problems with NuGet package dependencies or such, then the diagnostics do not come from the C# compiler and cannot take advantage of that feature. So you'd likely need the regexp anyway, unless the build is using a custom MSBuild logger that writes SARIF.

          MS_BUILD_WARNING_PATTERN could perhaps be simplified if it only tried to recognize diagnostics emitted by MSBuild console loggers or file loggers. Why does it involve ANT_TASK; is that for scenarios where Ant runs MSBuild and further decorates its output?

          Kalle Niemitalo added a comment - MS_BUILD_WARNING_PATTERN could perhaps be simplified if it only tried to recognize diagnostics emitted by MSBuild console loggers or file loggers. Why does it involve ANT_TASK; is that for scenarios where Ant runs MSBuild and further decorates its output?

          Arnab added a comment - - edited

          Hello drulli I've modified the part of the regex separately (responsible for parsing line and col) and tested it to parse all the different possible combination of line and col (seems like it works). But If I put it in the existing regex it will only make things more complicated (in terms of code readability).

          So, I was thinking if you have any idea on how to make the regex more readable and maintainable such that it will be easy to work the current one is just way too long.

          What if we break down the pattern into several segments and try to parse different parts of the log with separate helper functions inside the "createIssue" method. What do you think about this approach? 

          What are your thoughts on this?

          Arnab added a comment - - edited Hello drulli I've modified the part of the regex separately (responsible for parsing line and col) and tested it to parse all the different possible combination of line and col (seems like it works). But If I put it in the existing regex it will only make things more complicated (in terms of code readability). So, I was thinking if you have any idea on how to make the regex more readable and maintainable such that it will be easy to work the current one is just way too long. What if we break down the pattern into several segments and try to parse different parts of the log with separate helper functions inside the "createIssue" method. What do you think about this approach?  What are your thoughts on this?

          Ulli Hafner added a comment -

          Breaking it into several parts makes sense! I wonder if it makes sense to even split this parser into different classes? (As in Armcc parsers). Then we have a dedicated parser for each pattern?

          I think the maintainability will be much better if we can at least subdivide the parsing into smaller steps! Do you want to make a first suggestion?

          Ulli Hafner added a comment - Breaking it into several parts makes sense! I wonder if it makes sense to even split this parser into different classes? (As in Armcc parsers). Then we have a dedicated parser for each pattern? I think the maintainability will be much better if we can at least subdivide the parsing into smaller steps! Do you want to make a first suggestion?

          Arnab added a comment -

          Thank You for your confirmation! Yeah, I agree with you on the dividing into different classes approach that you took with the Armcc parsers. We can definitely try that!

          I will try to raise an initial PR during the weekend! If you have any suggestion please do share!

          Arnab added a comment - Thank You for your confirmation! Yeah, I agree with you on the dividing into different classes approach that you took with the Armcc parsers. We can definitely try that! I will try to raise an initial PR during the weekend! If you have any suggestion please do share!

          Ulli Hafner added a comment -

          I am on holidays for one week now, so I cannot help in the next week.

          Ulli Hafner added a comment - I am on holidays for one week now, so I cannot help in the next week.

          Arnab added a comment - - edited

          That actually helps me too. Since, I'll be having my exams for a week (starting from a week later). So, we can work on this when we both are available.

          Arnab added a comment - - edited That actually helps me too. Since, I'll be having my exams for a week (starting from a week later). So, we can work on this when we both are available.

            code_arnab Arnab
            kon Kalle Niemitalo
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: