  1. Jenkins
  2. JENKINS-64579

First build on multibranch pipeline job has no SCM information


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • None

      Hi all,

      I’ve been trying to set up static code analysis in Jenkins using the Warnings NG and Git Forensics plugins, with the analysis itself running inside a Docker container (which bundles Checkstyle, PMD and other utilities together with their rule sets). The idea is to use multibranch pipelines: when a developer creates a pull request in Bitbucket, Jenkins automatically runs a job for that branch, checks for any new issues reported by the code analysis tools, and then tells Bitbucket whether the pull request can be merged. Pretty standard stuff.

      Reproduction steps:

      1. Run a fresh Jenkins instance and install the required plugins (Warnings NG, Git Forensics)
      2. Add a new multibranch pipeline connected to a Bitbucket Cloud repository (Jenkinsfile in the attachments)
      3. Scan the multibranch pipeline to run the builds for all the branches

      However, I’ve come across behavior that looks like a bug, and I haven’t been able to find a workaround for it (hence the major priority). For the very first build of a branch or pull request, the Forensics API, and Warnings NG which relies on it, cannot determine which SCM is in use and therefore cannot extract information such as who introduced the new code smells:

      [CheckStyle] Creating SCM blamer to obtain author and commit information for affected files 
      [CheckStyle] SCM 'hudson.scm.NullSCM' is not of type GitSCM 
      [CheckStyle] -> Git blamer could not be created for SCM 'hudson.scm.NullSCM@71e85a6f' in working tree '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix'

      Then, when trying to mine the repository with the Forensics repository mining step, I get the following output:

      [Forensics] SCM 'hudson.scm.NullSCM' is not of type GitSCM 
      [Forensics] -> Git miner could not be created for SCM 'hudson.scm.NullSCM@239301ef' in working tree '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix'

      However, on the second run of the very same job, everything works as expected:

      [CheckStyle] Creating SCM blamer to obtain author and commit information for affected files 
      [CheckStyle] -> Git blamer successfully created in working tree '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [CheckStyle] Creating SCM miner to obtain statistics for affected repository files 
      [CheckStyle] -> Git miner successfully created in working tree '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [CheckStyle] Resolving file names for all issues in source directory '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [CheckStyle] -> resolved paths in source directory (27 found, 0 not found) 
      
      ... 
      
      [Forensics] Creating SCM miner to obtain statistics for affected repository files 
      [Forensics] -> Git miner successfully created in working tree '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [Forensics] Analyzing the commit log of the Git repository '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [Forensics] Invoking Git miner to create statistics for all available files 
      [Forensics] Git working tree = '/var/jenkins_home/workspace/Statistics_-_Static_Analysis_fix' 
      [Forensics] -> created statistics for 355 files 
      [Forensics] -> created report for 355 files in 20 seconds

      The issue affects multibranch pipelines not only when Bitbucket Cloud is selected as the SCM source but also with generic Git. It does not happen with a standard (non-multibranch) pipeline: there the SCM is resolved correctly even during the first build of the job. Unfortunately, a standard pipeline is not an option for the use case described above.

      I’ve checked the source code of Jenkins and its plugins, debugged it a bit, and here is what I found to be the cause. To obtain the miner and blamer, the Forensics API uses a helper class called [ScmResolver|https://github.com/jenkinsci/forensics-api-plugin/blob/master/src/main/java/io/jenkins/plugins/forensics/util/ScmResolver.java]. It appears to resolve the SCM differently for freestyle projects (extractFromProject method) and for pipelines (extractFromPipeline method). The latter gets the SCMs from the job through the getSCMs method of the SCMTriggerItem class. In my case, the instance is actually a WorkflowJob, and its implementation of getSCMs looks as follows:

      @Override public Collection<? extends SCM> getSCMs() {
        WorkflowRun b = getLastSuccessfulBuild();
        if (b == null) {
          b = getLastCompletedBuild();
        }
        if (b == null) {
          return Collections.emptySet();
        }
        Map<String,SCM> scms = new LinkedHashMap<>();
        for (WorkflowRun.SCMCheckout co : b.checkouts(null)) {
          scms.put(co.scm.getKey(), co.scm);
        }
        return scms.values();
      }
      

      I don’t know why it works this way, but the SCM is determined from previously executed builds. This explains why it cannot be determined during the very first build, but couldn’t that information be obtained in a different way? I attached a debugger to the Jenkins instance and stepped through this snippet in the workflow-job-plugin library. The result (see the attached screenshots) was WorkflowJob returning an empty set of SCMs because no builds had completed yet.
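To make the effect of that lookup easier to see outside Jenkins, here is a minimal stand-alone model of it. This is only a sketch: `Job` and `Run` are hypothetical stand-ins for WorkflowJob and WorkflowRun, SCMs are reduced to plain strings, and the repository URL is a placeholder. Only the control flow mirrors the getSCMs snippet quoted above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstBuildScmDemo {

    /** Hypothetical stand-in for a finished build and the SCM keys it checked out. */
    static final class Run {
        final boolean successful;
        final List<String> checkoutKeys;

        Run(boolean successful, List<String> checkoutKeys) {
            this.successful = successful;
            this.checkoutKeys = checkoutKeys;
        }
    }

    /** Hypothetical stand-in for the job; it only knows about completed builds. */
    static final class Job {
        private final List<Run> completed = new ArrayList<>();

        void complete(Run run) {
            completed.add(run);
        }

        private Run lastSuccessful() {
            for (int i = completed.size() - 1; i >= 0; i--) {
                if (completed.get(i).successful) {
                    return completed.get(i);
                }
            }
            return null;
        }

        private Run lastCompleted() {
            return completed.isEmpty() ? null : completed.get(completed.size() - 1);
        }

        /** Same control flow as the quoted method: SCMs come from an earlier build. */
        Collection<String> getSCMs() {
            Run b = lastSuccessful();
            if (b == null) {
                b = lastCompleted();
            }
            if (b == null) {
                // No build has finished yet, so nothing is known about the SCM.
                return Collections.emptySet();
            }
            Map<String, String> scms = new LinkedHashMap<>();
            for (String key : b.checkoutKeys) {
                scms.put(key, key);
            }
            return scms.values();
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        // While build #1 is still running, no build has completed.
        System.out.println("during first build:  " + job.getSCMs());
        // Build #1 finishes; build #2 can now see its checkout.
        job.complete(new Run(true, List.of("git https://example.invalid/repo.git")));
        System.out.println("during second build: " + job.getSCMs());
    }
}
```

In this model the first call has nothing to report because the running build is not yet part of the completed history, which matches the NullSCM messages in the first-build log.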

      On the other hand, looking at the contents of the WorkflowJob instance, there is a GitSCM instance available in its properties.

      It is held by an instance of the [BranchJobProperty|https://github.com/jenkinsci/workflow-multibranch-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/multibranch/BranchJobProperty.java] class from the workflow-multibranch plugin.

      Because the workflow-multibranch plugin uses workflow-job as a library and has access to its classes, but not vice versa, WorkflowJob cannot simply apply an “if-multibranch-then-get-from-property” solution without diving into a reflection mess.

      What I think could help is either to change how the Forensics plugin’s ScmResolver discovers the multibranch pipeline’s SCM, or to modify WorkflowJob so that it does not give up when no builds have been executed yet and obtains that information some other way. Both solutions could be implemented without affecting the existing behavior, but I don’t know which of the two places would be the more reasonable one to change.
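The first option could be sketched roughly as below. This is not the actual ScmResolver code or a proposed patch: `resolve` and both suppliers are hypothetical names, and SCMs are again represented as strings. The sketch only shows the "use build history if present, otherwise fall back to the branch configuration" ordering, which leaves the result unchanged for every job that already has a completed build.

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class ScmFallbackSketch {

    /**
     * Hypothetical resolution order: prefer what earlier builds recorded and
     * consult the branch configuration only when the history is empty.
     */
    static <S> Optional<S> resolve(Supplier<Collection<? extends S>> fromBuildHistory,
                                   Supplier<Optional<S>> fromBranchConfig) {
        Collection<? extends S> recorded = fromBuildHistory.get();
        if (!recorded.isEmpty()) {
            // Existing behavior: the first SCM recorded by a previous build wins.
            return Optional.of(recorded.iterator().next());
        }
        // New behavior for a first build: ask the branch configuration instead.
        return fromBranchConfig.get();
    }

    public static void main(String[] args) {
        // Assumption: in Jenkins this value would come from the BranchJobProperty.
        Supplier<Optional<String>> branchConfig =
                () -> Optional.of("git (from branch configuration)");

        Supplier<Collection<? extends String>> noHistory = () -> List.<String>of();
        Supplier<Collection<? extends String>> withHistory =
                () -> List.of("git (from previous build)");

        System.out.println("first build: " + resolve(noHistory, branchConfig).orElse("none"));
        System.out.println("later build: " + resolve(withHistory, branchConfig).orElse("none"));
    }
}
```

Passing the two sources in as suppliers keeps the fallback lazy, so the branch configuration is read only when the build history has nothing to offer.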

        1. image-2021-01-08-20-20-31-953.png
          159 kB
          Konrad Ponichtera
        2. image-2021-01-11-19-32-59-899.png
          90 kB
          Konrad Ponichtera
        3. Jenkinsfile
          0.8 kB
          Konrad Ponichtera
        4. Screenshot from 2021-01-08 18-44-08.png
          54 kB
          Konrad Ponichtera
        5. Screenshot from 2021-01-11 20-21-30.png
          64 kB
          Konrad Ponichtera
        6. Screenshot from 2021-01-11 20-23-57.png
          53 kB
          Konrad Ponichtera

            drulli Ulli Hafner
            konpon96 Konrad Ponichtera