We have a specific workflow that led us to discover this bug. However, it could, and probably will, affect other people soon.

      When running SCM checkouts in parallel in a Jenkinsfile, the changes discovered by one checkout are reported for every git checkout that runs at the same time.

      In our example, we run checkouts with 8 threads in parallel (background: https://roidelapluie.be/blog/2016/11/18/gitslave-jenkins/ ).

      One change is displayed in 8 different repositories, the 8 repositories that are checked out at the same time.

      The subsequent checkouts do not contain the changes, so I guess it is some kind of global variable that is reset after the checkout.

      lock('checkout') {
          def checkouts = [:]
          def threads = 8
      
          stage('Super Repo') {
              checkout([$class: 'GitSCM',
                      branches: [[name: branch]],
                      doGenerateSubmoduleConfigurations: false,
                      extensions: [[$class: 'CleanCheckout'], [$class: 'ScmName', name: 'super']],
                      submoduleCfg: [],
                      userRemoteConfigs: [[url: "${gitBaseUrl}/${superRepo}"]]])
      
              def repos = readFile('.gitslave')
              def reposLines = repos.readLines()
              for (int i = 0; i < threads; i++) {
                  checkouts["Thread ${i}"] = []
              }
              def idx = 0
              for (line in reposLines) {
                  def repoInfo = line.split(' ')
                  def repoUrl = repoInfo[0]
                  def repoPath = repoInfo[1]
                  def curatedRepoUrl = repoUrl.substring(4, repoUrl.length()-1)
                  def curatedRepoPath = repoPath.substring(1, repoPath.length()-1)
                  echo("Thread ${idx%threads}")
                  checkouts["Thread ${idx%threads}"] << [path: curatedRepoPath, url: "${gitBaseUrl}/${curatedRepoUrl}"]
                  idx++
              }
          }
          stage('GitSlave Repos') {
              def doCheckouts = [:]
              for (int i = 0; i < threads; i++) {
                  def j = i
                  doCheckouts["Thread ${j}"] = {
                      for (co in checkouts["Thread ${j}"]) {
                          retry(3) {
                              checkout([$class: 'GitSCM',
                                      branches: [[name: branch]],
                                      doGenerateSubmoduleConfigurations: false,
                                      extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: co.path], [$class: 'CleanCheckout'], [$class: 'ScmName', name: co.path]],
                                      submoduleCfg: [],
                                      userRemoteConfigs: [[url: co.url]]])
                          }
                      }
                  }
              }
              parallel doCheckouts
          }
      }

          [JENKINS-39968] git checkouts are not pipeline-parallel safe

          Julien Pivotto added a comment -

          PS: I originally set this as major because of potential side effects of this bug. Maybe it affects all the SCM plugins, or affects other things than changelogs.

          Mark Waite added a comment -

          Could you explain more deeply what you were expecting instead of what you observed? You said:

          One change is displayed in 8 different repositories, the 8 repositories that are checked out at the same time

          I interpret that to mean that you were expecting the change to only be found in zero or one of the eight repositories. If so, then that is different than what I expect. I expect that a single SHA1 is selected as the checkout for the duration of the job, unless the user specifically configures the definition in the checkout step to use something different. If that isn't the case, then we risk that a user running a build may have a later checkout which fails to use the same SHA1 as the original checkout.


          Julien Pivotto added a comment - - edited

          We check out 30 different git repositories.

          We set the branch to 'master'.

          The plugins pick the master branch of all the repositories.

          But,

          If there is a change in one repo, the plugin will display the change in all the repos that are checked out at the same time (in other parallel() threads), even if the commit is not in that repo.


          pixman20 added a comment -

          This still appears to be a problem.

          I noticed another problem today and finally realized several issues I've been seeing are related to Git not working with parallel steps in pipelines.

          From what I can tell, the issue is that the changelog#.xml gets used for each checkout running at the same time.

          For example, I have a job that does a parallel checkout of 3 different repositories.

          On that run, each of them will write to changelog4.xml, which means the last one wins.

          Also, in the build.xml for the run, each "org.jenkinsci.plugins.workflow.job.WorkflowRun_-SCMCheckout" node references changelog4.xml.

          This is also probably why in the past sometimes links will go to the wrong repository.

          At the time of the build what's in memory is fine, however after the memory is GC'ed and reloaded or Jenkins is restarted, the data from the changelog XMLs gets reloaded and (in my case) the changes for 1 repository are listed 3 times (1 for each of the different checkouts).

           

          I did a test and ran the same set of changes in another job that did a sequential checkout, and everything worked correctly, with each getting their own respective changelog#.xml.

          FYI markewaite
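
          For comparison, the sequential variant described in the comment above could look roughly like the sketch below. It reuses the `checkouts`, `threads` and `branch` variables from the script in the issue description and simply drops the `parallel` step; the stage name is made up for illustration. Whether it avoids the shared changelog file on a given installation is an assumption based on the test reported above, and it trades checkout speed for separate changelogs.

          ```groovy
          // Sketch only: the same checkouts as the 'GitSlave Repos' stage,
          // run one after another instead of inside parallel().
          // Assumes `checkouts`, `threads` and `branch` are defined as in the original script.
          stage('GitSlave Repos (sequential)') {
              for (int i = 0; i < threads; i++) {
                  for (co in checkouts["Thread ${i}"]) {
                      retry(3) {
                          checkout([$class: 'GitSCM',
                                  branches: [[name: branch]],
                                  doGenerateSubmoduleConfigurations: false,
                                  extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: co.path], [$class: 'CleanCheckout'], [$class: 'ScmName', name: co.path]],
                                  submoduleCfg: [],
                                  userRemoteConfigs: [[url: co.url]]])
                      }
                  }
              }
          }
          ```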


          Kevin Phillips added a comment -

          I just created and linked a new bug to this one (JENKINS-54732), which I think may be related. My test case suggests that the SCM checkout function is not thread safe, sharing some global data or settings or something across invocations. This results in incorrect reference data describing the checkout to be propagated elsewhere - in my case back to the caller in the form of the data returned from the function call. That being the case this could explain why there are other problems / bugs like this one when running checkouts / clones in parallel.

          Tiago Santos added a comment - - edited

          We are experiencing the same issue in our Jenkins installation while checking out many git repositories in parallel.
          I tried to expose and fix the problem in this PR: https://github.com/jenkinsci/workflow-scm-step-plugin/pull/31

          Hope you guys can review it (or fix it if there is a better way).


          Jesse Glick added a comment -

          JENKINS-54732 seems distinct: about the return value of git / checkout; whereas IIUC this is about the build changelog, possibly making it a duplicate of JENKINS-34313.


            Assignee: Unassigned
            Reporter: Julien Pivotto (roidelapluie)
            Votes: 5
            Watchers: 10