
Revision information produced by pipeline 'checkout' operation isn't parallel safe

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • git-plugin, pipeline
    • None
    • Jenkins v2.148
      Pipeline Plugin v2.5
      Git Plugin v3.9.1
      Git Client Plugin v2.7.3

      I recently ran some test builds on our build farm to check out the "new" return data produced by the scm "checkout" function. In our case we are using Git as our source repository and are doing checkouts using code that looks like this:

       

      def branchName = 'MyBranch'
      // checkout returns a map of SCM metadata (e.g. GIT_COMMIT)
      def res = checkout(
          changelog: true,
          poll: false,
          scm: [
              $class: 'GitSCM',
              branches: [[name: branchName]],
              browser: [$class: 'GitLab', repoUrl: 'path/to/repo', version: '10.0'],
              doGenerateSubmoduleConfigurations: false,
              extensions: [
                  [$class: 'CloneOption', depth: 0, noTags: false, reference: '', shallow: false, timeout: 90]
              ],
              submoduleCfg: [],
              userRemoteConfigs: [[credentialsId: 'git_creds', url: 'path/to/repo.git']]
          ]
      )
      echo "Results from checkout are " + res.toString()
       
      This code works well enough on its own, until you try to run it in parallel. In our case we have 2 parallel stages that check out code from the same Git repository but use 2 different branches. When you examine the output of the "echo" statement at the end of the snippet, you will see that the revision information returned (i.e. res.GIT_COMMIT) reflects the correct revision for the "first" clone operation that runs, but NOT for the "second" one (which stage that is depends on which of the parallel stages runs first). The second clone operation seems to return result data from the first, almost as if the two clone processes share common data behind the scenes. This makes it impossible to rely on the results of a "checkout" call when performing the operation in parallel.
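
The scenario described above can be sketched roughly as follows. This is a minimal illustration, not the exact job from our build farm: the helper name `checkoutBranch`, the branch names `BranchA` and `BranchB`, and the use of a single node with one sub-directory per branch are all assumptions made for the example.

```groovy
// Hypothetical helper: checks out one branch and returns the metadata map
// produced by the checkout step. Repo URL and credentials ID are placeholders.
def checkoutBranch(String branchName) {
    return checkout(
        changelog: true,
        poll: false,
        scm: [
            $class: 'GitSCM',
            branches: [[name: branchName]],
            userRemoteConfigs: [[credentialsId: 'git_creds', url: 'path/to/repo.git']]
        ]
    )
}

node {
    parallel(
        first: {
            // Assumption: each branch gets its own sub-directory of the workspace
            dir('first') {
                def res = checkoutBranch('BranchA')
                echo "first: GIT_COMMIT=${res.GIT_COMMIT}"
            }
        },
        second: {
            dir('second') {
                def res = checkoutBranch('BranchB')
                echo "second: GIT_COMMIT=${res.GIT_COMMIT}"
            }
        }
    )
}
```

If the two branches point at different commits, the two echoed GIT_COMMIT values should differ. The problem reported here is that the checkout that runs second echoes the same value as the first.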

          [JENKINS-54732] Revision information produced by pipeline 'checkout' operation isn't parallel safe

          Kevin Phillips added a comment - I am linking defect JENKINS-39968 to this one since I have a gut feeling that the root cause of my defect is related to the root cause of that one. If the "checkout" function holds some shared state that is not thread safe, that could explain why someone might get inconsistent results elsewhere, such as in the change logs, when running checkouts in parallel.

          Kevin Phillips added a comment - Linking another related task to this one, which I think may have been closed solely due to the lack of detail on the bug report. However, based on the brief description of the problem it sounds like it's the same thing I've encountered here. That means I'm likely not the first person to report this problem.

          Mark Waite added a comment -

          In the short term, you'll need to use shell commands to extract the information you need rather than relying on the return values from `checkout scm`. That should allow you to get the information you need without waiting for a fix.
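
A minimal sketch of that workaround, assuming a Unix agent with `git` on the PATH and a call made from inside the directory that was just checked out. The helper name `currentCommit` and the `dir('first')` layout are hypothetical:

```groovy
// Hypothetical helper: asks git directly for the checked-out revision
// instead of trusting the map returned by the checkout step.
def currentCommit() {
    return sh(returnStdout: true, script: 'git rev-parse HEAD').trim()
}

node {
    dir('first') {
        checkout scm
        // Read the revision from the working copy in this directory
        def commit = currentCommit()
        echo "Checked out revision ${commit}"
    }
}
```

Because `git rev-parse HEAD` reads the revision from the working copy in the current directory, it does not depend on whatever state the checkout step keeps internally.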

          Kevin Phillips added a comment -

          Actually it's funny you mention that. This is what we are currently doing and have been doing for quite some time, since before the "checkout" function gained the ability to return metadata. I only recently decided to try the return data, hoping to simplify some of our scripts.

          The fact that the step produces incorrect return data, at least when operations run in parallel, suggests to me that the handling of the data is fundamentally broken in some way. That being the case, people shouldn't be relying on it being correct. Even if they rely on it only for serialized operations, they are likely to hit odd, hard-to-debug problems if they later extend their build scripts to run in parallel. It's a ticking time bomb. I would personally suggest removing the return data completely, to avoid confusion and error, until the functionality can be implemented correctly.

          Also, if I am correct and the root cause of this problem is in fact related to the other problems I've linked above, then that root cause, whatever it may be, may affect systems other than just the return data produced by the build step. The importance / criticality of this defect should therefore be reconsidered, since it may affect more areas, like changeset reporting, and possibly others which may be even more severe (i.e. commit triggers, etc.).

          If an assessment has already been completed and the root cause of this defect is known to be of minimal impact and limited to just the metadata being returned, then I agree: using shell scripts to gather checkout information is a viable workaround. Perhaps you could provide a bit more context surrounding the problem to set users' minds at ease in this regard.

          Mark Waite added a comment -

          Unfortunately, I don't have more context surrounding the problem. I haven't investigated it further than is described in the bug reports that you linked.

          Jesse Glick added a comment -

          Looks more likely to be a duplicate of JENKINS-53346.

            Assignee: Unassigned
            Reporter: Kevin Phillips (leedega)
            Votes: 1
            Watchers: 4