[JENKINS-73321] git lock error jenkins pipeline stage reuse workspace, race?

      We have a pipeline job where a stage is divided into several (17) steps. Each step is run in parallel and should get its own workspace.

      However, Jenkins seems to sometimes allocate the same workspace to multiple steps.

      This shows up as a git lock error: 

      ERROR: Error fetching remote repo 'origin'
      hudson.plugins.git.GitException: Failed to fetch from git@url.com:reponame
      ...
      ... returned status code 1:
      ...
      error: cannot lock ref `refs/remote/origin/feature_A': Unable to create '/jenkinsurl/workspace/jobname@3/.git/refs/remotes/origin/feature_A.lock': File exists.
      
      Another git process seems to be running in this repository...
      

      The same error appears in several steps, and they all refer to the workspace folder with the "@3" suffix.

      This looks like a race condition when Jenkins assigns a workspace folder to each parallel step. Is it possible for us to control the names of the workspace folders manually to avoid this issue? Are we using parallel correctly? Or is this a bug in Jenkins?

      pipeline code:

      #!/usr/bin/env groovy
      
      def items_to_be_tested
      
      pipeline {
      
        agent none
      
        options {
          disableConcurrentBuilds()
          timestamps()
          buildDiscarder logRotator(numToKeepStr: '20')
          allowBrokenBuildClaiming()
          ansiColor('xterm')
        }
      
        stages {
          stage ( "Run on items from CMake" ) {
            stages {
              stage ( "Get list of items from CMake" ) {
                agent any
                steps {
                  dir ( "${WORKSPACE}/items" ) {
                    script { items_to_be_tested = getItemToBeTested() }
                  }
                }
                post { cleanup { deleteDir() } }
              }
              stage ("Run TOOL on items from CMake") {
                steps {
                  script { parallel generateTaskMap(items_to_be_tested) }
                }
              }
            }
          }
        } // stages
      }
      
      def getItemToBeTested() {
        return params.ITEMS.tokenize(" ;,:").join(";")
      }
      
      def getTaskBody(item) {
        return {
          // Grab lock to limit parallel tasks to MAX (~8)
          lock( label: 'TOOL_lock', quantity: 1) {
            stage("${item}") {
              node('lsf') {
      
                checkout(scm)
      
                // Run tests here
                sleep 1200  // running TOOL takes roughly 20 minutes
      
                // post directive is not supported here, please do clean up after you are done
                deleteDir()
              }
            }
          }
        }
      }
      
      
      def generateTaskMap(item_list) {
        def stages = [:]  // declared with def so it stays local to this method
        def expanded_item_list = item_list.split(';') as List
        echo "Item list: ${expanded_item_list}"
        expanded_item_list.each {
          echo "Adding tests for ${it}"
          stages["TOOL on ${it}"] = getTaskBody(it)
        }
        return stages
      }
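
On the question of controlling workspace folder names: Pipeline provides a `ws` step that allocates a workspace at a given path. The sketch below is a hypothetical variant of getTaskBody() that wraps the checkout in ws(). The "-${item}" suffix is an assumption made for illustration, it presumes that item names are valid directory names, and it has not been verified to avoid the lock error reported here.

```groovy
// Hypothetical variant of getTaskBody(): each parallel branch requests an
// explicitly named workspace instead of relying only on the automatic "@N" suffix.
def getTaskBody(item) {
  return {
    // Same lock as in the original pipeline, limiting the number of parallel tasks
    lock(label: 'TOOL_lock', quantity: 1) {
      stage("${item}") {
        node('lsf') {
          // env.WORKSPACE is the workspace that node() allocated on this agent.
          // The per-item suffix is an assumption chosen for this example.
          ws("${env.WORKSPACE}-${item}") {
            checkout(scm)

            // Run tests here

            // Clean up the per-item workspace when done
            deleteDir()
          }
        }
      }
    }
  }
}
```

Because the path is derived from the item name, two branches that test different items would not share a checkout directory even if node() gave them the same workspace.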
      

          David created issue -

          David added a comment -

          I'm sorry for the messed up code, I hope it's understandable anyway.


          Mark Waite added a comment -

          The Pipeline fragment that you provided is not enough for others to duplicate the problem as far as I can tell. Please provide a more complete Pipeline sample that will allow others to duplicate the problem.

          Jenkins no longer supports either RHEL 6 or RHEL 7. Plan your upgrade to a supported operating system. I don't think the operating system affects this issue, but it is worth warning you that you're using operating systems that have not been tested by the Jenkins project for many months (or in the case of RHEL 6, years).

          David made changes -
          Description: removed the inline Pipeline fragment below and moved the error log from {{...}} monospace markup into a {code} block. The rest of the text is unchanged.

          stage ("Run") {
            steps {
              script { parallel my_steps() }  // my_steps returns a dict(?) (`[:]`)
            }
          }

          David made changes -
          Description: appended the section "pipeline code:" with the full pipeline script in a {code:groovy} block. The script matches the pipeline code in the description above, except that the second stage was named "Run Spyglass on items from CMake".

          David made changes -
          Comment [ Here is a more comprehensive pipeline example with hopefully correct formatting: (the same {code:groovy} pipeline script that was added to the description, with the stage named "Run Spyglass on items from CMake") ]

          David added a comment - - edited

          Thanks for the feedback Mark.

          Issue updated with comprehensive pipeline source code, and correctly formatted.

          David made changes -
          Description: renamed the stage "Run Spyglass on items from CMake" to "Run TOOL on items from CMake" in the pipeline code. Nothing else changed.

          David added a comment -

          markewaite Do you think it's reproducible now?

          How are agents supposed to work? Why do several steps seem to reuse the @3 folder?

          Will each parallel step run in the same workspace because of the agent being on another level in the pipeline script? Will the git checkouts occur at the same time due to the manual checkout(scm) in the leaf stage?


            Assignee: Unassigned
            Reporter: David (david_mse)
            Votes: 0
            Watchers: 2