    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • Component: git-plugin
    • Environment: Windows Server 2008R2; Jenkins 1.54.3; Git Plugin 2.2.1

      We randomly hit a failure in the Git SCM step: it hangs in the fetch for a long time and then times out. When it times out, it reports:

      02:56:20 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://bmcdiags@.../ghts/ta +refs/heads/*:refs/remotes/origin/*" returned status code -1:
      02:56:20 stdout:
      02:56:20 stderr: Could not create directory 'c/Users/Administrator/.ssh'.

          [JENKINS-24454] Windows GIT SCM fetch code hung

          sharon xia added a comment -

          02:36:16 Started by upstream project "echidna-patch-quality" build number 335
          02:36:16 originally caused by:
          02:36:16 Started by command line by xxx
          02:36:16 [EnvInject] - Loading node environment variables.
          02:36:17 Building remotely on ECHIDNA-QUALITY (6.1 windows-6.1 windows amd64-windows amd64-windows-6.1 amd64) in workspace c:\buildfarm-slave\workspace\echidna-patch-compile
          02:36:18 > git rev-parse --is-inside-work-tree
          02:36:19 Fetching changes from the remote Git repository
          02:36:19 > git config remote.origin.url ssh://@...:/ghts/ta
          02:36:20 Fetching upstream changes from ssh://@...:/ghts/ta
          02:36:20 > git --version
          02:36:20 > git fetch --tags --progress ssh://@...:/ghts/ta +refs/heads/*:refs/remotes/origin/*
          02:56:20 ERROR: Timeout after 20 minutes
          02:56:20 FATAL: Failed to fetch from ssh://@...:/ghts/ta
          02:56:20 hudson.plugins.git.GitException: Failed to fetch from ssh://bmcdiags@10.110.61.117:30000/ghts/ta
          02:56:20 at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:623)
          02:56:20 at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:855)
          02:56:20 at hudson.plugins.git.GitSCM.checkout(GitSCM.java:880)
          02:56:20 at hudson.model.AbstractProject.checkout(AbstractProject.java:1414)
          02:56:20 at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:671)
          02:56:20 at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          02:56:20 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:580)
          02:56:20 at hudson.model.Run.execute(Run.java:1684)
          02:56:20 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          02:56:20 at hudson.model.ResourceController.execute(ResourceController.java:88)
          02:56:20 at hudson.model.Executor.run(Executor.java:231)
          02:56:20 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://@...:/ghts/ta +refs/heads/*:refs/remotes/origin/*" returned status code -1:
          02:56:20 stdout:
          02:56:20 stderr: Could not create directory 'c/Users/Administrator/.ssh'.
          02:56:20
          02:56:20 at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1325)
          02:56:20 at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1186)
          02:56:20 at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$200(CliGitAPIImpl.java:87)
          02:56:20 at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:257)
          02:56:20 at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:153)
          02:56:20 at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:146)
          02:56:20 at hudson.remoting.UserRequest.perform(UserRequest.java:118)
          02:56:20 at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          02:56:20 at hudson.remoting.Request$2.run(Request.java:326)
          02:56:20 at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
          02:56:20 at java.util.concurrent.FutureTask.run(Unknown Source)
          02:56:20 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          02:56:20 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          02:56:20 at hudson.remoting.Engine$1$1.run(Engine.java:63)
          02:56:20 at java.lang.Thread.run(Unknown Source)

          Daniel Beck added a comment -

          Not a core issue, and likely not a Git issue either. Have you tried creating the directory Jenkins fails to create? Or tried Google?

          Mark Waite added a comment -

          You might review this stackoverflow posting for ideas of things you might try. It is not exactly the same case, but it was one of the first items when I searched google with that error message.

          Mark Waite added a comment -

          Since you report it is a random failure, and it seems to be Windows specific, you might also try to accelerate the frequency at which you encounter the problem by defining multiple jobs which use the same ssh authenticated repository, with a local reference repository (to reduce the cloning data transfer), then run the jobs concurrently.

          If the problem is a file locking problem with the C:\Users\Administrator\.ssh directory, or with a file in that directory, then running many jobs in parallel should make it happen much more often, and may give you a chance to see other hints which may suggest what is causing the locking problem.
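
The concurrent reproduction suggested above can also be approximated outside Jenkins. The following is a minimal sketch, assuming each workspace is an existing clone whose `origin` remote uses the same ssh URL; the function names, the default command, and the timeout value are illustrative and do not come from the git plugin.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_fetch(workspace, command=("git", "fetch", "--tags", "--progress", "origin"), timeout=120):
    """Run one command in `workspace`; return (workspace, outcome, stderr).

    `outcome` is the exit code, or the string "timeout" if the command did not finish.
    """
    try:
        done = subprocess.run(list(command), cwd=workspace,
                              capture_output=True, text=True, timeout=timeout)
        return workspace, done.returncode, done.stderr
    except subprocess.TimeoutExpired:
        # subprocess.run kills only the direct child on timeout.
        return workspace, "timeout", ""


def run_concurrently(workspaces, **kwargs):
    """Start the same command in every workspace at once, one thread per workspace."""
    with ThreadPoolExecutor(max_workers=len(workspaces)) as pool:
        return list(pool.map(lambda ws: run_fetch(ws, **kwargs), workspaces))
```

Calling `run_concurrently([r"c:\ws\job1", r"c:\ws\job2", r"c:\ws\job3"])` (hypothetical paths) and looking for `"timeout"` outcomes or `.ssh` messages in the returned stderr would show whether the failure occurs without Jenkins in the picture.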

          sharon xia added a comment -

          You are right, Waite! We are running 3 jobs concurrently on the same Win 7 slave, pulling from the same repository. Two of them succeed and one fails. The failure is random. As a workaround, I am now running the jobs serially to see whether that gets rid of the issue. Is there any other log I can provide?

          Mark Waite added a comment -

          The message is coming from the stderr output of the "git fetch" command as far as I can tell. That would usually mean that a change to fix the issue would be needed inside the "git" program, external to Jenkins.

          If the C:\users\Administrator\.ssh directory does not exist, you can create it by logging in as Administrator, and entering the command "ssh-keygen" from a "Git Bash" shell.
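
Before creating anything, it may help to check which home directory the build process actually sees. This is a small diagnostic sketch; that `HOME` and `USERPROFILE` are the relevant variables for the bundled ssh is an assumption, and `ssh_dir_report` is a hypothetical helper name.

```python
import os
from pathlib import Path


def ssh_dir_report(env=None):
    """Return {variable: (candidate .ssh path, exists?)} for home-like variables.

    Assumption: the ssh client derives its .ssh location from one of these variables.
    """
    env = os.environ if env is None else env
    report = {}
    for name in ("HOME", "USERPROFILE"):
        value = env.get(name)
        if value:
            candidate = Path(value) / ".ssh"
            report[name] = (str(candidate), candidate.is_dir())
    return report
```

Running `print(ssh_dir_report())` from a build step on the slave, rather than from an interactive login, shows the values in the environment the failing job really uses.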

          sharon xia added a comment -

          Of course this directory exists; otherwise the other two jobs could not succeed and we would not be able to clone the repository. This is the folder that holds the ssh public key and known hosts.

          The strange thing is that we did not hit this issue before July. The changes we made: upgraded the git plugin from 1.1.6 to 2.2.1, upgraded Jenkins from 1.532.3 to 1.554.3, and changed the git repository URL (I don't think the URL change is related to this issue, as the other two jobs clone successfully).

          Mark Waite added a comment -

          Changing the git plugin from 1.1.6 to 2.2.1 also changed from relying on per client credential configuration to using the Jenkins credentials plugin for credential management. I don't know why the git fetch command thinks it needs to create (or lock) the %HOME%\.ssh directory, but that is the challenge you're trying to resolve.

          You could check if the JGit implementation inside the git client plugin is better at handling this case. You enable JGit from the "Manage Jenkins" page, where you add a git implementation named "jgit" from the pick list.

          sharon xia added a comment -

          I can reliably reproduce this issue on other Windows slaves as well. I believe this is a bug in the git plugin. We didn't see this issue before. Polling more than one job concurrently from the same node triggers the issue in a random manner. The node uses ssh for cloning.

          Mark Waite added a comment -

          I am reasonably confident that it is a bug in the git program (or a bug in the Windows file system and its locking design), not a bug in the git plugin.

          The command which is failing with a timeout is a call to the "git" program as a separate process. The git plugin calls the git program and waits for the git program to either complete or for the timeout to expire. In this case, the timeout expired, probably because of Windows file system locking semantics.
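
The "call an external program and wait for it to finish or for a timeout to expire" pattern described above can be sketched as follows. This is an illustration in Python, not the plugin's Java code; `run_with_timeout` is a hypothetical name, and returning -1 on timeout is chosen only to mirror the status code shown in the log.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Start `command`, wait up to `timeout_seconds`, and kill it if still running.

    Returns (status, stdout, stderr); status is -1 on timeout, else the exit code.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    try:
        out, err = proc.communicate(timeout=timeout_seconds)
        return proc.returncode, out, err
    except subprocess.TimeoutExpired:
        # kill() targets only the direct child; grandchildren (for example an
        # ssh process started by git) are not terminated by this call.
        proc.kill()
        out, err = proc.communicate()
        return -1, out, err
```

The comment in the `except` branch is the point of interest for this thread: a caller that kills only the process it started can leave descendants behind.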

          When you say that you did not see the issue before, were you polling and/or building from multiple concurrent jobs on Windows machines previously?

          What version of the git program are you running on your Windows slaves?

          sharon xia added a comment - - edited

          We are using Git-1.8.4-preview20130916 (msysgit)

          sharon xia added a comment -

          Hi,

          We hit this issue again on another slave node.

          We see a number of git processes on the slave node when this issue happens.

          It often happens when a user cancels a task during the git fetch step. The git process is not killed properly, and after a while there are a bunch of leftover git processes on the slave.

          Here is the output. We have to restart the Jenkins service on the slave node to get it working again:

          Started by user XXX
          [EnvInject] - Loading node environment variables.
          Building remotely on GPS-NODE (x86-windows-6.1 6.1 x86-windows windows-6.1 windows x86) in workspace d:\hudson-slave\workspace\Andy_Dev_Branch
          > git rev-parse --is-inside-work-tree
          Fetching changes from the remote Git repository
          > git config remote.origin.url ssh://git@hardware.corp.emc.com:7999/bf/uefi_bios_moons.git
          Cleaning workspace
          > git rev-parse --verify HEAD
          Resetting working tree
          > git reset --hard
          > git clean -fdx
          Fetching upstream changes from ssh://git@****:7999/bf/uefi_bios_moons.git
          > git --version
          > git fetch --tags --progress ssh://git@***:7999/bf/uefi_bios_moons.git +refs/heads/*:refs/remotes/origin/*
          FATAL: Failed to fetch from ssh://git@****:7999/bf/uefi_bios_moons.git
          hudson.plugins.git.GitException: Failed to fetch from ssh://git@****:7999/bf/uefi_bios_moons.git
          at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:623)
          at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:855)
          at hudson.plugins.git.GitSCM.checkout(GitSCM.java:880)
          at hudson.model.AbstractProject.checkout(AbstractProject.java:1252)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:615)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:524)
          at hudson.model.Run.execute(Run.java:1706)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:232)
          Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://git@***:7999/bf/uefi_bios_moons.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
          stdout:
          stderr: Could not create directory 'c/Users/buildfarmadmin/.ssh'.
          fatal: Could not read from remote repository.

          Please make sure you have the correct access rights
          and the repository exists.

          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1325)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1186)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$200(CliGitAPIImpl.java:87)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:257)
          at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:153)
          at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:146)
          at hudson.remoting.UserRequest.perform(UserRequest.java:118)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:326)
          at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at hudson.remoting.Engine$1$1.run(Engine.java:58)
          at java.lang.Thread.run(Unknown Source)

          sharon xia added a comment -

          Is it possible that there is an issue with Jenkins slave process cleanup?

          sharon xia added a comment -

          There seems to be a git process left running when the job is aborted or killed, which leaves the job hung the next time it starts a new build.

          sharon xia added a comment -

          It has two symptoms:
          1. When multiple jobs fetch git code at the same time (e.g. jobs configured with the same Git SCM repository), we randomly hit this issue. However, the next time you kick off the build, the issue does not occur.
          2. The build always fails with the same message. You have to restart the Jenkins slave to resolve it. We observed a lot of git processes in the process management console.
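
The second symptom can be checked without opening the process console. Below is a rough, Windows-only sketch that asks `tasklist` for running `git.exe` and `ssh.exe` processes; the helper names and the choice of image names are assumptions for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (image name, pid) pairs.

    Lines that are not CSV process rows (such as the INFO line printed
    when nothing matches) are skipped.
    """
    rows = csv.reader(io.StringIO(text))
    return [(row[0], int(row[1])) for row in rows
            if len(row) >= 2 and row[1].isdigit()]


def leftover_processes(image_names=("git.exe", "ssh.exe")):
    """Windows only: list running processes whose image name is in `image_names`."""
    found = []
    for name in image_names:
        result = subprocess.run(
            ["tasklist", "/FO", "CSV", "/NH", "/FI", f"IMAGENAME eq {name}"],
            capture_output=True, text=True, check=True,
        )
        found.extend(parse_tasklist_csv(result.stdout))
    return found
```

Calling `leftover_processes()` on an idle slave, when no build is running, should return an empty list; any entries at that point are candidates for the orphaned processes described above.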

          Maximin added a comment -

          Getting the same error frequently.

          Jenkins - 1.574
          Git Plugin - 2.2.7
          git version 1.9.5.msysgit.1

          While the fetch is stuck, Process Explorer shows that ssh.exe is stuck on the command ssh git@github.faked.com "git-upload-pack 'XYZ/Faked.git'"

          Below is the process tree from Process Explorer.

          jenkins.exe
            java.exe
              git.exe
                git.exe
                  ssh.exe // this one is stuck
          

          While the process is stuck, executing the command ssh git@github.faked.com "git-upload-pack 'XYZ/Faked.git'" from the command line gives a response which ends with

          .........
          005467bd6f492ad36325aea516dfc2f423b1bc5e8dfe refs/tags/branch1
          0057747b9750f2389c6ca630480674a85e1decad2387 refs/tags/branch1^{}
          0000
          Connection to github.faked.com closed by remote host.
          

          From the dump which was generated while the process was hung,

          STACK_TEXT:  
          0028d53c 74ee15f7 00000002 0028d58c 00000001 ntdll!NtWaitForMultipleObjects+0x15
          0028d5d8 76741a0c 0028d58c 0028d600 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
          0028d620 767441f0 00000002 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
          0028d63c 68015424 00000002 0028d694 00000000 kernel32!WaitForMultipleObjects+0x18
          

          The last control flow was to ntdll!NtWaitForMultipleObjects. From the name of the function it seems to be waiting for some resources, which are not known at this point.

          Any ideas on how to fix this, or any workarounds that work?

          Mark Waite added a comment -

          maximin I'm afraid that I have no ideas to offer. You're running a recent version of msysgit (1.9.5), which contains (as far as I know) a recent version of ssh.

          You could try switching to JGit instead of using command line git. There are some use cases which the JGit implementation in the plugin does not support (submodules, pushing tags, and several others), but for simple use cases the JGit implementation is sufficient. You may find that the age of your Jenkins installation (Jenkins 1.574 is now about 2 years old) and the version of the git plugin (2.2.x has been replaced by the 2.3.x series) may be too old to have the most recent JGit implementation fixes, but you could try JGit to see if it resolves your issue.

          Jake Cobb added a comment -

          We're seeing this same problem with:

          Jenkins - 1.617
          Git Plugin - 2.3.5
          git 1.9.0.msysgit.0

          We started getting this problem after upgrading from a quite old version of the Git Plugin (1.4.0), which we were using with the same version of git on the Windows slave (1.9.0.msysgit.0).

          We see mostly the same behavior maximin described. We do not get the error about the .ssh directory mentioned in the original description here.

          Running the git command spawned by the Jenkins slave manually in a git bash shell in the workspace works every time without delay, regardless of whether or not other jobs are hung on it in the same slave. I did this by copying the command line from process explorer on the hung git command and just pasting it in, so it's exactly the same.

          Running just the ssh command gives a response but then hangs; the remote end does not close the connection:

          003c29363ef2df43efb9d3e517e6f78fc7bda2f46f7e refs/tags/help
          0000
          

          However, this behavior should be fine according to the Git protocol; the 0000 indicates the end of the message.
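
To make the framing concrete: in Git's pkt-line format each packet starts with four hex digits giving the total packet length (including those four digits), and `0000` is the flush packet. The sketch below is a minimal parser for captured output; it handles only data packets and the flush packet, and it assumes the captured line ended with a newline, which is consistent with the `003c` length prefix.

```python
def parse_pkt_lines(data):
    """Split raw pkt-line bytes into payloads; a flush packet ("0000") becomes None."""
    packets = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated pkt-line header")
        length = int(header, 16)  # total length, including the 4 header bytes
        if length == 0:           # flush-pkt: the sender is done with this section
            packets.append(None)
            pos += 4
            continue
        if length < 4 or pos + length > len(data):
            raise ValueError("malformed or truncated pkt-line")
        packets.append(data[pos + 4:pos + length])
        pos += length
    return packets
```

Fed the two lines quoted above, the parser returns one ref payload followed by `None`, i.e. the advertisement is complete and the server is waiting for the client's next request rather than hanging on its own.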

          I wonder if there could be a change in input/output buffering when git is run by another process and this is causing some communication deadlock. We can't reproduce this behavior with git alone (version unchanged) and never saw it with the old Git Plugin version.

          Mark Waite added a comment -

          jakecobb I doubt there is a change of input/output buffering when git is run by another process, but you'd need to investigate the git source code to decide that for sure.

          If you're running your Windows slave as a Windows service, then you'll have real difficulty interactively duplicating the environment where the git process runs. You could try running that process from inside a Jenkins job (using a Windows Batch job step, for instance) to see if the same good behavior exists when the git program is run inside a Jenkins job.

          You might also consider updating from msysgit 1.9.0 to the most recent 1.9.5 version. I don't know that it will fix your problem, but there are several useful fixes in the intervening releases between what you're running and the latest version. Among other things, the version of OpenSSH was upgraded between those two versions so that "git clone" using an ssh protocol URL is no longer limited to 1 MB / second download.

          Mark Waite added a comment -

          Closing as "Cannot reproduce"

            Assignee: Unassigned
            Reporter: sharon xia
            Votes: 4
            Watchers: 6