• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • git-plugin
    • Windows Server 2008R2; Jenkins 1.54.3; Git Plugin 2.2.1

      We see random failures of the Git SCM step: it hangs during the fetch for a long time and eventually times out. When it times out, it reports:

      02:56:20 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://bmcdiags@.../ghts/ta +refs/heads/*:refs/remotes/origin/*" returned status code -1:
      02:56:20 stdout:
      02:56:20 stderr: Could not create directory 'c/Users/Administrator/.ssh'.

          [JENKINS-24454] Windows GIT SCM fetch code hung

          sharon xia added a comment - - edited

          We are using Git-1.8.4-preview20130916 (msysgit)

          sharon xia added a comment -

          Hi,

          We hit this issue again on another slave node.

          When the issue occurs, we see a number of git processes on the slave node.

          It often happens when a user cancels a build during the git fetch step. The git process is not killed properly, and after a while a number of orphaned git processes accumulate on the slave.

          Here is the output. We have to restart the Jenkins service on the slave node to get it working again:

          Started by user XXX
          [EnvInject] - Loading node environment variables.
          Building remotely on GPS-NODE (x86-windows-6.1 6.1 x86-windows windows-6.1 windows x86) in workspace d:\hudson-slave\workspace\Andy_Dev_Branch
          > git rev-parse --is-inside-work-tree
          Fetching changes from the remote Git repository
          > git config remote.origin.url ssh://git@hardware.corp.emc.com:7999/bf/uefi_bios_moons.git
          Cleaning workspace
          > git rev-parse --verify HEAD
          Resetting working tree
          > git reset --hard
          > git clean -fdx
          Fetching upstream changes from ssh://git@****:7999/bf/uefi_bios_moons.git
          > git --version
          > git fetch --tags --progress ssh://git@***:7999/bf/uefi_bios_moons.git +refs/heads/*:refs/remotes/origin/*
          FATAL: Failed to fetch from ssh://git@****:7999/bf/uefi_bios_moons.git
          hudson.plugins.git.GitException: Failed to fetch from ssh://git@****:7999/bf/uefi_bios_moons.git
          at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:623)
          at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:855)
          at hudson.plugins.git.GitSCM.checkout(GitSCM.java:880)
          at hudson.model.AbstractProject.checkout(AbstractProject.java:1252)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:615)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:524)
          at hudson.model.Run.execute(Run.java:1706)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:232)
          Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://git@***:7999/bf/uefi_bios_moons.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
          stdout:
          stderr: Could not create directory 'c/Users/buildfarmadmin/.ssh'.
          fatal: Could not read from remote repository.

          Please make sure you have the correct access rights
          and the repository exists.

          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1325)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1186)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$200(CliGitAPIImpl.java:87)
          at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:257)
          at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:153)
          at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:146)
          at hudson.remoting.UserRequest.perform(UserRequest.java:118)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:326)
          at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at hudson.remoting.Engine$1$1.run(Engine.java:58)
          at java.lang.Thread.run(Unknown Source)

          sharon xia added a comment -

          Is it possible that there is an issue with Jenkins slave process cleanup?

          sharon xia added a comment -

          A git process seems to be left running when the job is aborted or killed, which causes the job to hang the next time it starts a new build.

          sharon xia added a comment -

          The issue shows two symptoms:
          1. When multiple jobs fetch git code at the same time (e.g. jobs configured with the same Git SCM repository), we hit this issue at random. However, the next build that is kicked off does not have the issue.
          2. The build always fails with the same message, and the Jenkins slave has to be restarted to resolve it. In this case we see a lot of git processes in the process management console.
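To check for the leftover processes described in the second symptom without opening a console, something like the following could be run on the slave. This is only a minimal sketch: it assumes the text comes from the Windows command `tasklist /FO CSV /NH`, the function name `find_processes` is hypothetical, and the `SAMPLE` rows are made-up illustrative data, not output from the reporter's machine.

```python
import csv
import io


def find_processes(tasklist_csv, names=("git.exe", "ssh.exe")):
    """Return (image_name, pid) pairs for the given image names.

    `tasklist_csv` is expected to be the text printed by
    `tasklist /FO CSV /NH` (assumption: image name in column 1, PID in column 2).
    """
    wanted = {n.lower() for n in names}
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Skip blank or malformed rows; match image names case-insensitively.
        if len(row) >= 2 and row[0].lower() in wanted:
            found.append((row[0], int(row[1])))
    return found


# Hypothetical sample input, for illustration only.
SAMPLE = '''"java.exe","4120","Services","0","512,340 K"
"git.exe","5332","Services","0","6,104 K"
"git.exe","5340","Services","0","5,980 K"
"ssh.exe","6012","Services","0","4,212 K"
'''

if __name__ == "__main__":
    for name, pid in find_processes(SAMPLE):
        print(name, pid)
```

On a real slave the input could be obtained with `subprocess.check_output(["tasklist", "/FO", "CSV", "/NH"], text=True)`. A growing list of `git.exe` or `ssh.exe` entries while no build is running would match the symptom above.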

          Maximin added a comment -

          We are getting the same error frequently.

          Jenkins - 1.574
          Git Plugin - 2.2.7
          git version 1.9.5.msysgit.1

          While the fetch is stuck, Process Explorer shows that ssh.exe is stuck on the command ssh git@github.faked.com "git-upload-pack 'XYZ/Faked.git'"

          Below is the process tree from Process Explorer:

          jenkins.exe
            java.exe
              git.exe
                git.exe
                  ssh.exe // this one is stuck
          

          While the process is stuck, running the command ssh git@github.faked.com "git-upload-pack 'XYZ/Faked.git'" from the command line gives a response that ends with:

          .........
          005467bd6f492ad36325aea516dfc2f423b1bc5e8dfe refs/tags/branch1
          0057747b9750f2389c6ca630480674a85e1decad2387 refs/tags/branch1^{}
          0000
          Connection to github.faked.com closed by remote host.
          

          From the dump generated while the process was hung:

          STACK_TEXT:  
          0028d53c 74ee15f7 00000002 0028d58c 00000001 ntdll!NtWaitForMultipleObjects+0x15
          0028d5d8 76741a0c 0028d58c 0028d600 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
          0028d620 767441f0 00000002 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
          0028d63c 68015424 00000002 0028d694 00000000 kernel32!WaitForMultipleObjects+0x18
          

          The last call on the stack is ntdll!NtWaitForMultipleObjects. Judging by the function name, the thread seems to be waiting on some resource, which is not known at this point.

          Any ideas on how to fix this, or any workarounds that are known to work?

          Mark Waite added a comment -

          maximin I'm afraid that I have no ideas to offer. You're running a recent version of msysgit (1.9.5), which contains (as far as I know) a recent version of ssh.

          You could try switching to JGit instead of using command line git. There are some use cases which the JGit implementation in the plugin does not support (submodules, pushing tags, and several others), but for simple use cases the JGit implementation is sufficient. You may find that the age of your Jenkins installation (Jenkins 1.574 is now about 2 years old) and the version of the git plugin (2.2.x has been replaced by the 2.3.x series) may be too old to have the most recent JGit implementation fixes, but you could try JGit to see if it resolves your issue.

          Jake Cobb added a comment -

          We're seeing this same problem with:

          Jenkins - 1.617
          Git Plugin - 2.3.5
          git 1.9.0.msysgit.0

          We started getting this problem after upgrading from quite an old version of the Git Plugin (1.4.0), which we were using with the same version of git on the Windows slave (1.9.0.msysgit.0).

          We see mostly the same behavior maximin described. We do not get the error about the .ssh directory mentioned in the original description here.

          Running the git command spawned by the Jenkins slave manually in a git bash shell in the workspace works every time without delay, regardless of whether or not other jobs are hung on it in the same slave. I did this by copying the command line from process explorer on the hung git command and just pasting it in, so it's exactly the same.

          Running just the ssh command gives a response but then hangs; the remote end does not close the connection:

          003c29363ef2df43efb9d3e517e6f78fc7bda2f46f7e refs/tags/help
          0000
          

          However, this behavior should be fine under the Git protocol: the 0000 indicates the end of the message.
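To illustrate the framing being referred to: in Git's pkt-line format each packet starts with four hex digits giving its total length (including those four digits), and `0000` is a flush packet marking the end of a message. The sketch below decodes the two packets quoted above. The function name `parse_pkt_lines` is hypothetical, the trailing newline in `STREAM` is an assumption implied by the length prefix, and the parser covers only plain length-prefixed packets and the flush packet.

```python
def parse_pkt_lines(data: bytes):
    """Split a byte stream into pkt-line payloads.

    Each payload is returned as bytes; a flush-pkt ("0000") is returned as None.
    """
    packets = []
    pos = 0
    while pos + 4 <= len(data):
        # The first four bytes are the packet length in hex, counting themselves.
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            packets.append(None)  # flush-pkt: end of message
            pos += 4
            continue
        if length < 4 or pos + length > len(data):
            raise ValueError("truncated or unsupported pkt-line at offset %d" % pos)
        packets.append(data[pos + 4:pos + length])
        pos += length
    return packets


# The two packets from the ssh output above: 0x003c = 60 bytes, then a flush-pkt.
STREAM = b"003c29363ef2df43efb9d3e517e6f78fc7bda2f46f7e refs/tags/help\n0000"

if __name__ == "__main__":
    for pkt in parse_pkt_lines(STREAM):
        print("flush" if pkt is None else pkt.decode().rstrip("\n"))
```

The length prefix `003c` matches the quoted line exactly (4 + 40 + 1 + 14 + 1 = 60 bytes), so after the advertisement the server is expected to wait for the client's next request rather than close the connection.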

          I wonder if there could be a change in input/output buffering when git is run by another process and this is causing some communication deadlock. We can't reproduce this behavior with git alone (version unchanged) and never saw it with the old Git Plugin version.

          Mark Waite added a comment -

          jakecobb I doubt there is a change of input/output buffering when git is run by another process, but you'd need to investigate the git source code to decide that for sure.

          If you're running your Windows slave as a Windows service, then you'll have real difficulty interactively duplicating the environment where the git process runs. You could try running that process from inside a Jenkins job (using a Windows Batch job step, for instance) to see if the same good behavior exists when the git program is run inside a Jenkins job.
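One way to try this from a build step while guarding against another indefinite hang is to wrap the command in a timeout that also terminates child processes. The following is a rough sketch, not something taken from the plugin: the helper name `run_with_timeout` and the timeout value are hypothetical, and it assumes Python is available on the slave and that `taskkill` (a standard Windows command) is on the PATH.

```python
import os
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it was killed after timeout_s seconds."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        if os.name == "nt":
            # /T terminates the whole process tree (e.g. git.exe and its ssh.exe child),
            # /F forces termination.
            subprocess.call(["taskkill", "/PID", str(proc.pid), "/T", "/F"])
        else:
            # Outside Windows this only kills the direct child process.
            proc.kill()
        proc.wait()  # reap the process so it does not linger
        return None


if __name__ == "__main__":
    # Example with a harmless command; in a job step this could be the git fetch
    # command line copied from the build log.
    code = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_s=60)
    print("exit code:", code)
```

A `None` result from a fetch run this way inside a job, when the same command succeeds from an interactive shell, would point at the service environment rather than at git itself.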

          You might also consider updating from msysgit 1.9.0 to the most recent 1.9.5 version. I don't know that it will fix your problem, but there are several useful fixes in the intervening releases between what you're running and the latest version. Among other things, the version of OpenSSH was upgraded between those two versions so that "git clone" using an ssh protocol URL is no longer limited to 1 MB / second download.

          Mark Waite added a comment -

          Closing as "Cannot reproduce"

            Assignee: Unassigned
            Reporter: sharon xia