
SCM Polling / Max # of concurrent polling = 1 hangs github polling

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: ubuntu x32, java 7x32, jenkins 1.476, git 1.7.9.5

      SCM Polling / Max # of concurrent polling = 1 hangs github polling

      after a while process list shows a lot of entries like this, for each project in jenkins

      jenkins 18383 0.0 0.0 6208 1192 ? S 17:09 0:00 git fetch -t git@github.com:carrot-garden/carrot-util.git +refs/heads/*:refs/remotes/origin/*
      jenkins 18388 0.0 0.0 6300 2332 ? S 17:09 0:00 ssh git@github.com git-upload-pack 'carrot-garden/carrot-util.git'

      These processes appear to be hung;

      GitHub SCM polling no longer works;

      workaround - run periodically:
      "killall git ssh"

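A narrower variant of the periodic "killall git ssh" could target only fetches that have been running for a long time. The sketch below is an illustration under assumptions, not something taken from the report: the helper names, the 600-second threshold and the use of the `etimes` (elapsed seconds) column of `ps` are hypothetical choices, and `etimes` may be missing from older procps releases.

```python
import subprocess


def parse_ps_rows(ps_output):
    """Parse lines of 'PID ELAPSED_SECONDS COMMAND...' into (pid, age, command) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def stale_pids(rows, max_age_seconds=600, needles=("git fetch", "git-upload-pack")):
    """Return PIDs of git/ssh fetch processes older than the (assumed) threshold."""
    return [pid for pid, age, command in rows
            if age > max_age_seconds and any(n in command for n in needles)]


def list_processes(user="jenkins"):
    """Ask ps for the user's processes; separate -o options suppress the header line."""
    result = subprocess.run(
        ["ps", "-u", user, "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=False,
    )
    return parse_ps_rows(result.stdout)
```

The returned PIDs could then be passed to `os.kill(pid, signal.SIGTERM)` from a scheduled job. This only cleans up after the hang; it does not address its cause.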
          [JENKINS-14752] SCM Polling / Max # of concurrent polling = 1 hangs github polling

          Mig Jacq added a comment -

          This is suddenly happening to me too on LTS release 1.554.2 and Ubuntu 12.04 LTS, git version 1.7.9.5. I have verified that it's not a network issue, as GitHub is reachable; it seems the process sometimes just hangs in read() (as seen when strace'ing it).


          Rob Duff added a comment -

          Something similar is happening with me, although it has nothing to do with the "max # of concurrent polling" setting in my case. The ssh process gets hung up when it runs "git-upload-pack". It seems to happen when we're grabbing the same repo on another executor on the same machine at the exact same time. It makes me think that this is the same issue as what you're seeing, but manifested by different means.

          Another problem we get when we grab the same repo on the same machine at the same time is https://issues.jenkins-ci.org/browse/JENKINS-24179. I won't go so far as to say that they are the same issue, but it has some potential to be somehow related. If that is the case, then it might be a git/ssh concurrency issue and not so much of a Jenkins issue. But – I can't reproduce this outside of Jenkins yet, so... I can't really say for sure.

          I'd be interested to know if the other cases happen to have the same repo accessed at the same time on the same machine.
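One way to check that question is to look at a process listing taken during a hang and count how many `git-upload-pack` sessions target the same repository. The following is only a rough sketch: the function names are made up for illustration, and it assumes the `ssh ... git-upload-pack 'owner/repo.git'` command-line shape seen in the listings in this issue.

```python
import re
from collections import Counter

# Matches the quoted repository path in: ssh git@host git-upload-pack 'owner/repo.git'
UPLOAD_PACK = re.compile(r"git-upload-pack\s+'([^']+)'")


def repos_in_flight(ps_lines):
    """Count running git-upload-pack sessions per repository path."""
    counts = Counter()
    for line in ps_lines:
        match = UPLOAD_PACK.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def concurrent_repos(ps_lines):
    """Return repositories that have more than one fetch running at once."""
    return sorted(repo for repo, n in repos_in_flight(ps_lines).items() if n > 1)
```

An empty result during a hang would suggest that simultaneous access to the same repo is not required to trigger it.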


          David Feldsine added a comment - - edited

          I am having the same issue.
          git 1.7.1
          Jenkins 1.492
          Git Plugin 1.4.0
          RedHat 2.6.32-279.5.2.el6.x86_64

          I will go for weeks with no issue then I will have a flurry of them. I would say that it fails about .1% to .01% of the time.

          jenkins  18718  0.0  0.0  10136  1240 ?        S    16:59   0:00 git fetch -t origin +refs/heads/*:refs/remotes/origin/*
          jenkins  18722  0.0  0.0  60040  3048 ?        S    16:59   0:00 ssh git@bitbucket.org git-upload-pack 'company/REPO.git'
          jenkins  29927  0.0  0.0  37336  3640 ?        S    16:01   0:00 git fetch -t origin +refs/heads/*:refs/remotes/origin/*
          jenkins  29931  0.0  0.0  60040  3048 ?        S    16:01   0:00 ssh git@bitbucket.org git-upload-pack 'company/REPO2.git'
          


          Jason Kushmaul added a comment - - edited

          Same issue here. I reproduced this using the latest versions and documented my steps with an empty repo.

          Jenkins master
          	Linux, CentOS 6.5
          	Jenkins ver. 1.588
          	Git Plugin 2.2.7
          	Git client plugin 1.11.0
          Jenkins slave 
          	Windows 8.1 Pro
          	using jnlp installed as service.
          	E:\Jenkins\workspace\TestNewGitBranch>C:\git\cmd\git.exe --version
          		git version 1.9.4.msysgit.2
          	java -version
          		java version "1.7.0_71"
          		Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
          		Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
          
          1. Set up a new empty repo on the git host (GitLab in this case)
            1. Make it public
              ssh://git@builds.me.local/jk/testgitreponewbranch.git
              I can clone it using Jenkins' key on the slave, as the user the service is running as.
          2. Set up a new Jenkins job to be built on the Windows slave.
            1. Restrict to windows node
            2. Set git repo to above url
              ssh://git@builds.me.local/jk/testgitreponewbranch.git
            3. Set branches to build to
              */release/*
            4. Set poll SCM to
              */1 * * * *

              (so we don't have to wait)

            5. Save
          3. Check Git Polling
            I see (No changes)
            That's good, we haven't pushed anything to a release/* branch yet.
            Just wait.
          4. Next you'll see git polling log hanging:
            	Started on Nov 4, 2014 9:21:00 AM
            	Polling SCM changes on WINDOWS
            	Using strategy: Default
            	 > "C:\git\cmd\git.exe" rev-parse --is-inside-work-tree # timeout=10
            	Fetching changes from the remote Git repositories
            	 > "C:\git\cmd\git.exe" config remote.origin.url ssh://git@builds.me.local/jk/testgitreponewbranch.git # timeout=10
            	Fetching upstream changes from ssh://git@builds.me.local/jk/testgitreponewbranch.git
            	 > "C:\git\cmd\git.exe" --version # timeout=10
            	 > "C:\git\cmd\git.exe" fetch --tags --progress ssh://git@builds.me.local/jk/testgitreponewbranch.git +refs/heads/*:refs/remotes/origin/*
            

            You can see the process never ending in taskmgr

          However, running that last command manually (from a command prompt) does not hang; it exits immediately.

          I should add that I do not have any other git polling jobs enabled, and no other jobs reference this git repo. Initially I did not have a branch at all. I tried adding one and pushing it ("release/1.0") so that the job had something to do. This had no effect: the "Fetching upstream changes" process above still hangs, with SSH being the bottommost child process in procexp.
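In the polling log above, the short git commands carry a `# timeout=10` note while the `fetch` line shows none. As an illustration of bounding such a call from outside Jenkins, here is a minimal sketch; `run_with_timeout` is a hypothetical helper, not how the Git plugin does it, and the timeout only kills the direct child, so a grandchild such as ssh may survive.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv and return (returncode, timed_out); the child is killed on timeout."""
    try:
        completed = subprocess.run(
            argv,
            timeout=timeout_seconds,
            stdin=subprocess.DEVNULL,   # the child cannot wait on interactive input
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the direct child here.
        return None, True


# Harmless stand-in for a hung fetch: sleeps 30 s under a 1 s limit.
print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1))
# prints (None, True)
```

For a real check, the argument list would be the `git.exe fetch ...` command from the log, run as the same service user so that the environment matches.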


          Jason Kushmaul added a comment -

          Another update regarding my previous post.

          When I inspect the hung git.exe process using procexp, I can view its "Environment".
          There, "HOME" is set to "/var/lib/jenkins" rather than the usual "C:\users\git".

          So it appears that the Jenkins master is passing its own HOME environment variable through (it is indeed /var/lib/jenkins on my Linux master),
          which is blowing away the usual HOME environment variable on my Windows slave.
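To illustrate the observation about HOME, the sketch below builds an environment for a child git process in which an inherited HOME is replaced by the local profile directory. It is an assumption-based example: `local_home` and `env_for_git` are hypothetical names, and preferring `USERPROFILE` is simply one plausible rule for a Windows node.

```python
import os


def local_home(env):
    """Prefer the Windows profile directory; otherwise keep whatever HOME says."""
    return env.get("USERPROFILE") or env.get("HOME") or os.path.expanduser("~")


def env_for_git(env):
    """Return a copy of env whose HOME points at the local home directory."""
    fixed = dict(env)
    fixed["HOME"] = local_home(env)
    return fixed
```

A call such as `subprocess.run(["git", "fetch", "--tags", "origin"], env=env_for_git(os.environ))` would then start git with the corrected HOME while leaving every other variable untouched.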

          Daniel Beck added a comment -

          Any of you using EnvInject? What happens when you disable it?


          Jason Kushmaul added a comment - - edited

          Paydirt.

          So I added a custom "git" batch script in my C:\Git\Cmd directory as "C:\git\cmd\git.cmd".
          I then set my WINDOWS slave node to use this as my git command instead of "C:\git\cmd\git.exe".

          @rem Do not use "echo off" to not affect any child calls.
          
          @rem Enable extensions, the `verify other 2>nul` is a trick from the setlocal help
          @verify other 2>nul
           
          @rem The above script again with immediate expansion, in case delayed expansion
          @rem is unavailable.
          @for /F "delims=" %%I in ("%~dp0..") do @set git_install_root=%%~fI
          @set PATH=%git_install_root%\bin;%git_install_root%\mingw\bin;%PATH%
          
          @set HOME=%userprofile%
          
          
          @"%git_install_root%\bin\git.exe" %*
          
          :end
          @rem End of script
          
          

          And my git polling no longer hangs.

          This is because I'm setting the HOME directory manually. This isn't the fix for the problem, but it's a very simple workaround.

          Update - Unfortunately this is not a workaround; it only proves the problem exists. Normal git operations suffer with this script. I'm not a batch script pro, so I must be doing something wrong. As I said, this shouldn't be used as a solution, only as a reproduction of the problem (git polling fails without it and succeeds with it).


          Jason Kushmaul added a comment -

          Daniel,

          I have that plugin installed, but not enabled for this project. I could try uninstalling it; good check.

          Jason Kushmaul added a comment -

          Daniel,

          I disabled the EnvInject plugin, restarted Jenkins, and set my Windows Git back to C:\git\cmd\git.exe, which caused the hang before.

          The hang is still present, with the same problem: HOME is the Linux home directory, not the Windows one.

          Jason Kushmaul added a comment -

          For those of you who need a workaround: my previous ".cmd" works well enough for polling, but for some reason it causes normal git operations to fail (such as simply building a project with "Build now").

          Use the following as C:\git\cmd\git.bat

          @set HOME=%USERPROFILE%
          @C:\git\bin\git.exe %*
          

          Then set that as the Git executable for your Windows node.

          This allows git functions such as Build now, the GitLab merge request builder, and polling to all work until the bug is fixed.

            Assignee: Unassigned
            Reporter: Andrei Pozolotin (andrei_pozolotin)
            Votes: 2
            Watchers: 7