• Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Major Major
    • git-plugin
    • None

      We are seeing hung git processes which seem to be left over from when Jenkins is using Git to poll for changes. This is likely because there was some I/O problem (disk or network) when the polling was attempted. Instead of hanging forever, the fetch should eventually time out.

          [JENKINS-11286] Git plugin does not timeout

          Ryan Campbell added a comment -

          Perhaps the easiest approach is to set a global timeout for all Git operations (say 30 minutes) which could be overridden by a system property. This can be enforced in GitAPI.launchCommandIn() by using ProcStarter.joinWithTimeout() instead of just join().

          Instead of a global default for all commands, you could perhaps create an overloaded version of launchCommandIn which accepts a timeout, defaulting to Integer.MAX_VALUE. It looks like this method is already overloaded, so this may be messy.

          Or perhaps it is best to just perform the fetching using a Future/Task thread which could joinWithTimeout. It seems like polling is the one place where a timeout is required, since a build can/should timeout through other mechanisms.
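The idea in this comment (a global timeout with a default, overridable by a system property, enforced by a bounded wait instead of a plain join) can be sketched with the plain JDK. This is only an illustration, not the plugin's code: it uses `Process.waitFor(timeout, unit)` as a stand-in for the Jenkins `joinWithTimeout()` call mentioned above, and the class name `TimedCommand` and the property name `example.git.timeoutMinutes` are hypothetical.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCommand {
    // Hypothetical property name, used only for this sketch.
    static final String TIMEOUT_PROPERTY = "example.git.timeoutMinutes";

    /** Timeout in minutes: the system property if set, otherwise 30. */
    static int timeoutMinutes() {
        return Integer.getInteger(TIMEOUT_PROPERTY, 30);
    }

    /**
     * Runs the command and waits at most the given time for it to exit.
     * Returns the exit code, or throws TimeoutException if the limit is hit.
     */
    static int run(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException, TimeoutException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        try {
            if (process.waitFor(timeout, unit)) {
                return process.exitValue();
            }
            throw new TimeoutException("command exceeded " + timeout + " " + unit);
        } finally {
            // Kill the child on timeout or interruption so it cannot linger.
            if (process.isAlive()) {
                process.destroyForcibly();
            }
        }
    }
}
```

A caller would pass `TimedCommand.timeoutMinutes()` and `TimeUnit.MINUTES` to `run`, so that starting the JVM with `-Dexample.git.timeoutMinutes=60` changes the limit without a code change.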

          Karol Depka Pradzinski added a comment - - edited

          I've started working on this bug. Some questions:
          1. But what if the git process is really doing something that takes longer than the timeout, e.g. pulling some big changes?
          1.1. Maybe another approach would be to limit the number of running/hanging git processes?
          2. Would my changes be useful also if this setting is at first not configurable?

          Karol Depka Pradzinski added a comment -

          I don't see hudson.Launcher.ProcStarter.joinWithTimeout() ... Is this supposed to be written or is this already present in another branch?

          Nicolas De Loof added a comment -

          ProcStarter.start() returns a Proc that has this joinWithTimeout() method

          Ryan Campbell added a comment -

          Answers:

          1. There should be some global timeout which you can tune with System properties. Default to 1 hour or so?
          1.1 While this may be a good idea, it still wouldn't solve polling which takes forever, it will just stop future polling.
          2. Yes, I think so. You can set it high enough that it won't bother people; it will just handle strange conditions.

          Tully Foote added a comment -

          Q3. At some point it does need to timeout. We've found hung polling threads which were 2 weeks old from when our git repo had some downtime.

          And until this is released, does anyone have a suggested way to clear these hung polling threads? I ended up rebooting the slaves which was obviously overkill, but did clear the jobs.

          c Kirschner added a comment -

          1 year and 4 months later this is still an issue. On our Jenkins, git processes regularly (2 or more times a week) stall or hang forever. It's quite annoying when they fill all SCM slots and our entire build server does nothing.
          To get it running again I have to kill all git and git-remote-https processes.

          Olivier Jolit added a comment -

          Same problem here: for each temporary network failure we have to restart our Jenkins installations. Apparently the assignee has had no activity since October 2011, so I guess it's not "In Progress" anymore.

          Karol Depka Pradzinski added a comment -

          Hi Guys. Due to time constraints, I was not able to fix this bug. Earlier I forgot to change the status and un-assign myself.

          Dave Brondsema added a comment -

          Is this fix in the 2.0 release or something else? I'm not seeing it mentioned in the changelog.

          Joel Gallant added a comment - - edited

          I noticed it mentioned as a system property in 1.4.6 - (org.jenkinsci.plugins.gitclient.Git.timeout).

          I tried setting this in the launch, and it shows in the environment variables, but doesn't appear to do anything - my timeout is still @10 minutes...

          Joel Gallant added a comment -

          Ah! Now I see, it's in the 1.4.x maintenance branch!

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Nicolas De Loof
          Path:
          src/main/java/org/jenkinsci/plugins/gitclient/CliGitAPIImpl.java
          http://jenkins-ci.org/commit/git-client-plugin/1b7fd2b18d626d8ca081933d8a004fd7b2279210
          Log:
          JENKINS-11286 run git commands with a time-out

          Don Ross added a comment -

          Why, oh why, was this implemented without an option to disable it?

          Steve Cohen added a comment - - edited

          Agree with Don Ross

          I have a valid, albeit rare, use case for running without a timeout. Given a large job with git LFS objects that change infrequently, it might take 5 or 6 hours for the original download. Being able to specify no timeout would be a good thing here. Is there some value (say, maybe -1) that means don't time out? Once the initial LFS stuff is checked out, timeouts are again reasonable.

          Mark Waite added a comment -

          sc1478 there is no value which means "don't time out".

          I don't see much difference for your use case between a very large number and no timeout. If you set the timeout to 1000000 minutes, it seems unlikely that you'll reach the timeout before either interrupting the job, restarting Jenkins, or rebooting the computer.
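To put the 1000000-minute figure in perspective, the conversion can be checked with a small Java sketch. The class and method names here are hypothetical and exist only for this illustration.

```java
import java.time.Duration;

final class TimeoutScale {
    /** Converts a timeout given in minutes to whole days (rounded down). */
    static long minutesToDays(long minutes) {
        return Duration.ofMinutes(minutes).toDays();
    }
}
```

`TimeoutScale.minutesToDays(1_000_000)` evaluates to 694, so a timeout of that size is a little under two years.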

          Steve Cohen added a comment - - edited

          True enough Mark Waite.  However, see JENKINS-47616, wherein I request making this parameterizable, since with Git-LFS the initial checkout is much slower than subsequent ones.

          Parameterization is key, since otherwise you must keep messing around with configurations.

          Patrick B added a comment -

          Hey guys,

          In which file can I find this option? `org.jenkinsci.plugins.gitclient.Git.timeOut`

          I searched all the administrative areas and did not find it.
          -> Yes, I can add this to each project as additional configuration.

          But I want this set globally for all projects, so that people do not have to set it up every time.

          (On Windows, I checked config.xml, settings.xml, plugins-settings.xml and everything)

          Mark Waite added a comment -

          patrick_i if your users need to increase the timeout generally, then I think you may have missed an opportunity to help your users with faster clones. Refer to the Jenkins World 2017 15 minute talk on "git in the large" (slides).

          For example, the time to clone a large git repository can be significantly reduced with a reference repository, by using a narrow refspec, or with a shallow clone.

          Even with the best of techniques, there still may be times when you choose to attempt to adjust the global git client plugin timeout value. That requires a change of the command line parameters used to start Jenkins. There is no user interface support for global adjustment of the git client plugin timeout value.

          In general, the Java process which starts Jenkins needs the argument -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=12345
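For readers unfamiliar with `-D` arguments: a value passed that way becomes a JVM system property that code can read at run time. The sketch below shows that lookup with the property name quoted in this thread. It is an illustration, not the plugin's source; the class name is hypothetical, and the fallback of 10 minutes is an assumption based on the default a commenter reports above.

```java
final class GitTimeoutProperty {
    // Property name as quoted in this thread.
    static final String NAME = "org.jenkinsci.plugins.gitclient.Git.timeOut";

    // Assumed fallback for this sketch only.
    static final int ASSUMED_DEFAULT_MINUTES = 10;

    /** Returns the -D value if present and numeric, otherwise the fallback. */
    static int minutes() {
        return Integer.getInteger(NAME, ASSUMED_DEFAULT_MINUTES);
    }
}
```

System property names are case-sensitive, so a value set under `...Git.timeout` is not seen by a lookup of `...Git.timeOut`. Both spellings appear in this thread, which is worth checking if a setting seems to have no effect.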

          Refer to my docker_run.py script as one example of a way to pass that argument to the java command which starts Jenkins. If your Jenkins starts from an init script on Ubuntu or Debian, you may be able to adjust command line arguments from /etc/default/jenkins. If your Jenkins starts from an init script on Red Hat, CentOS, OpenSUSE, or SUSE, you may be able to adjust command line arguments in /etc/sysconfig/jenkins. If your Jenkins is a service on Windows, I believe there is a configuration file that can be changed to add arguments to the Java command line which starts Jenkins.

          Patrick B added a comment -

          @markewaite: Thank you very much.
          I understand that it is more professional to fix the underlying problem instead of using larger timeouts.
          So I will check your slides in the next weeks, but for the moment I added this argument.

          For any other Windows users:
          C:\Program Files (x86)\Jenkins\jenkins.xml

          There you scroll down to service and arguments. There you can add this:

           -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=60

          And then your log file shows # timeout=60 and everything is fine


            ndeloof Nicolas De Loof
            recampbell Ryan Campbell
            Votes: 6
            Watchers: 15