• Bug
    • Resolution: Fixed
    • Major
    • git-plugin
    • None

      We are seeing hung git processes which seem to be left over from when Jenkins uses Git to poll for changes. This is likely because there was an I/O problem (disk or network) when the polling was attempted. Instead of hanging forever, the fetch should eventually time out.

          [JENKINS-11286] Git plugin does not timeout

          Ryan Campbell created issue -

          Ryan Campbell added a comment -

          Perhaps the easiest approach is to set a global timeout for all Git operations (say 30 minutes) which could be overridden by a system property. This can be enforced in GitAPI.launchCommandIn() by using ProcStarter.joinWithTimeout() instead of just join().

          Instead of a global default for all commands, you could perhaps create an overloaded version of launchCommandIn which accepts a timeout, defaulting to Integer.MAX_VALUE. It looks like this method is already overloaded, so this may be messy.

          Or perhaps it is best to just perform the fetch in a Future/Task thread which could be joined with a timeout. It seems like polling is the one place where a timeout is required, since a build can/should time out through other mechanisms.
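The Future-based idea above could look roughly like the following. This is only a sketch using plain `java.util.concurrent`, not the git plugin's code: the class name `PollingTimeoutSketch` and the helper `callWithTimeout` are made up for illustration, and the "fetch" is simulated by a sleeping task. Note that cancelling the Future only interrupts the worker thread; a real implementation would presumably also have to kill the child git process.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, named for this sketch only.
public class PollingTimeoutSketch {

    /**
     * Runs the task on a worker thread and waits at most the given timeout.
     * Returns the task's result, or throws TimeoutException if it took too long.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupt the worker thread. This does not by itself kill a child process.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a fetch that hangs: sleeps far longer than the timeout.
        Callable<String> hungFetch = () -> {
            Thread.sleep(60_000);
            return "fetched";
        };
        try {
            callWithTimeout(hungFetch, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("polling timed out");
        }
    }
}
```

With this shape the timeout applies only where the caller asks for it (for example, polling), which matches the point that builds can time out through other mechanisms.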


          Karol Depka Pradzinski added a comment - - edited

          I've started working on this bug. Some questions:
          1. But what if the git process is really doing something that takes longer than the timeout, e.g. pulling some big changes?
          1.1. Maybe another approach would be to limit the number of running/hanging git processes?
          2. Would my changes be useful also if this setting is at first not configurable?

          Karol Depka Pradzinski made changes -
          Assignee Original: Andrew Bayer [ abayer ] New: Karol Depka Pradzinski [ karol_depka ]
          Karol Depka Pradzinski made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

          Karol Depka Pradzinski added a comment -

          I don't see hudson.Launcher.ProcStarter.joinWithTimeout() ... Is this supposed to be written or is this already present in another branch?

          Nicolas De Loof added a comment -

          ProcStarter.start() returns a Proc, and Proc has this joinWithTimeout() method.

          Ryan Campbell added a comment -

          Answers:

          1. There should be some global timeout which you can tune with System properties. Default to 1 hour or so?
          1.1 While this may be a good idea, it still wouldn't solve polling which takes forever, it will just stop future polling.
          2. Yes, I think so. You can set it high enough that it won't bother people; it will just handle strange conditions.
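A global timeout tunable through a system property, as suggested in answer 1, might be sketched like this. It uses the JDK's `Process.waitFor(timeout, unit)` as a stand-in for Jenkins' `Proc.joinWithTimeout()`, so it is not the plugin's actual code. The property name `git.timeout.minutes`, the class name, and both method names are assumptions made for this example; the 60-minute default follows the "1 hour or so" suggestion.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical class, named for this sketch only.
public class GitCommandTimeoutSketch {

    // Hypothetical property name; not taken from the git plugin.
    public static final String TIMEOUT_PROPERTY = "git.timeout.minutes";
    public static final int DEFAULT_TIMEOUT_MINUTES = 60;

    /** Reads the timeout from the system property, falling back to the default. */
    public static int timeoutMinutes() {
        // Integer.getInteger returns the default when the property is unset or not a number.
        int configured = Integer.getInteger(TIMEOUT_PROPERTY, DEFAULT_TIMEOUT_MINUTES);
        return configured > 0 ? configured : DEFAULT_TIMEOUT_MINUTES;
    }

    /** Runs a command and kills it if it has not finished within the timeout. */
    public static int runWithTimeout(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException, TimeoutException {
        // inheritIO avoids blocking on a full output pipe that nobody reads.
        Process process = new ProcessBuilder(command).inheritIO().start();
        if (!process.waitFor(timeout, unit)) {
            process.destroyForcibly();
            throw new TimeoutException("Timed out after " + timeout + " " + unit + ": " + command);
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // Assumes a git executable is on the PATH.
        int exit = runWithTimeout(List.of("git", "--version"), timeoutMinutes(), TimeUnit.MINUTES);
        System.out.println("git exited with " + exit);
    }
}
```

Under these assumptions the timeout could be lowered at startup with something like `java -Dgit.timeout.minutes=10 ...`, while the default stays high enough not to interrupt legitimately slow fetches.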


          Tully Foote added a comment -

          Q3. At some point it does need to time out. We've found hung polling threads which were 2 weeks old, dating from when our git repo had some downtime.

          And until this is released, does anyone have a suggested way to clear these hung polling threads? I ended up rebooting the slaves which was obviously overkill, but did clear the jobs.


          c Kirschner added a comment -

          1 year and 4 months later this is still an issue. In our Jenkins, git processes regularly (2 times or more a week) stall / hang forever. It's quite annoying when these fill all SCM slots and our entire build server does nothing.
          To get it running again I have to kill all git and git-remote-https processes.
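As a stopgap for the manual cleanup described above, the hung processes could be located with the JDK's `ProcessHandle` API (Java 9 or later). This is a rough sketch, not something from the plugin: the class and method names are invented here, it assumes Unix-style executable paths, and the one-hour age threshold is an arbitrary choice. It only lists candidates; the kill call is left commented out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class, named for this sketch only.
public class StaleGitProcessSketch {

    /** True if the executable path ends in "git" or "git-remote-https" (Unix-style paths assumed). */
    public static boolean isGitCommand(String command) {
        String name = command.substring(command.lastIndexOf('/') + 1);
        return name.equals("git") || name.equals("git-remote-https");
    }

    /** True if the process has been running for longer than maxAge. */
    public static boolean isStale(Instant start, Instant now, Duration maxAge) {
        return Duration.between(start, now).compareTo(maxAge) > 0;
    }

    /** Lists visible git processes older than maxAge; processes with unknown info are skipped. */
    public static List<ProcessHandle> findStaleGitProcesses(Duration maxAge) {
        Instant now = Instant.now();
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().command().map(StaleGitProcessSketch::isGitCommand).orElse(false))
                .filter(p -> p.info().startInstant().map(s -> isStale(s, now, maxAge)).orElse(false))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (ProcessHandle p : findStaleGitProcesses(Duration.ofHours(1))) {
            System.out.println("stale git process: pid " + p.pid());
            // p.destroyForcibly(); // uncomment to actually kill the process
        }
    }
}
```

This would need to run on each machine where the hung processes live, under a user allowed to see and signal them.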


            ndeloof Nicolas De Loof
            recampbell Ryan Campbell
            Votes: 6
            Watchers: 15