  1. Jenkins
  2. JENKINS-4794

Mercurial cache job


    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major
    • mercurial-plugin
    • None
    • Platform: All, OS: All

      For Hudson installations that have a lot of jobs all running off one (or a small
      number of) Mercurial repositories, it is inefficient to have them all pull over
      the network, as they will be repeatedly pulling the exact same changesets. The
      situation is even worse when you consider that polling effectively pulls in
      changesets as well (just discarding them after logging their metadata).

      Suggest a new special job type, Mercurial Cache, which would have attributes:

      1. List of repository URLs.

      2. Optional schedule, like a project.
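
      A minimal sketch of how those two attributes might be modeled. The class
      name, field names, and the SHA-1 directory naming scheme are assumptions
      made for illustration; none of them come from the plugin.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MercurialCacheJob:
    """Hypothetical model of the proposed job type (names are assumptions)."""

    # Attribute 1: the repositories to keep cached.
    repository_urls: List[str] = field(default_factory=list)
    # Attribute 2: cron-style schedule, as for a project; None means the
    # job only runs when triggered manually.
    schedule: Optional[str] = None

    def cache_dir_name(self, url: str) -> str:
        """Map a configured URL to a stable cache directory name.

        Assumed scheme: hex SHA-1 of the URL, so that a project using the
        same repository location finds the same cache.
        """
        if url not in self.repository_urls:
            raise KeyError(url)
        return hashlib.sha1(url.encode("utf-8")).hexdigest()


job = MercurialCacheJob(
    repository_urls=["https://hg.example.org/main"],  # placeholder URL
    schedule="*/15 * * * *",
)
```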

      There is a corresponding workspace on the master and possibly on some or all
      slaves. Whenever the scheduler fires or the job is otherwise run (e.g.
      manually), the following actions will be taken:

      1. For each repo, if there is a matching cache in the master's workspace, 'hg in
      --bundle incoming.hg && hg pull incoming.hg' to pull all changesets into it.

      2. For each repo and for each slave, if the slave's workspace also contains that
      repo, send incoming.hg to the slave (over the usual channel) and have the slave
      'hg unbundle' it.
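
      The two steps above could be planned roughly as follows. This sketch only
      builds the command lines and executes nothing; the function name and the
      data shapes are assumptions. Transferring incoming.hg over the channel is
      not modeled. Note that 'hg incoming' exits with status 1 when there is
      nothing new, in which case the remaining steps for that repo would be
      skipped.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str, List[str]]  # (node, repo URL, argv)


def plan_cache_update(
    repo_urls: List[str],
    master_caches: Set[str],
    slave_caches: Dict[str, Set[str]],
) -> List[Step]:
    """Return the commands for one run of the cache job (hypothetical helper).

    Each argv is meant to be run inside that node's cache directory for
    the given repository.
    """
    steps: List[Step] = []
    for url in repo_urls:
        if url not in master_caches:
            continue  # no matching cache on the master: nothing to update
        # Step 1: capture incoming changesets in a bundle, then pull from it.
        steps.append(("master", url, ["hg", "incoming", "--bundle", "incoming.hg"]))
        steps.append(("master", url, ["hg", "pull", "incoming.hg"]))
        # Step 2: each slave that already caches this repo applies the
        # same bundle after receiving it from the master.
        for slave in sorted(slave_caches):
            if url in slave_caches[slave]:
                steps.append((slave, url, ["hg", "unbundle", "incoming.hg"]))
    return steps
```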

      Whenever a project using Mercurial SCM with a matching repository location is
      run or does polling:

      1. If on the master, quietly swap in the local cache repo location for all Hg
      operations that would normally use the remote repo URL (I think this is always
      'hg incoming' in some variant). Note that this means sharing hardlinks in most
      cases. If the cache repo does not yet exist, 'hg clone -U' it and then proceed.

      2. If on a slave, swap in the local (slave) cache repo location. If it does not
      yet exist on the slave, run 'hg bundle --all' on the master, send the bundle to
      the slave over the channel, and run 'hg init && hg unbundle ...' there to create
      a clone. If it does not yet exist on the master either, first clone it as in #1.
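
      A sketch of the substitution logic in the two cases above. It returns the
      local location to use in place of the remote URL, plus any commands
      needed to create missing caches first. The function name, parameters, and
      the bundle file name full.hg are assumptions; nothing is executed, and
      the transfer over the channel is not modeled.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (node the command runs on, argv)


def resolve_source(
    node: str,
    url: str,
    cache_path: str,
    master_has_cache: bool,
    node_has_cache: bool,
) -> Tuple[str, List[Step]]:
    """Return (location to use instead of url, setup steps) for one build or poll."""
    steps: List[Step] = []
    if not master_has_cache:
        # Case 1: create the master cache without a working copy.
        steps.append(("master", ["hg", "clone", "-U", url, cache_path]))
    if node != "master" and not node_has_cache:
        # Case 2: seed the slave cache from a full bundle made on the master.
        # "full.hg" is a hypothetical file name chosen for this sketch.
        steps.append(("master", ["hg", "bundle", "--all", "full.hg"]))
        steps.append((node, ["hg", "init", cache_path]))
        steps.append((node, ["hg", "unbundle", "full.hg"]))
    return cache_path, steps
```

      With both caches already present the step list is empty and the project
      simply uses cache_path for its Hg operations.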

      There needs to be some synchronization so that master and slave caches remain in
      lockstep.

      The caches need no configuration for named branches; only complete repositories
      are cached. A project that uses a branch will still pull only that branch from
      the cache. The cache does not keep a checkout ("working copy"), so no
      configuration is needed for that either.

      One possible side benefit of this setup is that the slave does not perform any
      network operations except over its channel to the master. Provided that the
      project build does not perform any network operations, you could then have a
      slave with no internet connection: the master does all pulls from the remote
      repository.

            jglick Jesse Glick
