• Icon: Patch Patch
    • Resolution: Fixed
    • Icon: Critical Critical
    • core
    • None
    • Platform: All, OS: All

      For our builds, using a slow CVS server (operated by CollabNet) with a large
      repository with many historical revisions, creating the CVS changelog is very
      slow. In fact, it is slower than any other part of the build. Hudson calls cvs
      log over the entire active portion of the repository, meaning many thousands of
      files many of which have years worth of changes. This can take half an hour at
      the minimum. It also adds a lot of load to the CVS server, making VC operations
      slower for everyone.

      The attached patch tries to solve this issue. It only runs log on those
      CVS-controlled files in the working checkout which have a newer timestamp than
      the previous build, on the assumption that any changes which have been updated
      since then will have caused those files to be touched. This makes creation of
      the CVS changelog much faster - and what time is spent on changelog calculation
      is mostly in the Hudson server (scanning dirs for recently modified files)
      rather than on the CVS server. That is to be preferred since CVS servers cannot
      easily be replicated for load balancing. For our builds, the patch seems to
      reduce typical build durations from about an hour to about 20 minutes.

      The main drawback is that deleted files will not be noted in the changelog. This
      is probably a reasonable tradeoff on the assumption that deletions are less
      common than additions (in a project under development); much less common than
      modifications; and less likely than either to cause a broken build or a major
      functional change which would need to be noted in the changelog. I have looked
      at ways in which this restriction could be removed without compromising
      performance but so far have not found any good solution; see comments in patch
      for details.

      To be conservative, the code falls back to the current behavior in case it
      cannot find any modified files.

      This patch probably needs more field testing to see if it holds up in different
      people's setups. It seems to be working for us.

      You could argue that the patch should really be applied to Ant's <cvschangelog>
      task, at least as an option.

          [JENKINS-71] Speed up CVS changelog generation

          Jesse Glick added a comment -

          Created an attachment (id=14)
          Proposed patch

          Jesse Glick added a comment - Created an attachment (id=14) Proposed patch

          Thanks. This is really a great idea. I also hate the changelog computation
          taking so long. But I'm bit concerned about being unable to detect deleted files.

          As you pointed out, I think it's better (and likely faster) if we parse the
          output from cvs update command. I already needed to create my own copy of Ant
          ChangeLogParser to patch a problem, so I'm fine with doing something similar
          with ChangeLogTask.

          I'd like to see if we can do this relatively painlessly. If not, let's use your
          patch.

          Kohsuke Kawaguchi added a comment - Thanks. This is really a great idea. I also hate the changelog computation taking so long. But I'm bit concerned about being unable to detect deleted files. As you pointed out, I think it's better (and likely faster) if we parse the output from cvs update command. I already needed to create my own copy of Ant ChangeLogParser to patch a problem, so I'm fine with doing something similar with ChangeLogTask. I'd like to see if we can do this relatively painlessly. If not, let's use your patch.

          I implemented the code to parse cvs update output and use that as the basis of
          the changelog work.

          Changes are committed. If you can try out the snapshot build, that would be great.

          Kohsuke Kawaguchi added a comment - I implemented the code to parse cvs update output and use that as the basis of the changelog work. Changes are committed. If you can try out the snapshot build, that would be great.

          Jesse Glick added a comment -

          I will try your version and let you know if anything looks wrong. I did not
          realize you had your own version of the Ant task anyway!

          Jesse Glick added a comment - I will try your version and let you know if anything looks wrong. I did not realize you had your own version of the Ant task anyway!

          Jesse Glick added a comment -

          Fix causes most (but, curiously, not all) builds to completely fail in the
          changelog generation step.

          java.lang.NoSuchMethodError:
          org.apache.tools.ant.taskdefs.cvslib.ChangeLogTask.setFile(Ljava/util/List;)V

          is thrown from the body of CVSSCM.calcChangeLog. Presumably WEB-INF/lib/ant.jar
          is being loaded before WEB-INF/lib/hudson.jar sometimes.

          It is unwise to retain the same package name when modifying the Ant task. I
          suggest moving it to e.g.

          package hudson.org.apache.tools.ant.taskdefs.cvslib;

          Jesse Glick added a comment - Fix causes most (but, curiously, not all) builds to completely fail in the changelog generation step. java.lang.NoSuchMethodError: org.apache.tools.ant.taskdefs.cvslib.ChangeLogTask.setFile(Ljava/util/List;)V is thrown from the body of CVSSCM.calcChangeLog. Presumably WEB-INF/lib/ant.jar is being loaded before WEB-INF/lib/hudson.jar sometimes. It is unwise to retain the same package name when modifying the Ant task. I suggest moving it to e.g. package hudson.org.apache.tools.ant.taskdefs.cvslib;

          Jesse Glick added a comment -

          Unfortunately due to package-private access it is then also necessary to copy
          the remaining classes from this package into the Hudson source tree. I am still
          testing this change.

          Jesse Glick added a comment - Unfortunately due to package-private access it is then also necessary to copy the remaining classes from this package into the Hudson source tree. I am still testing this change.

          Jesse Glick added a comment -

          Appears to be working with package rename and copy of remaining classes from
          original package.

          BTW Ant trunk classes are no longer compatible with your (old) version of the
          task, so use

          svn export
          https://svn.apache.org/repos/asf/ant/core/tags/ANT_162/src/main/org/apache/tools/ant/taskdefs/cvslib
          cp -iv * ..../hudson/src/hudson/org/apache/tools/ant/taskdefs/cvslib

          and then manually s/package org/package hudson.org/g.

          Jesse Glick added a comment - Appears to be working with package rename and copy of remaining classes from original package. BTW Ant trunk classes are no longer compatible with your (old) version of the task, so use svn export https://svn.apache.org/repos/asf/ant/core/tags/ANT_162/src/main/org/apache/tools/ant/taskdefs/cvslib cp -iv * ..../hudson/src/hudson/org/apache/tools/ant/taskdefs/cvslib and then manually s/package org/package hudson.org/g.

          Thanks. You are right. Changed as suggested.

          Kohsuke Kawaguchi added a comment - Thanks. You are right. Changed as suggested.

            Unassigned Unassigned
            jglick Jesse Glick
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: