Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-2913

svn polling only working for a small subset of jobs in 1.277

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Critical Critical
    • subversion-plugin
    • None
    • Platform: All, OS: All

      After upgrading from Hudson 1.270 to 1.277 I have started to see a major problem
      with Subversion polling in our environment.

      It appears to be caused, or at least catalyzed, by this statement on line 852 of
      java/hudson/scm/SubversionSCM.java:

      Thread.sleep(60*1000); // debug

      Judging from svn and hudson changelogs this line was introduced to
      SubversionSCM.java somewhere around 1.273.

      I commented out this line, compiled my own SubversionSCM.class file, and patched
      it into my Hudson 1.277 environment. After applying this change, polling seems
      to be working fine again. There is a clear difference in the Subversion polling
      logs. The polling duration has reduced from > 1 minute to 1-2 seconds, which is
      where it was before with 1.270.

      Configuration in our environment which is worth mentioning:

      • concurrent svn polling limit of 12 threads in Hudson system config
      • using ssh-slaves plugin to manage about 20 remote nodes
      • about 90 jobs which are configured to poll svn on differing schedules: every
        24 hours, every 10 minutes, every 5 minutes, every 3 minutes, and every 1 minute
        depending on the job.

      I attached to the Hudson JVM using jconsole and monitored the threads used for
      svn polling. As I expected, a pool of 12 threads was created. But
      surprisingly, only 4 of these threads would poll svn. The same 4 threads were
      just repeatedly polling the same 4 jobs. The rest of the threads appeared to be
      waiting for locks held by these 4 threads:

      "pool-2-thread-12" Id=249 WAITING on
      java.util.concurrent.locks.ReentrantLock$NonfairSync@18b9d48 owned by
      "pool-2-thread-8" Id=243
      at sun.misc.Unsafe.park(Native Method)

      • waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@18b9d48

      I think I understand why the 4 active threads kept polling the same jobs: these
      were jobs configured to poll every minute and the Hudson polling Java method was
      sleeping for 60 seconds during every call. So every time the code would leave
      the pollChanges method, the same job would immediately require another polling
      attempt.

      I had other jobs (not the 4 lucky ones) polling hours apart when it should
      have been minutes. The other 8 threads were always waiting for locks held by
      the 4 active threads. Perhaps this has exposed a bigger problem with
      synchronization between polling threads. It makes no sense to me that only 4 of
      the 12 threads would be actively polling as I have a large number of projects
      polling all the time. Maybe the sleep statement in 1.270 was exposing the
      problem already documented for earlier builds in issue 2516
      (https://hudson.dev.java.net/issues/show_bug.cgi?id=2516) in my environment.
      Yesterday I added a comment to this issue with a thread dump from my environment.

      If the sleep statement was checked in accidentally (the "debug" comment makes me
      wonder) it should be removed for a future Hudson release.

      -Andrew

      Versions:
      Hudson: 1.277 (winstone container)
      JDK: 1.6.0_06

          [JENKINS-2913] svn polling only working for a small subset of jobs in 1.277

          Has anyone else seen this problem with Hudson 1.27x? svn polling never takes
          less than 60 seconds, and many jobs stuck when they should be polling? I was
          seeing it with 1.277 and I fixed it by removing one line of code (in bug
          description). I'm reluctant to upgrade to 1.279 because it does not look like
          this problem has been fixed.

          Andrew Goktepe added a comment - Has anyone else seen this problem with Hudson 1.27x? svn polling never takes less than 60 seconds, and many jobs stuck when they should be polling? I was seeing it with 1.277 and I fixed it by removing one line of code (in bug description). I'm reluctant to upgrade to 1.279 because it does not look like this problem has been fixed.

          Alan Harder added a comment -

          it wasn't mentioned in the changelog, but that line was removed for 1.279 (r14821)

          Alan Harder added a comment - it wasn't mentioned in the changelog, but that line was removed for 1.279 (r14821)

          Alan Harder added a comment -

          I added this to the changelog (debug line was in 1.273 thru 1.278)

          Alan Harder added a comment - I added this to the changelog (debug line was in 1.273 thru 1.278)

          Alan Harder added a comment -
              • Issue 2894 has been marked as a duplicate of this issue. ***

          Alan Harder added a comment - Issue 2894 has been marked as a duplicate of this issue. ***

            Unassigned Unassigned
            andrewgoktepe Andrew Goktepe
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Resolved: