JENKINS-4176

Reference counting error for RemotableSVNAuthenticationProviderImpl singleton

Details

    • Bug
    • Resolution: Done
    • Major
    • subversion-plugin
    • None
    • Platform: All, OS: All

    Description

      As reported in this thread on the dev list, we have been seeing some odd
      intermittent behavior since the Channel.unexport fix from issue 4045 was
      released in Hudson 1.317:
      http://www.nabble.com/Error-in-SubversionSCM-"Unable-to-call-getCredential"-td24799653.html

      java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 476
        at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:259)
        at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:246)
        at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:206)
        at hudson.remoting.UserRequest.perform(UserRequest.java:92)
        at hudson.remoting.UserRequest.perform(UserRequest.java:46)
        at hudson.remoting.Request$2.run(Request.java:236)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

      The issue is related to the fact that RemotableSVNAuthenticationProviderImpl is
      a singleton that is exported multiple times during overlapping builds on the
      same node. Here is roughly what happens in a two-executor scenario with two
      SVN-based builds set up to trigger at the same time:

      1. Executor #1 calls Entry.addRef as part of the first call to FilePath.act in
      SubversionSCM (call A; reference count is now 1; proxy A is exported to slave)
      2. Executor #0 calls Entry.addRef as part of the first call to FilePath.act in
      SubversionSCM (call B; reference count is now 2; proxy B is exported to slave)
      3. Executor #0 calls Entry.release as part of the first call to FilePath.act in
      SubversionSCM (call B; reference count is now 1; proxy B is unexported, but
      still uncollected on slave)
      4. Executor #0 calls Entry.addRef as part of the second call to FilePath.act in
      SubversionSCM (call C; reference count is now 2; proxy C is exported to slave)
      5. Executor #1 calls Entry.release as part of the first call to FilePath.act in
      SubversionSCM (call A; reference count is now 1; proxy A is unexported, but
      still uncollected on slave)
      6. Executor #1 calls Entry.addRef as part of the second call to FilePath.act in
      SubversionSCM (call D; reference count is now 2; proxy D is exported to slave)
      7. GC is triggered on the slave, causing proxy A and proxy B to be collected and
      triggering UnexportCommand to be sent for each from RemoteInvocationHandler.finalize
      8. Channel reader calls Entry.release in response to the first UnexportCommand
      (reference count is now 1)
      9. Channel reader calls Entry.release in response to the second UnexportCommand
      (reference count is now 0; Entry is removed from the ExportTable, but still in
      the ExportLists for calls C and D)
      10. Slave Executor #0 invokes getCredential, causing an RPCRequest to be sent
      back to the master with the oid that was just removed, resulting in exception
      11. Slave Executor #1 invokes getCredential, causing an RPCRequest to be sent
      back to the master with the oid that was just removed, resulting in exception
      12. Executor #0 calls Entry.release as part of the second call to FilePath.act
      in SubversionSCM (call C; reference count is now -1)
      13. Executor #1 calls Entry.release as part of the second call to FilePath.act
      in SubversionSCM (call D; reference count is now -2)
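
      The sequence above can be replayed with a toy model. This is only a
      sketch: RefCountRaceDemo, its Entry class, and the unexport helper are
      hypothetical stand-ins, not the real hudson.remoting classes. It assumes
      one shared entry per exported object, a release that removes the entry
      only when the count reaches exactly zero, and an unexport-by-oid that
      ignores unknown ids.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy replay of the thirteen steps; every name here is hypothetical, not Hudson's code. */
final class RefCountRaceDemo {

    /** One shared entry per exported object, as with a singleton exported many times. */
    static final class Entry {
        final int oid;
        int refCount;
        private final Map<Integer, Entry> table;

        Entry(int oid, Map<Integer, Entry> table) {
            this.oid = oid;
            this.table = table;
        }

        void addRef() {
            refCount++;
        }

        /** Removes the entry from the table only when the count reaches exactly zero. */
        void release() {
            if (--refCount == 0) {
                table.remove(oid);
            }
        }
    }

    /** Outcome of the replay: whether the oid lookup failed, and the final count. */
    static final class Result {
        final boolean lookupFailed;
        final int finalRefCount;

        Result(boolean lookupFailed, int finalRefCount) {
            this.lookupFailed = lookupFailed;
            this.finalRefCount = finalRefCount;
        }
    }

    /** Models an unexport command arriving by oid; unknown oids are ignored. */
    static void unexport(Map<Integer, Entry> table, int oid) {
        Entry e = table.get(oid);
        if (e != null) {
            e.release();
        }
    }

    static Result replay() {
        Map<Integer, Entry> table = new HashMap<>();
        Entry singleton = new Entry(476, table);
        table.put(singleton.oid, singleton);

        singleton.addRef();   // step 1: call A starts, count 1
        singleton.addRef();   // step 2: call B starts, count 2
        singleton.release();  // step 3: call B ends, count 1
        singleton.addRef();   // step 4: call C starts, count 2
        singleton.release();  // step 5: call A ends, count 1
        singleton.addRef();   // step 6: call D starts, count 2

        // Steps 7-9: GC on the slave collects proxies A and B, one unexport each.
        unexport(table, 476); // count 1
        unexport(table, 476); // count 0, entry removed while C and D are in flight

        // Steps 10-11: a callback for this oid can no longer be resolved.
        boolean lookupFailed = !table.containsKey(476);

        singleton.release();  // step 12: call C ends, count -1
        singleton.release();  // step 13: call D ends, count -2
        return new Result(lookupFailed, singleton.refCount);
    }

    public static void main(String[] args) {
        Result r = replay();
        System.out.println("lookup failed: " + r.lookupFailed
                + ", final count: " + r.finalRefCount);
    }
}
```

      In this model each of the first two exports is decremented twice (once
      when its call ends, once when its proxy is collected), and the second
      pair of decrements consumes the references that belong to calls C and D.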

      The reason this is intermittent is that the GC that cleans up the proxies is not
      predictable. If it doesn't happen until after call C and call D complete, the
      UnexportCommands are simply ignored. I was only able to reproduce it because I
      noticed that it would happen for the first build right after I disconnected and
      reconnected the slave. I believe this made it reliably reproducible because the
      slave VM starts off with a low default heap size, so the chances of it needing
      to GC immediately when the two builds hit it at the same time are much higher.

      Also, the reason this can't happen in a single executor setup is that the two
      FilePath.act calls are serialized and the Proxy for the auth provider has a
      different oid each time. It is only in the case of these closely overlapping
      calls with an intervening GC on the slave that you see the spurious ref-count
      decrement happen.
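
      The serialized case can be sketched the same way. The class below is a
      hypothetical toy, not Hudson's ExportTable; in particular, allocating a
      fresh oid whenever an object is exported again after its entry was
      removed is an assumption made to match the behavior described above.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of serialized exports; names are hypothetical, not Hudson's API. */
final class SerializedExportDemo {
    // target -> {oid, refCount}, keyed by identity because the target is a singleton
    private final Map<Object, int[]> live = new IdentityHashMap<>();
    private final Map<Integer, Object> byOid = new HashMap<>();
    private int nextOid = 1;

    /** Exports the target, reusing its entry if one is still live. */
    int export(Object target) {
        int[] e = live.get(target);
        if (e == null) {
            e = new int[] {nextOid++, 0};
            live.put(target, e);
            byOid.put(e[0], target);
        }
        e[1]++;
        return e[0];
    }

    /** Decrements by oid; returns false when the oid is unknown and the call is ignored. */
    boolean release(int oid) {
        Object target = byOid.get(oid);
        if (target == null) {
            return false;
        }
        int[] e = live.get(target);
        if (--e[1] == 0) {
            live.remove(target);
            byOid.remove(oid);
        }
        return true;
    }

    boolean isExported(int oid) {
        return byOid.containsKey(oid);
    }

    /** Single-executor scenario: two calls back to back, then a late unexport. */
    static boolean lateUnexportIsHarmless() {
        SerializedExportDemo table = new SerializedExportDemo();
        Object singleton = new Object();
        int first = table.export(singleton);   // first FilePath.act call
        table.release(first);                  // call ends, entry removed
        int second = table.export(singleton);  // second call gets a new oid
        boolean ignored = !table.release(first); // late unexport for the old oid
        return first != second && ignored && table.isExported(second);
    }
}
```

      Because the stale oid no longer maps to anything, the late unexport
      cannot touch the entry that the second call depends on.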

      I've successfully fixed this in our internal copy of Hudson by changing the
      RemotableSVNAuthenticationProviderImpl from a true singleton to a per-thread
      singleton tied to each executor thread. I have a patch for this and will provide
      it shortly.
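
      The patch itself is not included here, so the following is only a
      minimal sketch of what a per-thread singleton can look like, using
      java.lang.ThreadLocal. PerThreadProvider is a hypothetical name standing
      in for the provider class; the real change may differ.

```java
/** Hypothetical per-thread singleton; a sketch, not the actual patch. */
final class PerThreadProvider {
    // Each thread that calls get() lazily receives its own instance.
    private static final ThreadLocal<PerThreadProvider> INSTANCE =
            ThreadLocal.withInitial(PerThreadProvider::new);

    private PerThreadProvider() {
    }

    /** Returns the instance bound to the calling thread. */
    static PerThreadProvider get() {
        return INSTANCE.get();
    }

    /** Helper for demonstration: fetches the instance seen by a fresh thread. */
    static PerThreadProvider getOnNewThread() {
        PerThreadProvider[] holder = new PerThreadProvider[1];
        Thread t = new Thread(() -> holder[0] = get());
        t.start();
        try {
            t.join(); // join makes the write to holder[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return holder[0];
    }
}
```

      With one instance per executor thread, overlapping builds no longer
      share a single exported object, so an extra decrement triggered by one
      executor's collected proxy cannot remove the entry another executor is
      still using.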

      Also, this issue is not going to be limited to
      RemotableSVNAuthenticationProviderImpl. Any place where a singleton is multiply
      exported from overlapping calls could see the same problem. The reason we didn't
      see this before I fixed issue 4045 was that the Channel.unexport(int) command
      used to be broken and didn't have an effect on the reference count. This meant
      that the UnexportCommands sent in step #7 above were effectively ignored by the
      master.

          People

            Assignee: Unassigned
            Reporter: md5 (Mike Dillon)
            Votes: 7
            Watchers: 2
