JENKINS-4176

Reference counting error for RemotableSVNAuthenticationProviderImpl singleton

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • subversion-plugin
    • None
    • Platform: All, OS: All

      As reported in this thread on the dev list, we have been seeing some odd
      intermittent behavior since the Channel.unexport fix from issue 4045 was
      released in Hudson 1.317:
      http://www.nabble.com/Error-in-SubversionSCM-"Unable-to-call-getCredential"-td24799653.html

      java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 476
      at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:259)
      at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:246)
      at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:206)
      at hudson.remoting.UserRequest.perform(UserRequest.java:92)
      at hudson.remoting.UserRequest.perform(UserRequest.java:46)
      at hudson.remoting.Request$2.run(Request.java:236)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:619)

      The issue is related to the fact that RemotableSVNAuthenticationProviderImpl is
      a singleton that is exported multiple times during overlapping builds on the
      same node. Here is roughly what happens in a two-executor scenario with two
      SVN-based builds set up to trigger at the same time:

      1. Executor #1 calls Entry.addRef as part of the first call to FilePath.act in
      SubversionSCM (call A; reference count is now 1; proxy A is exported to slave)
      2. Executor #0 calls Entry.addRef as part of the first call to FilePath.act in
      SubversionSCM (call B; reference count is now 2; proxy B is exported to slave)
      3. Executor #0 calls Entry.release as part of the first call to FilePath.act in
      SubversionSCM (call B; reference count is now 1; proxy B is unexported, but
      still uncollected on slave)
      4. Executor #0 calls Entry.addRef as part of the second call to FilePath.act in
      SubversionSCM (call C; reference count is now 2; proxy C is exported to slave)
      5. Executor #1 calls Entry.release as part of the first call to FilePath.act in
      SubversionSCM (call A; reference count is now 1; proxy A is unexported, but
      still uncollected on slave)
      6. Executor #1 calls Entry.addRef as part of the second call to FilePath.act in
      SubversionSCM (call D; reference count is now 2; proxy D is exported to slave)
      7. GC is triggered on the slave, causing proxy A and proxy B to be collected and
      triggering UnexportCommand to be sent for each from RemoteInvocationHandler.finalize
      8. Channel reader calls Entry.release in response to the first UnexportCommand
      (reference count is now 1)
      9. Channel reader calls Entry.release in response to the second UnexportCommand
      (reference count is now 0; Entry is removed from the ExportTable, but still in
      the ExportLists for calls C and D)
      10. Slave Executor #0 invokes getCredential, causing an RPCRequest to be sent
      back to the master with the oid that was just removed, resulting in exception
      11. Slave Executor #1 invokes getCredential, causing an RPCRequest to be sent
      back to the master with the oid that was just removed, resulting in exception
      12. Executor #0 calls Entry.release as part of the second call to FilePath.act
      in SubversionSCM (call C; reference count is now -1)
      13. Executor #1 calls Entry.release as part of the second call to FilePath.act
      in SubversionSCM (call D; reference count is now -2)
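
      The thirteen steps above can be replayed against a toy model. This is only
      a sketch: RefCountRace, Entry and exportTable are hypothetical stand-ins
      written for this illustration, not the real hudson.remoting classes, and the
      two executors are flattened into one sequential trace in the order listed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the race; hypothetical names, not the real hudson.remoting classes. */
public class RefCountRace {
    /** Stand-in for the master's export table: oid -> entry. */
    private final Map<Integer, Entry> exportTable = new HashMap<>();

    /** One shared entry for the singleton, however many times it is exported. */
    final class Entry {
        final int oid;
        int refCount;

        Entry(int oid) { this.oid = oid; }

        void addRef() {
            refCount++;
            exportTable.put(oid, this);
        }

        /** Drops the entry from the table when the count reaches exactly zero. */
        void release() {
            if (--refCount == 0) {
                exportTable.remove(oid);
            }
        }
    }

    /**
     * Replays steps 1-13. Returns {count after step 9,
     * 1 if the oid is still exported at step 10 else 0, final count}.
     */
    static int[] replay() {
        RefCountRace master = new RefCountRace();
        Entry entry = master.new Entry(476);

        entry.addRef();   // step 1: call A exports the singleton
        entry.addRef();   // step 2: call B
        entry.release();  // step 3: call B finishes (explicit release)
        entry.addRef();   // step 4: call C
        entry.release();  // step 5: call A finishes (explicit release)
        entry.addRef();   // step 6: call D
        entry.release();  // step 8: unexport triggered by GC of proxy A
        entry.release();  // step 9: unexport triggered by GC of proxy B
        int afterGc = entry.refCount;
        // Steps 10-11: calls C and D are still running, but the oid lookup now fails.
        int stillExported = master.exportTable.containsKey(476) ? 1 : 0;
        entry.release();  // step 12: call C finishes
        entry.release();  // step 13: call D finishes
        return new int[] { afterGc, stillExported, entry.refCount };
    }

    public static void main(String[] args) {
        int[] r = replay();
        System.out.println("count after GC unexports: " + r[0]
                + ", still exported: " + (r[1] == 1)
                + ", final count: " + r[2]);
    }
}
```

      In this model proxies A and B are each released twice (once explicitly and
      once on collection), which is what drives the count to zero while calls C
      and D are still in flight.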

      The reason this is intermittent is that the GC that cleans up the proxies is not
      predictable. If it doesn't happen until after call C and call D complete, the
      UnexportCommands are simply ignored. I got lucky in being able to reproduce
      it: I noticed that it would happen for the first build right after I
      disconnected and reconnected the slave. I believe this let me reproduce it
      reliably because the slave VM starts off with a low default heap size, so the
      chances of it needing to GC immediately when the two builds hit it at the same
      time are much higher.

      Also, the reason this can't happen in a single executor setup is that the two
      FilePath.act calls are serialized and the Proxy for the auth provider has a
      different oid each time. It is only in the case of these closely overlapping
      calls with an intervening GC on the slave that you see the spurious ref-count
      decrement happen.

      I've successfully fixed this in our internal copy of Hudson by changing the
      RemotableSVNAuthenticationProviderImpl from a true singleton to a per-thread
      singleton tied to each executor thread. I have a patch for this and will provide
      it shortly.
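
      As a rough illustration of the per-thread singleton idea (this is not the
      attached patch), the sketch below keeps one provider instance per thread.
      PerThreadProvider and Provider are hypothetical names, and
      ThreadLocal.withInitial assumes Java 8 or later, which is newer than the JVMs
      in the traces above.

```java
/** Sketch of a per-thread singleton; hypothetical names, not the actual patch. */
public class PerThreadProvider {
    /** Stand-in for RemotableSVNAuthenticationProviderImpl. */
    static final class Provider { }

    /** Each thread lazily gets, and then keeps, its own Provider instance. */
    private static final ThreadLocal<Provider> PROVIDER =
            ThreadLocal.withInitial(Provider::new);

    static Provider get() {
        return PROVIDER.get();
    }

    /** True if this thread always sees the same instance and another thread sees a different one. */
    static boolean distinctAcrossThreads() {
        Provider mine = get();
        Provider[] other = new Provider[1];
        Thread t = new Thread(() -> other[0] = get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mine == get() && other[0] != null && mine != other[0];
    }

    public static void main(String[] args) {
        System.out.println("distinct per thread: " + distinctAcrossThreads());
    }
}
```

      Because each executor thread exports its own instance, two overlapping builds
      no longer share one export entry, so a late unexport for one build cannot
      decrement the count that the other build depends on.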

      Also, this issue is not going to be limited to
      RemotableSVNAuthenticationProviderImpl. Any place where a singleton is multiply
      exported from overlapping calls could see the same problem. The reason we didn't
      see this before I fixed issue 4045 was that the Channel.unexport(int) command
      used to be broken and didn't have an effect on the reference count. This meant
      that the UnexportCommands sent in step #7 above were effectively ignored by the
      master.

          mdillon added a comment -

          Created an attachment (id=822)
          Change remotableProvider from singleton to thread-local

          SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: mindless
          Path:
          trunk/hudson/plugins/subversion/src/main/java/hudson/scm/SubversionSCM.java
          http://fisheye4.cenqua.com/changelog/hudson/?cs=20561
          Log:
          [FIXED JENKINS-4176] patch from mdillon to fix reference counting error
          for RemotableSVNAuthenticationProviderImpl singleton in concurrent builds

          Kohsuke Kawaguchi added a comment -

          I'm still trying to follow the description, but it seems to me that we need the
          reference counting fixed properly, since exporting a singleton should be a valid
          pattern.

          SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/ExportTable.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/ImportedClassLoaderTable.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/RemoteInvocationHandler.java
          trunk/www/changelog.html
          http://fisheye4.cenqua.com/changelog/hudson/?cs=20609
          Log:
          [FIXED JENKINS-4176] In 1.320. Made RemoteInvocationHandler aware of explicit unexporting by the caller to handle the reference counting properly.
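
          One plausible reading of "aware of explicit unexporting by the caller" is
          that each proxy gives back its reference at most once, whether through the
          caller's explicit unexport or through collection. The sketch below shows
          only that idea; UnexportOnce and ProxyHandle are hypothetical names, and
          this is not the code from the commit above.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of "release at most once per proxy"; hypothetical, not the real remoting code. */
public class UnexportOnce {
    /** Stand-in for the master-side reference count of one exported object. */
    static final AtomicInteger refCount = new AtomicInteger();

    /** Stand-in for a proxy that remembers whether its reference was already given back. */
    static final class ProxyHandle {
        private final AtomicBoolean released = new AtomicBoolean();

        ProxyHandle() {
            refCount.incrementAndGet();
        }

        private void releaseOnce() {
            // Only the first caller wins; later calls are no-ops.
            if (released.compareAndSet(false, true)) {
                refCount.decrementAndGet();
            }
        }

        /** Explicit unexport when the remote call returns. */
        void unexportByCaller() {
            releaseOnce();
        }

        /** What a collection-time cleanup hook would call. */
        void onCollected() {
            releaseOnce();
        }
    }

    /** Proxy A is released explicitly and then collected; proxy C is still in use. */
    static int demo() {
        refCount.set(0);
        ProxyHandle a = new ProxyHandle();
        ProxyHandle c = new ProxyHandle();
        a.unexportByCaller();
        a.onCollected();      // second release of the same proxy is ignored
        return refCount.get(); // c still holds its reference
    }

    public static void main(String[] args) {
        System.out.println("count while C is in use: " + demo());
    }
}
```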

          SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/hudson/plugins/subversion/src/main/java/hudson/scm/SubversionSCM.java
          http://fisheye4.cenqua.com/changelog/hudson/?cs=20610
          Log:
          JENKINS-4176 rolling back rev.20561

          mdillon added a comment -

          "I'm still trying to follow the description, but it seems to me that we need the
          reference counting fixed properly, since exporting a singleton should be a valid
          pattern."

          Yeah, well, I should have just scanned my whiteboard. It was much easier to
          understand in pictures.

          maartenvandewaarsenburg added a comment -

          The following error occurs intermittently in 1.323.

          SCM change trigger started this job
          Building remotely on win32_oslo
          Updating svn+ssh://svn.company.local/home/svn/repository/projects/example/trunk
          FATAL: Unable to call getCredential. Invalid object ID 21217
          java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 21217
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:268)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:255)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215)
          at hudson.remoting.UserRequest.perform(UserRequest.java:104)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:236)
          at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)

          lance_t added a comment -

          We get this intermittently in 1.342 as well:

          FATAL: Unable to call getCredential. Invalid object ID 1393
          java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 1393
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:268)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:255)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215)
          at hudson.remoting.UserRequest.perform(UserRequest.java:104)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:270)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:619)

          lance_t added a comment -

          Got this again today:

          java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 7899
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:268)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:255)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215)
          at hudson.remoting.UserRequest.perform(UserRequest.java:104)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:270)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:619)

          Any idea whether this can be looked at? It is fairly annoying because it is considered a failed build, so it triggers emails, build lights and things like that.

          msacarny added a comment - - edited

          We just upgraded to 1.379 and for the first time I am getting errors like this (attached).
          Is there a workaround? Do I need to upgrade the slaves?

          FATAL: Unable to call getCredential. Invalid object ID 3345
          java.lang.IllegalStateException: Unable to call getCredential. Invalid object ID 3345
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:268)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:255)
          at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215)
          at hudson.remoting.UserRequest.perform(UserRequest.java:114)
          at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          at hudson.remoting.Request$2.run(Request.java:270)
          at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)

            Unassigned Unassigned
            md5 Mike Dillon
            Votes: 7
            Watchers: 2