JENKINS-6604

Possible race condition in RemoteClassLoader renders slave unusable

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Component/s: core
    • Environment: CentOS 5.3, Sun JDK 1.6.0_19 64-bit

      We restart Hudson every Sunday afternoon to work around memory leaks, and we have a couple of nightly builds that kick in at midnight. As a result, Hudson is fresh when multiple builds start at once; that is, its remote class loader has not had a chance to load any classes yet. We have 3 executors defined. I suppose that the SCM poll action, which is sent by many build procedures, causes multiple simultaneous requests to load classes for the SCM (we use a slightly hacked version of the CVS SCM). We are getting the following exception:
      java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/ModelObject"

      I looked around on the web and found this (http://jira.codehaus.org/browse/JETTY-418), which led me to believe that a lack of synchronization while loading classes in the remote class loader is the cause.
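
      To illustrate the suspected failure mode, here is a minimal sketch. It is not the RemoteClassLoader code and not the fix that was applied; all class and method names in it (DuplicateDefinitionSketch, UnguardedLoader, GuardedLoader, the stand-in class "P") are hypothetical. It assumes a JDK 7 or later, since getClassLoadingLock does not exist on the JDK 1.6 we run. It shows that defining the same class name twice in one loader raises a LinkageError, and that checking findLoadedClass under a lock before calling defineClass avoids the second definition.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DuplicateDefinitionSketch {

    /** Hand-assembles the class file of an empty class named "P" (a hypothetical stand-in). */
    static byte[] emptyClassBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(49);                                  // major version (Java 5)
            out.writeShort(5);                                   // constant pool count (entries + 1)
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF("P");                 // #2 Utf8 "P"
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class = #1
            out.writeShort(3);                                   // super_class = #3
            out.writeShort(0);                                   // no interfaces
            out.writeShort(0);                                   // no fields
            out.writeShort(0);                                   // no methods
            out.writeShort(0);                                   // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Defines whatever it is handed, without checking for an earlier definition. */
    static class UnguardedLoader extends ClassLoader {
        Class<?> define(String name, byte[] b) {
            return defineClass(name, b, 0, b.length);
        }
    }

    /** Checks for an earlier definition under the class loading lock before defining. */
    static class GuardedLoader extends ClassLoader {
        Class<?> define(String name, byte[] b) {
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                return loaded != null ? loaded : defineClass(name, b, 0, b.length);
            }
        }
    }

    /** Returns true if the second definition of "P" is rejected with a LinkageError. */
    static boolean unguardedFailsOnSecondDefine() {
        UnguardedLoader loader = new UnguardedLoader();
        byte[] b = emptyClassBytes();
        loader.define("P", b);
        try {
            loader.define("P", b);
            return false;
        } catch (LinkageError e) {
            return true;
        }
    }

    /** Returns true if both calls hand back the very same Class object. */
    static boolean guardedReturnsSameClass() {
        GuardedLoader loader = new GuardedLoader();
        byte[] b = emptyClassBytes();
        return loader.define("P", b) == loader.define("P", b);
    }

    public static void main(String[] args) {
        System.out.println("unguarded second define fails: " + unguardedFailsOnSecondDefine());
        System.out.println("guarded define is idempotent: " + guardedReturnsSameClass());
    }
}
```

      The sketch triggers the duplicate definition sequentially so that it is reproducible; in our setup the second definition would presumably come from another request arriving while the first one is still fetching the class bytes.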

      Full stack trace:

      Started on May 24, 2010 12:00:54 AM
      FATAL: remote file operation failed: /home/hudson-slave/workspace/BPE_8.1SR at hudson.remoting.Channel@1219b8c:slave-81
      hudson.util.IOException2: remote file operation failed: /home/hudson-slave/workspace/BPE_8.1SR at hudson.remoting.Channel@1219b8c:slave-81
      	at hudson.FilePath.act(FilePath.java:743)
      	at hudson.FilePath.act(FilePath.java:729)
      	at com.syncron.hudson.cvs2.CVS2.isUpdatable(CVS2.java:813)
      	at com.syncron.hudson.cvs2.CVS2.pollChanges(CVS2.java:310)
      	at hudson.scm.SCM.poll(SCM.java:370)
      	at hudson.model.AbstractProject.poll(AbstractProject.java:1153)
      	at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:330)
      	at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:359)
      	at hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:118)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:619)
      Caused by: java.io.IOException: Remote call on slave-81 failed
      	at hudson.remoting.Channel.call(Channel.java:560)
      	at hudson.FilePath.act(FilePath.java:736)
      	... 14 more
      Caused by: java.lang.LinkageError: loader (instance of  hudson/remoting/RemoteClassLoader): attempted  duplicate class definition for name: "hudson/model/ModelObject"
      	at java.lang.ClassLoader.defineClass1(Native Method)
      	at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
      	at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
      	at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
      	at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
      	at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
      	at java.lang.ClassLoader.defineClass1(Native Method)
      	at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
      	at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
      	at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
      	at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
      	at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
      	at java.lang.Class.getDeclaredFields0(Native Method)
      	at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
      	at java.lang.Class.getDeclaredField(Class.java:1880)
      	at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1610)
      	at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52)
      	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:425)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413)
      	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:310)
      	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:547)
      	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1583)
      	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1732)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      	at hudson.remoting.UserRequest.deserialize(UserRequest.java:178)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:98)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
      	at hudson.remoting.Request$2.run(Request.java:270)
      	... 6 more
      Done. Took 63 ms
      No changes
      

      If we start a single job manually after the restart, it executes properly, and any subsequent jobs also run fine. However, once we get that exception, no other job that uses the class named in the exception (pretty much all of them) will execute until the slave is restarted.

            Assignee: Jesse Glick (jglick)
            Reporter: michal_grzejszczak
            Votes: 5
            Watchers: 13
