
Jenkins does not start due to a deadlock after upgrade from 2.121.2.2 to 2.138.2.2

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component/s: claim-plugin, core
    • Labels: None
    • Released As: Jenkins Core 2.163

      Jenkins does not start due to a deadlock
      The issue we are facing is very similar to JENKINS-49038. We have upgraded our Jenkins instance from 2.121 to 2.138.2.2. The instance service starts normally, but the UI keeps loading indefinitely. At startup we get the following deadlock:

      // output

      "PreventRefreshFilter.initAutoRefreshFilter" #57 daemon prio=5 os_prio=0 tid=0x00007fdb5c02f800 nid=0x58ad waiting for monitor entry [0x00007fdb20193000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at hudson.ExtensionList.ensureLoaded(ExtensionList.java:317)
              - waiting to lock <0x00000006c0120260> (a hudson.ExtensionList$Lock)
              at hudson.ExtensionList.getComponents(ExtensionList.java:183)
              at hudson.DescriptorExtensionList.load(DescriptorExtensionList.java:192)
              at hudson.ExtensionList.ensureLoaded(ExtensionList.java:318)
              - locked <0x00000006c37c7680> (a hudson.DescriptorExtensionList)
              at hudson.ExtensionList.iterator(ExtensionList.java:172)
              at hudson.ExtensionList.get(ExtensionList.java:149)
              at hudson.plugins.claim.ClaimConfig.get(ClaimConfig.java:202)
              at hudson.plugins.claim.http.PreventRefreshFilter.initAutoRefreshFilter(PreventRefreshFilter.java:43)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
              at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
              at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
              at jenkins.model.Jenkins$5.runTask(Jenkins.java:1069)
              at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
              at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      
      

      The deadlock seems to be intermittent: when stopping and starting the instance, it may eventually start about 2 times out of 10. The issue cannot be reproduced on a clean instance without custom plugins (only the default plugins installed).
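
      For reference, the blocked frame PreventRefreshFilter.initAutoRefreshFilter corresponds to a common pattern: an init hook that reads the plugin's global configuration while Jenkins is still starting up. Below is a minimal sketch of that pattern; the class and configuration names are hypothetical, not the actual Claim plugin sources.

      import hudson.Extension;
      import hudson.init.InitMilestone;
      import hudson.init.Initializer;
      import jenkins.model.GlobalConfiguration;

      public class ExampleStartupHook {

          // Hypothetical global configuration, standing in for ClaimConfig.
          @Extension
          public static class ExamplePluginConfig extends GlobalConfiguration {
          }

          // Hypothetical init hook, analogous to PreventRefreshFilter.initAutoRefreshFilter
          // in the trace above: it runs inside the Jenkins initialization reactor.
          @Initializer(after = InitMilestone.EXTENSIONS_AUGMENTED)
          public static void initExampleFilter() {
              // GlobalConfiguration.all() is backed by a DescriptorExtensionList; looking up
              // an entry in it during startup competes for the extension-loading locks.
              ExamplePluginConfig config = GlobalConfiguration.all().get(ExamplePluginConfig.class);
              if (config != null) {
                  // ... register a servlet filter or adjust behaviour based on the configuration
              }
          }
      }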

          [JENKINS-54974] Jenkins does not start due to a deadlock after upgrade from 2.121.2.2 to 2.138.2.2

          Oleg Nenashev added a comment -

          Not sure it is specifically related to the Claim Plugin; the API usage looks valid.

          schneeheld, please provide a full thread dump.

          Kirill Gostaf added a comment -

          Hi Oleg,

          As per your comment, I've attached report_threads1.log

          In addition, removing the claim plugin does not solve the issue: startup then complained about the Radiator View plugin's dependency on claim.

          Disabling the Radiator View plugin does not help either; there was still a deadlock after claim was removed and radiatorviewplugin was disabled.

          Arnaud TAMAILLON added a comment - edited

          Hi oleg_nenashev (and danielbeck, as Oleg is currently stepping back from Core maintenance).

          From my analysis, and from other reported issues mentioning deadlocks (JENKINS-20988, JENKINS-21034, JENKINS-31622, JENKINS-44564, JENKINS-49038, JENKINS-50663), the issue lies in DescriptorExtensionList, specifically in the way it acquires its load lock.
          The documentation of the DescriptorExtensionList getLoadLock method indicates that it takes part in the real load activity and that, as such, it can lock on *this* rather than on the *singleton Lock* used by ExtensionList.
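
          To illustrate the difference, here is a simplified, self-contained sketch of the two locking strategies as I understand them (not the actual hudson.ExtensionList sources):

          public class LoadLockSketch {

              static class ExtensionList<T> {
                  // Stand-in for the singleton hudson.ExtensionList$Lock object.
                  private static final Object LOCK = new Object();

                  // Plain extension lists synchronize first-time loading on the shared lock.
                  protected Object getLoadLock() {
                      return LOCK;
                  }

                  void ensureLoaded() {
                      synchronized (getLoadLock()) {
                          // ... load the extension instances on first access
                      }
                  }
              }

              static class DescriptorExtensionList<D> extends ExtensionList<D> {
                  // The override makes each descriptor list lock on itself instead, so a
                  // single load can now involve two different monitors.
                  @Override
                  protected Object getLoadLock() {
                      return this;
                  }
              }
          }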

          However, many plugins rely on a GlobalConfiguration object, which is obtained through code similar to the following (a pattern that is actually explicitly recommended in the GlobalConfiguration documentation):

          public static SpecificPluginConfig get() {
              return GlobalConfiguration.all().get(SpecificPluginConfig.class);
          }
          

          (the all() method of GlobalConfiguration returns a DescriptorExtensionList)

          As the configuration of a plugin can be requested from many places (plugin initialization, HTTP requests, ...), it is very easy to end up with one thread instantiating a DescriptorExtensionList that in turn needs an ExtensionList, while at the same time another thread doing injection work has already taken the ExtensionList lock and now requires the DescriptorExtensionList one.
          Of course, other uses of DescriptorExtensionList that are not related to GlobalConfiguration can create the same kind of issue.
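
          To make the interleaving concrete, here is a generic two-lock sketch of that scenario in plain Java (stand-in lock objects only, not the real Jenkins classes): one thread takes the descriptor list's monitor and then waits for the shared lock, while the other thread acquires them in the opposite order.

          public class DeadlockSketch {
              // Stand-ins for the singleton ExtensionList lock and a DescriptorExtensionList monitor.
              private static final Object EXTENSION_LIST_LOCK = new Object();
              private static final Object DESCRIPTOR_LIST_MONITOR = new Object();

              public static void main(String[] args) {
                  // Thread A: e.g. a plugin init hook resolving its GlobalConfiguration.
                  Thread initHook = new Thread(() -> {
                      synchronized (DESCRIPTOR_LIST_MONITOR) {     // descriptor list locks on itself
                          sleep(100);
                          synchronized (EXTENSION_LIST_LOCK) {     // then needs the shared ExtensionList lock
                              System.out.println("init hook loaded descriptors");
                          }
                      }
                  });

                  // Thread B: e.g. extension finding/injection work elsewhere during startup.
                  Thread injector = new Thread(() -> {
                      synchronized (EXTENSION_LIST_LOCK) {         // holds the shared ExtensionList lock
                          sleep(100);
                          synchronized (DESCRIPTOR_LIST_MONITOR) { // then needs the descriptor list monitor
                              System.out.println("injector loaded descriptors");
                          }
                      }
                  });

                  initHook.start();
                  injector.start();
                  // With the sleeps above, both threads usually end up blocked on each other's
                  // monitor and the program never gets past this point.
              }

              private static void sleep(long millis) {
                  try {
                      Thread.sleep(millis);
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                  }
              }
          }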

          Taking into account that the lock is only taken when the list is initialized for the first time (in ensureLoaded()), I would say that removing the getLoadLock override in DescriptorExtensionList should solve the issue at a very minimal cost, or at least that the lock should be made the same as the one ExtensionList uses for Descriptor.class.
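
          For illustration, in terms of the LoadLockSketch above, the proposal boils down to the following (a sketch of the idea only, not the actual patch that went into core):

          // Drop the override so the subclass simply inherits ExtensionList.getLoadLock():
          // both levels then use the shared lock and the two-monitor interleaving above
          // cannot occur during first-time loading.
          static class DescriptorExtensionList<D> extends ExtensionList<D> {
              // no getLoadLock() override any more
          }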

          What do you think about this proposal? Do you see any other unintended consequences?

          Oleg Nenashev added a comment -

          It should be fixed by the patch from greybird in 2.163

            Assignee: Arnaud TAMAILLON (greybird)
            Reporter: Kirill Gostaf (schneeheld)
            Votes: 0
            Watchers: 3
