Jenkins / JENKINS-16070

Deadlock using Windows native calls

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Labels:
    • Environment:
      Jenkins 1.480.1 LTS
      Slave running Windows 7 x64 with Sun JDK 1.6.0_23
      Master on Windows 2008R2.

      Description

      My build locked up whilst calling hudson.Util.deleteContentsRecursive

      deleteContentsRecursive calls hudson.util.jna.Kernel32Utils.isJunctionOrSymlink
      isJunctionOrSymlink uses hudson.util.jna.Kernel32
      Kernel32 uses com.sun.jna.Native
      Native deadlocked at Native.java:122 in Native.<clinit> calling initIDs (Native method).

      In another thread, hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call was calling org.jvnet.hudson.Windows.monitor which then used com.sun.jna.Structure
      Structure deadlocked at Structure.java:134 in Structure.<clinit>

            Activity

            mmitche Matthew Mitchell added a comment -

            I've now seen this in Jenkins 2.46.3 with Java 8 b141.

            jglick Jesse Glick added a comment -

            There is no immediate indication that this is the same bug. The new issue can be evaluated independently.

            gregcovertsmith Greg Smith added a comment - - edited

            I am reopening this issue, as it is not resolved. We have reproduced it in the latest LTS release (2.19.2 as of this comment). Our slaves are running the latest Java 8 VMs.

            For details and a matching stack trace, see the linked issue JENKINS-39179.

            jglick Jesse Glick added a comment -

            Probably fixed by https://github.com/twall/jna/commit/78969b80508dd2525e91f198fc2d59663b959620 via https://github.com/jenkinsci/jenkins/pull/1387 .

            jglick Jesse Glick added a comment -
            "pool-1-thread-7" … in Object.wait() [0x…]
               java.lang.Thread.State: RUNNABLE
            	at com.sun.jna.Structure.<clinit>(Structure.java:134)
            	at org.jvnet.hudson.Windows.monitor(Windows.java:40)
            	at hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call(SwapSpaceMonitor.java:113)
            	at …
            "pool-1-thread-1" … in Object.wait() [0x…]
               java.lang.Thread.State: RUNNABLE
            	at com.sun.jna.Native.initIDs(Native Method)
            	at com.sun.jna.Native.<clinit>(Native.java:122)
            	at hudson.util.jna.Kernel32.<clinit>(Kernel32.java:38)
            	at hudson.util.jna.Kernel32Utils.isJunctionOrSymlink(Kernel32Utils.java:62)
            	at hudson.Util.isSymlink(Util.java:437)
            	at …
            
            jglick Jesse Glick added a comment -

            Better to fix in https://github.com/jenkinsci/jna and then submit to https://github.com/twall/jna for the public benefit. Overuse of static initializers is always a danger. Structure.MAX_GNUC_ALIGNMENT could simply be inlined, solving your problem. Wuala’s deadlock involving Pointer.SIZE looks tougher since this field is public; perhaps it could be deprecated and made nonfinal, with Native.<clinit> setting it.

            BTW newer versions of Jenkins do not use JNA for this purpose when running on Java 7+.
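
As a general Java illustration of why replacing a computed static field with a constant can remove an initialization dependency, here is a minimal sketch. The class and field names (`ConstantInliningDemo`, `Holder`, `INLINED`, `COMPUTED`) are hypothetical and this is not JNA's code; it only shows that reading a compile-time constant does not run the declaring class's static initializer, whereas reading a computed static field does.

```java
public class ConstantInliningDemo {
    // Set by Holder's static initializer so we can observe when it runs.
    static boolean holderInitialized = false;

    // Hypothetical stand-in for a class with a heavy static initializer.
    static class Holder {
        // Compile-time constant: javac copies the value into each use site.
        static final int INLINED = 8;
        // Not a constant expression: reading it requires Holder.<clinit> to run.
        static final int COMPUTED = compute();

        static {
            holderInitialized = true;
        }

        static int compute() {
            return 8;
        }
    }

    /** Reads the constant, then reports whether Holder has been initialized. */
    static boolean initializedAfterReadingInlined() {
        int value = Holder.INLINED;
        return holderInitialized;
    }

    /** Reads the computed field, then reports whether Holder has been initialized. */
    static boolean initializedAfterReadingComputed() {
        int value = Holder.COMPUTED;
        return holderInitialized;
    }

    public static void main(String[] args) {
        System.out.println("after INLINED:  " + initializedAfterReadingInlined());
        System.out.println("after COMPUTED: " + initializedAfterReadingComputed());
    }
}
```

Whether this maps directly onto Structure.MAX_GNUC_ALIGNMENT depends on how that field is computed in the JNA version in use, which is not shown in this issue.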

            pjdarton pjdarton added a comment -

            After some investigation of the stack trace, I think this is a deadlock bug in the com.sun.jna code. Various classes in this package depend on each other, so if two threads trigger loading of different com.sun.jna classes at the same time, we can get a class-loading deadlock. For example, if A depends on B and B depends on A, and A and B are loaded by different threads simultaneously, then both threads can deadlock.
            (We're not the only people to be affected by this issue, e.g. as reported in http://bugs.wuala.com/view.php?id=3871#c10332 )
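
The A/B cycle described above can be reproduced in isolation. The following is a minimal sketch with hypothetical classes `A` and `B` (not the JNA classes): each static initializer waits until the other has started and then touches the other class, so each thread ends up waiting for an initialization that the other thread holds. The threads are daemons and are joined with a timeout so that the demo itself terminates.

```java
import java.util.concurrent.CountDownLatch;

public class ClassInitDeadlockDemo {
    // Released once both static initializers are running, to force the bad interleaving.
    static final CountDownLatch BOTH_STARTED = new CountDownLatch(2);

    static void awaitBoth() {
        BOTH_STARTED.countDown();
        try {
            BOTH_STARTED.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class A {
        static {
            awaitBoth();
            B.touch(); // needs B initialized, but another thread is initializing B
        }

        static void touch() {}
    }

    static class B {
        static {
            awaitBoth();
            A.touch(); // needs A initialized, but another thread is initializing A
        }

        static void touch() {}
    }

    /** Returns true if both initializing threads are still stuck after the timeout. */
    static boolean deadlocks() {
        Thread t1 = new Thread(() -> A.touch(), "init-A");
        Thread t2 = new Thread(() -> B.touch(), "init-B");
        t1.setDaemon(true); // daemon threads so the JVM can still exit
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(1000);
            t2.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlocks());
    }
}
```

If either class is fully initialized on a single thread before the two threads start, the cycle cannot form.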

            The workaround suggested in wuala is to trigger classloading in a thread-safe fashion so that by the time we have multiple threads, all these classes are loaded.

            A possible workaround would be for the slave (and probably the main Jenkins process) to do something like this:

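            try {
              // Class.forName runs the static initializer; a bare .class literal
              // does not, and would not compile with this catch clause.
              Class.forName("com.sun.jna.Structure");
              Class.forName("com.sun.jna.Native");
            } catch (ClassNotFoundException ex) {
              // ignore
            }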
            

            during their start-up phase, before they start any new threads. That'll safely trigger classloading (if extra safety is desired, catch Throwable instead) thus avoiding classloading these deadlock-prone classes in a multi-threaded environment.

            pjdarton pjdarton added a comment -

            Note: As a result of this deadlock, the slave had hundreds of threads all trying to call the SwapSpaceMonitor. This suggests that the monitor process isn't waiting for calls to complete before triggering another. This didn't help.
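
One way to keep a hung probe from piling up threads is to skip a run while the previous one is still in flight. This is a minimal sketch under that assumption: `NonOverlappingMonitor` and `tryRun` are hypothetical names, and this is not how Jenkins node monitors are implemented. It does not unblock a stuck probe; it only stops new ones from starting behind it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class NonOverlappingMonitor {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable probe;

    NonOverlappingMonitor(Runnable probe) {
        this.probe = probe;
    }

    /** Runs the probe unless a previous run is still in flight; returns whether it ran. */
    boolean tryRun() {
        if (!running.compareAndSet(false, true)) {
            return false; // previous run has not finished, so skip this one
        }
        try {
            probe.run();
            return true;
        } finally {
            running.set(false);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        // Probe that blocks until released, standing in for a stuck native call.
        NonOverlappingMonitor monitor = new NonOverlappingMonitor(() -> {
            entered.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread stuck = new Thread(monitor::tryRun, "stuck-probe");
        stuck.start();
        entered.await();
        System.out.println("second run while stuck: " + monitor.tryRun());
        release.countDown();
        stuck.join();
        System.out.println("run after release: " + monitor.tryRun());
    }
}
```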


              People

              Assignee: Unassigned
              Reporter: pjdarton
              Votes: 3
              Watchers: 10
