• Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Jenkins 1.480.1 LTS
      Slave running Windows 7x64 & Sun JDK 1.6.0_23
      Master on Windows 2008R2.

      My build locked up whilst calling hudson.Util.deleteContentsRecursive

      deleteContentsRecursive calls hudson.util.jna.Kernel32Utils.isJunctionOrSymlink
      isJunctionOrSymlink uses hudson.util.jna.Kernel32
      Kernel32 uses com.sun.jna.Native
      Native deadlocked at Native.java:122 in Native.<clinit> calling initIDs (Native method).

      In another thread, hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call was calling org.jvnet.hudson.Windows.monitor which then used com.sun.jna.Structure
      Structure deadlocked at Structure.java:134 in Structure.<clinit>

          [JENKINS-16070] Deadlock using Windows native calls

          pjdarton added a comment -

          Note: As a result of this deadlock, the slave had hundreds of threads all trying to call the SwapSpaceMonitor. This suggests that the monitor process isn't waiting for calls to complete before triggering another. This didn't help.
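One way to keep stuck calls from piling up is to skip a monitoring run while the previous one is still in flight. The following is only a sketch of that idea, not how Jenkins schedules its node monitors; the class `NonOverlappingMonitor` and its methods are hypothetical names introduced for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical guard: runs a probe only if no earlier run is still in progress. */
public class NonOverlappingMonitor {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger skipped = new AtomicInteger();
    private final Runnable probe;

    public NonOverlappingMonitor(Runnable probe) {
        this.probe = probe;
    }

    /** Returns true if the probe ran, false if it was skipped because a run is in flight. */
    public boolean tryRun() {
        // Only one caller can flip the flag from false to true.
        if (!running.compareAndSet(false, true)) {
            skipped.incrementAndGet();
            return false;
        }
        try {
            probe.run();
            return true;
        } finally {
            // Released only when the probe returns; a probe that never returns
            // keeps later runs skipped instead of adding more blocked threads.
            running.set(false);
        }
    }

    public int skippedCount() {
        return skipped.get();
    }
}
```

With a guard like this, a probe that hangs would leave one blocked thread and a growing skip counter, rather than hundreds of threads waiting on the same call.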


          pjdarton added a comment -

          After some investigation of the stack trace, I think this is a deadlock bug in the com.sun.jna code, where various classes in the package depend on each other. If two threads each trigger class-loading of a different com.sun.jna class, we can get a classloading deadlock: e.g. if A depends on B and B depends on A, and A and B are classloaded by different threads simultaneously, then both threads can deadlock.
          (We're not the only people to be affected by this issue, e.g. as reported in http://bugs.wuala.com/view.php?id=3871#c10332 )
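The A/B scenario described above can be reproduced outside JNA with two classes whose static initializers refer to each other. This is a minimal sketch, not JNA's actual code: the classes `A` and `B`, the latch, and the `reproduce` helper are all made up for illustration. The latch forces each thread to be inside its own initializer before it touches the other class, which makes the deadlock deterministic.

```java
import java.util.concurrent.CountDownLatch;

public class ClassInitDeadlockDemo {
    // Each initializer waits here until the other one has started too.
    private static final CountDownLatch bothStarted = new CountDownLatch(2);

    private static void rendezvous() {
        bothStarted.countDown();
        try {
            bothStarted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class A {
        static final int VALUE;
        static {
            rendezvous();
            VALUE = B.VALUE + 1; // blocks: B is being initialized by the other thread
        }
    }

    static class B {
        static final int VALUE;
        static {
            rendezvous();
            VALUE = A.VALUE + 1; // blocks: A is being initialized by the other thread
        }
    }

    /** Returns true if both threads are still stuck after the timeout. */
    public static boolean reproduce(long timeoutMillis) {
        Thread t1 = new Thread(() -> System.out.println(A.VALUE), "init-A");
        Thread t2 = new Thread(() -> System.out.println(B.VALUE), "init-B");
        // Daemon threads so the stuck initializers do not keep the JVM alive.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + reproduce(1000));
    }
}
```

Neither thread holds an ordinary monitor here; each is waiting for the other's class initialization to finish, which is why such hangs are easy to miss when reading a thread dump.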

          The workaround suggested in wuala is to trigger classloading in a thread-safe fashion so that by the time we have multiple threads, all these classes are loaded.

          e.g. A possible workaround would be for the slave (and probably the main Jenkins process) to do something like this:

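          try {
            // Class.forName initializes the class (runs its static initializer);
            // a bare class literal such as Structure.class would not.
            Class.forName("com.sun.jna.Structure");
            Class.forName("com.sun.jna.Native");
          } catch (ClassNotFoundException ex) {
            // ignore
          }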
          

          during their start-up phase, before they start any new threads. That'll safely trigger classloading (if extra safety is desired, catch Throwable instead) thus avoiding classloading these deadlock-prone classes in a multi-threaded environment.
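As a slightly fuller sketch of the same workaround, the start-up code could initialize a list of classes on a single thread and report the ones that failed. The class `JnaPreloader` and its `preload` method are hypothetical names, not part of Jenkins or JNA. Instead of catching Throwable, this version catches `LinkageError`, which covers initializer failures and missing native libraries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical start-up helper: initializes classes before any worker threads exist. */
public class JnaPreloader {

    /** Initializes each named class on the calling thread; returns the names that failed. */
    public static List<String> preload(String... classNames) {
        List<String> failed = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=true runs the static initializer now, on this one thread.
                Class.forName(name, true, JnaPreloader.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                // LinkageError includes ExceptionInInitializerError,
                // NoClassDefFoundError and UnsatisfiedLinkError.
                failed.add(name);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Class names taken from the comment above; JNA must be on the classpath.
        List<String> failed = preload("com.sun.jna.Structure", "com.sun.jna.Native");
        System.out.println("could not preload: " + failed);
    }
}
```

Because everything runs on one thread, the order of the names should not matter for avoiding the deadlock; a class that is already initialized is simply returned by `Class.forName`.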


          Jesse Glick added a comment -

          Better to fix in https://github.com/jenkinsci/jna and then submit to https://github.com/twall/jna for the public benefit. Overuse of static initializers is always a danger. Structure.MAX_GNUC_ALIGNMENT could simply be inlined, solving your problem. Wuala’s deadlock involving Pointer.SIZE looks tougher since this field is public; perhaps it could be deprecated and made nonfinal, with Native.<clinit> setting it.

          BTW newer versions of Jenkins do not use JNA for this purpose when running on Java 7+.


          Jesse Glick added a comment -
          "pool-1-thread-7" … in Object.wait() [0x…]
             java.lang.Thread.State: RUNNABLE
          	at com.sun.jna.Structure.<clinit>(Structure.java:134)
          	at org.jvnet.hudson.Windows.monitor(Windows.java:40)
          	at hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call(SwapSpaceMonitor.java:113)
          	at …
          "pool-1-thread-1" … in Object.wait() [0x…]
             java.lang.Thread.State: RUNNABLE
          	at com.sun.jna.Native.initIDs(Native Method)
          	at com.sun.jna.Native.<clinit>(Native.java:122)
          	at hudson.util.jna.Kernel32.<clinit>(Kernel32.java:38)
          	at hudson.util.jna.Kernel32Utils.isJunctionOrSymlink(Kernel32Utils.java:62)
          	at hudson.Util.isSymlink(Util.java:437)
          	at …
          


          Jesse Glick added a comment -

          Probably fixed by https://github.com/twall/jna/commit/78969b80508dd2525e91f198fc2d59663b959620 via https://github.com/jenkinsci/jenkins/pull/1387.

          Greg Smith added a comment - - edited

          I am reopening this issue, as it is not resolved. We have recreated it in latest LTS release, 2.19.2 as of this comment. Our slaves are running latest Java 8 VMs.

          For details, and a similar matching stack trace, see linked issue JENKINS-39179


          Jesse Glick added a comment -

          No immediate indication that is the same bug. The new issue can be evaluated independently.


          Matthew Mitchell added a comment -

          I've seen this now in 2.46.3 + Java 8 b141

            Assignee: Unassigned
            Reporter: pjdarton