Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-28690

Deadlock in hudson.model.Executor

    XMLWordPrintable

Details

    Description

      In very specific scenario, when build is running on slave, and PingThread detects slave as unavailable deadlock occurs in Executor thread of that slave.

      stacktrace:

      "Executor #0 for xxxx : executing xxxx #9" daemon prio=10 tid=0x00007f444248b800 nid=0x66e0 waiting on condition [0x00007f448a92f000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
      	at hudson.model.Executor.interrupt(Executor.java:183)
      	at hudson.model.Executor.interrupt(Executor.java:164)
      	at hudson.model.Executor.interrupt(Executor.java:158)
      	at hudson.model.Executor.interrupt(Executor.java:145)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.selfInterrupt(AbstractQueuedSynchronizer.java:825)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:959)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
      	at hudson.model.Executor.abortResult(Executor.java:208)
      	at hudson.model.Build$BuildExecution.doRun(Build.java:165)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
      	at hudson.model.Run.execute(Run.java:1744)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      	at hudson.model.ResourceController.execute(ResourceController.java:98)
      	at hudson.model.Executor.run(Executor.java:374)
      

      This alone is not very bad, but than maintain task of queue kicks in, blocks on Executor's lock and leads to deadlock on Queue lock.
      Stacktrace:

      "AtmostOneTaskExecutor[hudson.model.Queue$1@6a9812a3] [#6684]" daemon prio=10 tid=0x00007f44bf7af000 nid=0x74ec waiting on condition [0x00007f44c827b000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
      	at hudson.model.Executor.isParking(Executor.java:609)
      	at hudson.model.Queue.maintain(Queue.java:1282)
      	at hudson.model.Queue$1.call(Queue.java:334)
      	at hudson.model.Queue$1.call(Queue.java:331)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:101)
      	at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:91)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
      	at java.lang.Thread.run(Thread.java:745)
      

      This blocks all actions on Jenkins, as no new builds can be scheduled and you cannot access Jenkins main page.

      After downgrade to versions 1.606 before https://issues.jenkins-ci.org/browse/JENKINS-27565 all is working good.

      Attachments

        Issue Links

          Activity

            szubster Tomasz Szuba created issue -
            szubster Tomasz Szuba made changes -
            Field Original Value New Value
            Link This issue is related to JENKINS-27565 [ JENKINS-27565 ]
            szubster Tomasz Szuba made changes -
            Description In very specific scenario, when build is running on slave, and PingThread detects slave as unavailable deadlock occurs in Executor thread of that slave.

            stacktrace:
            {code}
            "Executor #0 for xxxx : executing xxxx #9" daemon prio=10 tid=0x00007f444248b800 nid=0x66e0 waiting on condition [0x00007f448a92f000]
               java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
            at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
            at hudson.model.Executor.interrupt(Executor.java:183)
            at hudson.model.Executor.interrupt(Executor.java:164)
            at hudson.model.Executor.interrupt(Executor.java:158)
            at hudson.model.Executor.interrupt(Executor.java:145)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.selfInterrupt(AbstractQueuedSynchronizer.java:825)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:959)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
            at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
            at hudson.model.Executor.abortResult(Executor.java:208)
            at hudson.model.Build$BuildExecution.doRun(Build.java:165)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
            at hudson.model.Run.execute(Run.java:1744)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            at hudson.model.ResourceController.execute(ResourceController.java:98)
            at hudson.model.Executor.run(Executor.java:374)
            {code}
            This alone is not very bad, but than maintain task of queue kicks in, blocks on Executor's lock and leads to deadlock on Queue lock.
            Stacktrace:
            {code}
            "jenkins.util.Timer [#1]" daemon prio=10 tid=0x00007f44b807d800 nid=0x152a waiting on condition [0x00007f44cb77a000]
               java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x00000001ef9027a8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
            at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
            at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
            at hudson.model.Queue.maintain(Queue.java:1270)
            at hudson.model.Queue$MaintainTask.doRun(Queue.java:2457)
            at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:51)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)
            {code}
            This blocks all actions on Jenkins, as no new builds can be scheduled and you cannot access Jenkins main page.

            After downgrade to versions 1.606 before https://issues.jenkins-ci.org/browse/JENKINS-27565 all is working good.
            In very specific scenario, when build is running on slave, and PingThread detects slave as unavailable deadlock occurs in Executor thread of that slave.

            stacktrace:
            {code}
            "Executor #0 for xxxx : executing xxxx #9" daemon prio=10 tid=0x00007f444248b800 nid=0x66e0 waiting on condition [0x00007f448a92f000]
               java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
            at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
            at hudson.model.Executor.interrupt(Executor.java:183)
            at hudson.model.Executor.interrupt(Executor.java:164)
            at hudson.model.Executor.interrupt(Executor.java:158)
            at hudson.model.Executor.interrupt(Executor.java:145)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.selfInterrupt(AbstractQueuedSynchronizer.java:825)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:959)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
            at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
            at hudson.model.Executor.abortResult(Executor.java:208)
            at hudson.model.Build$BuildExecution.doRun(Build.java:165)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
            at hudson.model.Run.execute(Run.java:1744)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            at hudson.model.ResourceController.execute(ResourceController.java:98)
            at hudson.model.Executor.run(Executor.java:374)
            {code}
            This alone is not very bad, but than maintain task of queue kicks in, blocks on Executor's lock and leads to deadlock on Queue lock.
            Stacktrace:
            {code}
            "AtmostOneTaskExecutor[hudson.model.Queue$1@6a9812a3] [#6684]" daemon prio=10 tid=0x00007f44bf7af000 nid=0x74ec waiting on condition [0x00007f44c827b000]
               java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x000000045e3eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
            at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
            at hudson.model.Executor.isParking(Executor.java:609)
            at hudson.model.Queue.maintain(Queue.java:1282)
            at hudson.model.Queue$1.call(Queue.java:334)
            at hudson.model.Queue$1.call(Queue.java:331)
            at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:101)
            at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:91)
            at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
            at java.lang.Thread.run(Thread.java:745)
            {code}
            This blocks all actions on Jenkins, as no new builds can be scheduled and you cannot access Jenkins main page.

            After downgrade to versions 1.606 before https://issues.jenkins-ci.org/browse/JENKINS-27565 all is working good.
            danielbeck Daniel Beck made changes -
            Assignee stephenconnolly [ stephenconnolly ]
            jglick Jesse Glick made changes -
            Labels deadlock executor lock queue deadlock executor lock lts-candidate queue
            jglick Jesse Glick made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            olivergondza Oliver Gondža made changes -
            Attachment deadlock.log [ 29932 ]
            scm_issue_link SCM/JIRA link daemon made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            olivergondza Oliver Gondža made changes -
            Labels deadlock executor lock lts-candidate queue 1.609.2-fixed deadlock executor lock queue
            stephenconnolly Stephen Connolly made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 163544 ] JNJira + In-Review [ 208836 ]
            pajasoft Pavel Janoušek made changes -
            Link This issue is related to JENKINS-37034 [ JENKINS-37034 ]

            People

              stephenconnolly Stephen Connolly
              szubster Tomasz Szuba
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: