• Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • remoting
    • None
    • Jenkins 1.625.3
      Windows slaves (JNLP)

      We are seeing deadlocks similar to those described in remoting#36.

      However, in these cases the other side of the stack trace was:

      "NioChannelHub keys=3 gen=41: Computer.threadPoolForRemoting [#2]" id=224 (0xe0) state=BLOCKED cpu=76%
          - waiting to lock <0x28a9d2ba> (a hudson.remoting.Channel)
            owned by "Computer.threadPoolForRemoting [#5] for XXXXXXX" id=249 (0xf9)
          at hudson.remoting.Channel.terminate(Channel.java:833)
          at hudson.remoting.Channel$1.terminate(Channel.java:509)
          at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:71)
          at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
          at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:637)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:745)
      

      That is, an abort caused by a CancelledKeyException (see JENKINS-24050).

      The fix already merged in remoting#36 does not seem to cover all cases. If there are no writable bytes and no one is reading (that is, the channel is in an abnormal state), the loop may keep going forever, and the deadlock persists.

      Because abort starts by closing both ends of the NIO channel, additional checks for the closed state can be added to the loop to give it a way out.
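
      The idea of a closed-state check inside the wait loop can be sketched as follows. This is a simplified, hypothetical illustration and not the remoting code or the change in remoting#71: the class `ClosableWriteLoop` and its fields are invented for the example, and it uses a plain monitor rather than the Channel lock and NIO transport involved in the real deadlock.

```java
import java.io.IOException;

/** Hypothetical stand-in for a bounded outgoing buffer; NOT the actual remoting classes. */
public class ClosableWriteLoop {
    private final Object lock = new Object();
    private final int capacity;
    private int used;        // bytes currently buffered; nothing drains them in this sketch
    private boolean closed;  // set by abort()

    public ClosableWriteLoop(int capacity) {
        this.capacity = capacity;
    }

    /** Waits until len bytes fit, or fails if the buffer is closed in the meantime. */
    public void write(int len) throws IOException, InterruptedException {
        synchronized (lock) {
            while (capacity - used < len) {
                // Without this check the loop never ends when no reader frees space.
                if (closed) {
                    throw new IOException("closed while waiting for writable space");
                }
                lock.wait(100);
            }
            if (closed) {
                throw new IOException("closed");
            }
            used += len;
        }
    }

    /** Marks the buffer closed first, then wakes any writer stuck in the loop. */
    public void abort() {
        synchronized (lock) {
            closed = true;
            lock.notifyAll();
        }
    }

    public static void main(String[] args) throws Exception {
        ClosableWriteLoop buf = new ClosableWriteLoop(4);
        buf.write(4); // fill the buffer; no reader will ever drain it

        Thread writer = new Thread(() -> {
            try {
                buf.write(1); // would wait forever without the closed check
            } catch (IOException e) {
                System.out.println("writer released: " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        Thread.sleep(200);
        buf.abort();
        writer.join(2000);
        System.out.println("writer alive: " + writer.isAlive());
    }
}
```

      In this sketch the writer leaves the loop as soon as `abort()` flips the closed flag, even though no space ever becomes available. Whether the same shape applies directly to the NioChannelHub loop depends on details of the transport that are not shown here.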

          [JENKINS-32825] Deadlock in Channel Abort

          Andres Rodriguez created issue -
          Andres Rodriguez made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Andres Rodriguez made changes -
          Remote Link New: This issue links to "remoting#71 (Web Link)" [ 13813 ]
          Andres Rodriguez made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
          Andres Rodriguez made changes -
          Status Original: Resolved [ 5 ] New: Closed [ 6 ]
          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 168517 ] New: JNJira + In-Review [ 209688 ]
          Oleg Nenashev made changes -
          Link New: This issue is related to JENKINS-25218 [ JENKINS-25218 ]
          Oleg Nenashev made changes -
          Assignee Original: Andres Rodriguez [ andresrc ] New: Oleg Nenashev [ oleg_nenashev ]
          Resolution Original: Fixed [ 1 ]
          Status Original: Closed [ 6 ] New: Reopened [ 4 ]
          Oleg Nenashev made changes -
          Link New: This issue duplicates JENKINS-25218 [ JENKINS-25218 ]
          Oleg Nenashev made changes -
          Resolution New: Duplicate [ 3 ]
          Status Original: Reopened [ 4 ] New: Resolved [ 5 ]

            oleg_nenashev Oleg Nenashev
            andresrc Andres Rodriguez
            Votes: 0
            Watchers: 3
