  1. Jenkins
  2. JENKINS-45219

Remoting should terminate() channel after a timeout even if it does not hear from the remote side


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • remoting
    • None

      Currently the channel termination logic depends on the exchange of CloseCommands between the two sides of the channel (sideA and sideB below). It can fail as follows:

      1) sideA requests the channel close

      2) CloseCommand goes to sideB

      3) Transport#commandReceiver() fails to invoke the task for any reason (deadlock, overload, thread death, etc.) and does not send the CloseCommand back

      4) channel.terminate(new OrderlyShutdown(createdAt)) does not get invoked on sideA

      5) If the channel is operational && there is no PingThread, channel.terminate() will never be invoked on sideA

      6) channel on sideA never closes the Receiver, so Channel#inClosed stays null

      7) If there are pending Request#call() operations, they may hang indefinitely in this loop:

      while(response==null && !channel.isInClosed())
        // I don't know exactly when this can happen, as pendingCalls are cleaned up by Channel,
        // but in production I've observed that in rare occasion it can block forever, even after a channel
        // is gone. So be defensive against that.
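
      The hang in step 7 can be illustrated with a minimal sketch. PendingCallSketch and all of its methods are hypothetical names invented for this illustration and are not the real hudson.remoting.Request or Channel code; the sketch only models a caller that waits on "response arrived or channel inbound side closed". The wait is bounded here solely so that the example finishes.

```java
/** Hypothetical model of a pending call; NOT the real hudson.remoting.Request. */
class PendingCallSketch {
    private Object response;   // set when the peer answers
    private boolean inClosed;  // models the condition checked by channel.isInClosed()

    /**
     * Waits until a response arrives, the channel is terminated, or maxWaitMillis elapses.
     * Returns null when no response was received.
     */
    synchronized Object await(long maxWaitMillis) {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (response == null && !inClosed) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return null; // still pending: neither condition of the loop has changed
            }
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return response;
    }

    /** Normal path: the peer answered. */
    synchronized void onResponse(Object r) {
        response = r;
        notifyAll();
    }

    /** Models channel termination: marks the inbound side closed and wakes waiters. */
    synchronized void onTerminate() {
        inClosed = true;
        notifyAll();
    }

    synchronized boolean isPending() {
        return response == null && !inClosed;
    }
}
```

      In this model, if neither onResponse() nor onTerminate() is ever called, the loop condition never changes, which corresponds to the situation described in steps 4 to 6.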


      If we set a timeout for Channel termination on close(), sideA could forcefully terminate the channel when sideB does not send the CloseCommand back within that timeout (e.g. 1 minute)
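
      A minimal sketch of this proposal, under stated assumptions: SketchChannel, its method names, and the string close causes are hypothetical and do not reflect the actual Remoting implementation or the fix that was eventually applied. The idea shown is that close() arms a watchdog which calls terminate() if the peer's CloseCommand has not arrived in time.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a channel; NOT the real hudson.remoting.Channel. */
class SketchChannel {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "close-timeout");
                t.setDaemon(true); // the watchdog must not keep the JVM alive
                return t;
            });
    private ScheduledFuture<?> watchdog;
    private String closeCause; // null while the inbound side is still open

    /** Requests the close and arms a watchdog in case the peer never answers. */
    synchronized void close(long timeoutMillis, Runnable sendCloseCommand) {
        if (closeCause != null || watchdog != null) {
            return; // already closed or already closing
        }
        sendCloseCommand.run();
        watchdog = timer.schedule(
                () -> terminate("forced: no CloseCommand within timeout"),
                timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Normal path: the peer's CloseCommand arrived. */
    void onPeerCloseCommand() {
        terminate("orderly shutdown");
    }

    /** Marks the channel closed and wakes waiters; the first cause wins. */
    synchronized void terminate(String cause) {
        if (closeCause != null) {
            return;
        }
        closeCause = cause;
        if (watchdog != null) {
            watchdog.cancel(false);
        }
        timer.shutdown();
        notifyAll();
    }

    synchronized boolean isInClosed() {
        return closeCause != null;
    }

    synchronized String closeCause() {
        return closeCause;
    }

    /** Test helper: waits up to maxWaitMillis for the channel to be closed. */
    synchronized boolean awaitClosed(long maxWaitMillis) {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (closeCause == null) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;
            }
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

      With such a watchdog, a caller doing close(60_000, ...) would see the channel reach the closed state after at most about one minute even when sideB stays silent, so loops that check isInClosed() can exit.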

            Unassigned
            oleg_nenashev Oleg Nenashev