Type: Bug
Resolution: Fixed
Priority: Major
Reproducer:
Launch a local agent over the ssh/command launcher and stop its process with kill -TSTP $PID. The agent stops responding; Jenkins notices and eventually closes its connection with a clear exception.
Actual behavior:
- The channel is never disassociated from its computer, so long-running operations and other clients that only check computer.channel != null keep using it and throw exceptions all over the place. EDIT: The computer is not even temporarily offline, and the situation does not seem to improve after all monitors have run, as they all choke on the closed channel.
- The channel is stuck in the middle of the closing procedure: it is outClosed but not inClosed. The other end does not send the close command, for obvious reasons, so the channel is never fully closed. I speculate that this is specifically why SlaveComputer#closeChannel() is not called, which causes the previous problem.
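The half-closed state described above can be illustrated with a toy model. This is only a sketch: ToyChannel, naiveIsUsable and strictIsUsable are hypothetical names invented for illustration and are not part of the Jenkins or remoting API. It shows why a client that only tests for a non-null channel keeps using a channel whose peer never acknowledged the close.

```java
public class HalfClosedChannelDemo {

    /** Hypothetical stand-in for a remoting channel; not the Jenkins API. */
    static final class ToyChannel {
        private boolean outClosed; // this side has sent its close command
        private boolean inClosed;  // the peer has answered with its own close command

        void closeLocally() { outClosed = true; }
        void peerClosed() { inClosed = true; }

        boolean isFullyClosed() { return outClosed && inClosed; }
        boolean isClosingOrClosed() { return outClosed || inClosed; }
    }

    /** The naive client check: only asks whether a channel is attached. */
    static boolean naiveIsUsable(ToyChannel channel) {
        return channel != null;
    }

    /** A stricter check that also rejects half-closed channels. */
    static boolean strictIsUsable(ToyChannel channel) {
        return channel != null && !channel.isClosingOrClosed();
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        // Ping failure: this side closes, the suspended agent never answers.
        channel.closeLocally();

        System.out.println("fully closed:  " + channel.isFullyClosed());
        System.out.println("naive usable:  " + naiveIsUsable(channel));
        System.out.println("strict usable: " + strictIsUsable(channel));
    }
}
```

In this model, any logic that waits for the fully closed state before cleaning up never fires while the agent is suspended, which matches the speculation in the second bullet.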
Expected behavior:
- The broken/half-closed/fully-closed channel is disassociated from its computer, which therefore appears disconnected to all possible clients.
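The expected behavior could be sketched roughly as follows. This is a toy model under stated assumptions, not necessarily how the fix was implemented: ToyComputer, ToyChannel and onPingFailure are hypothetical names and do not correspond to Jenkins classes or methods. The point is only that the computer drops its channel reference as soon as the channel is known to be broken, without waiting for the peer to complete the close.

```java
import java.util.concurrent.atomic.AtomicReference;

public class DisassociateOnFailureDemo {

    /** Hypothetical stand-in for a remoting channel; not the Jenkins API. */
    static final class ToyChannel {
        boolean outClosed; // this side has sent its close command
        boolean inClosed;  // the peer has answered with its own close command
    }

    /** Hypothetical stand-in for a computer that holds at most one channel. */
    static final class ToyComputer {
        private final AtomicReference<ToyChannel> channel = new AtomicReference<>();

        void attach(ToyChannel c) { channel.set(c); }
        ToyChannel getChannel() { return channel.get(); }
        boolean isOffline() { return channel.get() == null; }

        /** Called when pinging the given channel has failed. */
        void onPingFailure(ToyChannel failed) {
            failed.outClosed = true; // start closing; the peer may never reply
            // Detach only if the failed channel is still the attached one,
            // so a newer connection established in the meantime is kept.
            channel.compareAndSet(failed, null);
        }
    }

    public static void main(String[] args) {
        ToyComputer computer = new ToyComputer();
        ToyChannel channel = new ToyChannel();
        computer.attach(channel);

        computer.onPingFailure(channel);

        System.out.println("offline after ping failure: " + computer.isOffline());
        System.out.println("peer ever closed: " + channel.inClosed);
    }
}
```

The compare-and-set guards against a stale failure notification detaching a channel that was attached by a later reconnect.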
Issue links:
- causes: JENKINS-70414 Ping thread failures on agent side were ignored (Closed)
- links to: [JENKINS-46680] Computer offline by ping thread leaves the channel half open
Status | Original: Open [ 1 ] | New: In Progress [ 3 ] |
Assignee | New: Oliver Gondža [ olivergondza ] |
Labels | New: robustness |
Remote Link | New: This issue links to "PR 3005 (Web Link)" [ 17645 ] |
Resolution | New: Fixed [ 1 ] | |
Status | Original: In Progress [ 3 ] | New: Resolved [ 5 ] |
Status | Original: Resolved [ 5 ] | New: In Review [ 10005 ] |
Status | Original: In Review [ 10005 ] | New: Resolved [ 5 ] |
Labels | Original: robustness | New: lts-candidate robustness |
Labels | Original: lts-candidate robustness | New: 2.73.2-rejected lts-candidate robustness |