Bug
Resolution: Fixed
Major
None
This improvement should help with the triangulation of JENKINS-31050.
Background: I was analysing JIRA issues related to the fatal NIOHub channel termination that causes mass disconnection of agents. It appears that the SingleLaneExecutor is not used entirely correctly there...
TL;DR: A single packet sent to a channel with a pending shutdown may cause the termination of all remoting channels in the JNLP1, JNLP2, CLI, and CLI2 protocols. JNLP4 does not seem to be affected.
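To make the failure mode in the TL;DR concrete, here is a minimal, hypothetical model of it. It is not the remoting library's code and not the fix applied for this issue: `HubSketch`, `Buffer`, `Packet`, `run`, and `demo` are invented names, and the "isolate failures" branch is only one possible containment strategy, assumed for illustration. The sketch shows a shared loop serving several channels, where a write to an already closed buffer either aborts the whole loop or drops only the offending channel.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a shared channel hub; all names here are illustrative assumptions. */
class HubSketch {

    /** Stand-in for a per-channel receive buffer that rejects writes once closed. */
    static final class Buffer {
        private boolean closed;
        private int bytes;

        void close() { closed = true; }

        void write(int n) throws IOException {
            if (closed) {
                throw new IOException("buffer has already been closed");
            }
            bytes += n;
        }
    }

    /** A packet addressed to one named channel. */
    static final class Packet {
        final String channel;
        final int size;

        Packet(String channel, int size) {
            this.channel = channel;
            this.size = size;
        }
    }

    /**
     * Processes all packets in one shared loop and returns the channels that remain open.
     * With isolateFailures == false, the first failed write escapes the loop and every
     * channel is dropped. With isolateFailures == true, only the failing channel is dropped.
     */
    static List<String> run(Map<String, Buffer> channels, List<Packet> packets,
                            boolean isolateFailures) {
        Map<String, Buffer> open = new LinkedHashMap<>(channels);
        try {
            for (Packet p : packets) {
                Buffer b = open.get(p.channel);
                if (b == null) {
                    continue; // channel was already dropped
                }
                if (isolateFailures) {
                    try {
                        b.write(p.size);
                    } catch (IOException e) {
                        open.remove(p.channel); // contain the failure to this channel
                    }
                } else {
                    b.write(p.size); // an exception here aborts the shared loop
                }
            }
        } catch (IOException fatal) {
            open.clear(); // the shared loop died, so every channel goes with it
        }
        return new ArrayList<>(open.keySet());
    }

    /** Three channels, one of which has its buffer closed before a packet arrives. */
    static List<String> demo(boolean isolateFailures) {
        Map<String, Buffer> channels = new LinkedHashMap<>();
        channels.put("agent-1", new Buffer());
        channels.put("agent-2", new Buffer());
        channels.put("agent-3", new Buffer());
        channels.get("agent-2").close(); // shutdown is pending for this channel

        List<Packet> packets = new ArrayList<>();
        packets.add(new Packet("agent-1", 10));
        packets.add(new Packet("agent-2", 10)); // the single late packet
        packets.add(new Packet("agent-3", 10));
        return run(channels, packets, isolateFailures);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints []
        System.out.println(demo(true));  // prints [agent-1, agent-3]
    }
}
```

In this model, the single late packet for `agent-2` takes down all three channels when the failure is not contained, which mirrors the outage described above in miniature.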
Is related to: JENKINS-31050 Slave goes offline during the build (Open)
[JENKINS-40491] Preliminary FifoBuffer termination can cause outage of all JNLP1/2 agents
Link | New: This issue is related to JENKINS-31050 [ JENKINS-31050 ] |
Epic Link | New: JENKINS-38833 [ 175240 ] |
Resolution | New: Fixed [ 1 ] | |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |
Status | Original: Resolved [ 5 ] | New: Closed [ 6 ] |
Assignee | New: Oleg Nenashev [ oleg_nenashev ] | |
Resolution | Original: Fixed [ 1 ] | |
Status | Original: Closed [ 6 ] | New: Reopened [ 4 ] |
Resolution | New: Fixed [ 1 ] | |
Status | Original: Reopened [ 4 ] | New: Resolved [ 5 ] |
Comment |
[ Hi [~oleg_nenashev],

I was getting the 'Agent offline during the build' error when I was using Jenkins v2.19.1 for the Jenkins master and Jenkins-slave v2.62 for the slave pod. After reading up on your fix, I upgraded Jenkins to v2.37 and the slave to jenkins-slave 3.4 (remoting 3.4). Now I am getting the error below:

{code:java}
Caused by: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
	at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:617)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.jenkinsci.remoting.nio.FifoBuffer$CloseCause: Buffer close has been requested
	at org.jenkinsci.remoting.nio.FifoBuffer.close(FifoBuffer.java:426)
	at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:332)
	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:565)
	... 6 more
{code}

Let me know if I need to provide more details. ] |
Description | Original: This improvement should help with the triangulation of JENKINS-31050 |
New:
This improvement should help with the triangulation of JENKINS-31050. Background: I was analysing JIRA issues related to the fatal NIOHub channel termination that causes mass disconnection of agents. It appears that the SingleLaneExecutor is not used entirely correctly there... TL;DR: A single packet sent to a channel with a pending shutdown may cause the termination of all remoting channels in the JNLP1, JNLP2, CLI, and CLI2 protocols. JNLP4 does not seem to be affected. |
Summary | Original: Improve diagnostics of the preliminary FifoBuffer termination | New: Preliminary FifoBuffer termination can cause outage of all JNLP1/2 agents |
Issue Type | Original: Improvement [ 4 ] | New: Bug [ 1 ] |