We seem to have some network instability.
Every week or so, all our agents disconnect from Jenkins and Jenkins becomes unresponsive.
Jenkins needs to be restarted afterwards.
According to the build logs, the agents all seem to disconnect at the same time (possibly due to network load). The Jenkins logs do not show any errors.
I've managed to reproduce similar symptoms with a local Jenkins instance while simulating a bad network:
1. Start a default Jenkins instance on localhost ("java -Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle -jar jenkins.war --httpPort=12000")
2. Open the Jenkins web page and log in to the landing page 25 times (Chrome)
3. Use Clumsy 0.2 to drop all packets for 40s on localhost TCP - IPv4
Expected Result:
The Jenkins web page loads again after we stop dropping packets.
Actual Result:
The Jenkins web page is unresponsive. Agents can no longer connect through JNLP3 or JNLP4.
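
To make the expected/actual check repeatable without a browser, a small polling script could be run once Clumsy stops dropping packets. This is only a rough sketch, not something from the setup described here: the function names `probe` and `wait_until_responsive` are made up for illustration, the port 12000 comes from the `--httpPort` flag in step 1, and the `/login` path, timeout, and retry counts are assumptions.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (responsive, detail) for a single HTTP GET against url."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # An HTTP error status (for example from an authentication wall)
        # still proves that the server answered the request.
        return True, f"HTTP {exc.code}"
    except OSError as exc:
        # URLError, timeouts, and connection resets all derive from OSError.
        return False, f"{type(exc).__name__}: {exc}"
    elapsed = time.monotonic() - start
    return True, f"HTTP {status} in {elapsed:.2f}s"


def wait_until_responsive(url, attempts=12, interval=5.0,
                          probe_fn=probe, sleep=time.sleep):
    """Poll url until it answers; return the 1-based attempt number, or None."""
    for attempt in range(1, attempts + 1):
        ok, detail = probe_fn(url)
        print(f"attempt {attempt}: {detail}")
        if ok:
            return attempt
        if attempt < attempts:
            sleep(interval)
    return None
```

Calling `wait_until_responsive("http://localhost:12000/login")` right after step 3 ends would be expected to return an attempt number if Jenkins recovers, and `None` if it stays unresponsive as in the actual result above.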