I have just been investigating a problem in our Jenkins setup that I think might be related to JENKINS-25218. We're using the EC2 plugin and running builds that generate quite large logs (230 MB). At some point during the build, the master loses track of the log and starts writing the same block of text from the log over and over, for as long as I let it run. The build completes successfully on the slave, and nothing bad appears in the Node log in the Jenkins UI, but the master keeps filling up the filesystem with the same repeated text. After I changed the build to log much less, this stopped happening. We're running 2.46.2. Could this potentially be one of the edge cases?
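For anyone who wants to check whether a build log shows the same symptom, here is a minimal sketch that streams through a log and reports how often the most common block of consecutive lines recurs. The function names and the 20-line window are assumptions chosen for illustration, not anything taken from Jenkins. A healthy log can legitimately contain some repeated output, so treat a high count only as a hint.

```python
import hashlib
from collections import Counter, deque


def count_repeated_windows(lines, window=20):
    """Count how often each run of `window` consecutive lines occurs.

    `lines` may be any iterable of strings (for example an open file),
    so a large log is streamed rather than loaded into memory at once.
    """
    recent = deque(maxlen=window)  # sliding window over the last N lines
    counts = Counter()
    for line in lines:
        recent.append(line.rstrip("\n"))
        if len(recent) == window:
            block = "\n".join(recent).encode("utf-8", "replace")
            counts[hashlib.sha1(block).hexdigest()] += 1
    return counts


def max_repeat(lines, window=20):
    """Return the occurrence count of the most frequent window (0 if none)."""
    counts = count_repeated_windows(lines, window)
    return max(counts.values(), default=0)
```

To use it, open the build's log file on the master with `open(path, errors="replace")` and pass the file object to `max_repeat`. A count that keeps growing between two runs while the build itself has already finished would be consistent with the behaviour described above.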
- depends on: JENKINS-38381 [JEP-210] Optimize log handling in Pipeline and Durable Task (Resolved)
- duplicates: JENKINS-37575 Delays in FileMonitoringTask.WriteLog can cause process output to be resent indefinitely (Closed)
We do use Pipeline. Another variable that might be in play is that we were using an EFS volume for the Jenkins home directory. We've since migrated to EBS. We were having fairly typical NFS-type problems, with the master hanging at a very high load average while using almost no CPU and a lot of network bandwidth.
Since we reduced the log verbosity we haven't had the problem, even before we switched off EFS. I didn't see anything in the system logs when it happened. The thread dump didn't match the one in JENKINS-25218. It really looked like a livelock: the threads weren't stuck outright. I'll try to reproduce it and take some thread dumps.