Found a build that had been “running” for days on a cloud slave long since taken offline. The master thread dump said:
"Executor #0 for … : executing …" Id=56457 Group=main TIMED_WAITING on [B@5a2312b
    at java.lang.Object.wait(Native Method)
    - waiting on [B@5a2312b
    at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173)
    at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at hudson.util.HeadBufferingStream.fillSide(HeadBufferingStream.java:83)
    at hudson.FilePath$TarCompression$2.extract(FilePath.java:619)
    at hudson.FilePath.copyRecursiveTo(FilePath.java:1775)
    at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:116)
    …
Could not be killed: pressing the button to terminate the build had no apparent effect.
Potentially an earlier attempt to interrupt had made FastPipedInputStream.read rethrow an InterruptedException as an IOException. This would have caused TarCompression.GZIP.extract to go into a diagnostic catch clause and run fillSide, which could again hang. But interrupting that should have produced a new IOException thrown up the chain (I cannot find any code above this that catches IOException and loops). So why could it not be killed?
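The expected behaviour described above can be sketched in isolation. This is not the Jenkins code: StuckPipe, extract, run and awaitReads are hypothetical stand-ins, and the sketch assumes only what the paragraph states, namely that a blocked read converts InterruptedException into IOException and that the caller's catch clause performs a second, diagnostic read. Under those assumptions a second interrupt does escape, which is why the real hang is puzzling.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class InterruptSketch {
    /** Hypothetical stand-in for a pipe whose writer has gone away: read() never gets data. */
    static final class StuckPipe {
        final AtomicInteger reads = new AtomicInteger();

        int read() throws IOException {
            reads.incrementAndGet();
            synchronized (this) {
                try {
                    while (true) {
                        wait(1000); // timed wait in a loop; nothing ever notifies
                    }
                } catch (InterruptedException e) {
                    // The interrupt is rethrown as an IOException (assumed behaviour).
                    throw new IOException("interrupted while reading", e);
                }
            }
        }
    }

    /** Hypothetical analogue of an extract step with a diagnostic catch clause. */
    static void extract(StuckPipe in, List<String> log) throws IOException {
        try {
            in.read();
        } catch (IOException e) {
            log.add("first read failed: " + e.getMessage());
            in.read(); // diagnostic re-read; this blocks again on a dead pipe
        }
    }

    /** Runs extract on a worker thread and interrupts it twice. */
    static List<String> run() throws InterruptedException {
        StuckPipe pipe = new StuckPipe();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread executor = new Thread(() -> {
            try {
                extract(pipe, log);
            } catch (IOException e) {
                log.add("second read failed: " + e.getMessage());
            }
        });
        executor.start();
        awaitReads(pipe, 1);
        executor.interrupt(); // first abort attempt: lands in the catch clause
        awaitReads(pipe, 2);
        executor.interrupt(); // second abort attempt: propagates out of extract
        executor.join();
        return log;
    }

    /** Polls until the pipe has seen at least n read() calls. */
    static void awaitReads(StuckPipe pipe, int n) throws InterruptedException {
        while (pipe.reads.get() < n) {
            Thread.sleep(1);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        run().forEach(System.out::println);
    }
}
```

In this sketch the worker thread terminates after the second interrupt, so whatever kept the real executor alive must lie outside this simple model.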
Fixed only by restarting Jenkins. After the restart the build record is missing, since build.xml is only created if the executor thread terminates normally (even in an ABORTED build).
Is related to:
- JENKINS-30713 FAILED status reported for interrupted build (Open)
- JENKINS-25698 Archiving artifacts (or FilePath operations) should properly time out and retry the copy (Open)