The current design of SimpleXStreamFlowNodeStorage and LogActionImpl, using workflow/$id.xml and $id.log, was considered the minimum necessary for a working 1.0 release, not a serious implementation. It has two major problems:
- When there are a lot of steps, as in JENKINS-30055, many small files are created, which is bad for I/O performance.
- When there is a large amount of output, WorkflowRun.copyLogs must duplicate it all to log, doubling disk space requirements per build.
It would be better to keep all flow node information in one file. (Perhaps build.xml itself. In principle we could avoid loading non-head nodes with a historical build record, though I believe CpsFlowExecution currently winds up loading them all anyway. Need to check.)
More importantly, there should be a single log file for the build. LogActionImpl should be deprecated in favor of an implementation that simply stores a rangeset of offsets into that file. When parallel blocks are producing concurrent output, the single log file will be a bit jumbled (probably still human-readable in most cases), but the rangesets will keep track of what output came from where. The final output produced by WorkflowRun will still be processed to split at line boundaries, add in thread labels, etc. (TBD how and whether JENKINS-30777 could be supported in this mode.)
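
To make the rangeset idea concrete, here is a minimal sketch in Java. It is only an illustration, not a proposed implementation: the class SingleLogSketch, its Range type, and all method names are hypothetical, and an in-memory ByteArrayOutputStream stands in for the single on-disk log file. Each append records the half-open byte range it occupied, and contiguous ranges from the same node are coalesced so that a long-running step does not accumulate one range per write.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one shared build log plus per-node offset ranges. */
final class SingleLogSketch {

    /** Half-open byte range [start, end) into the shared log. */
    static final class Range {
        final long start;
        final long end;
        Range(long start, long end) { this.start = start; this.end = end; }
    }

    // Assumption: stands in for the single log file of the build.
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();
    // Flow node ID -> ordered list of ranges that node wrote.
    private final Map<String, List<Range>> rangesByNode = new HashMap<>();

    /** Appends output from one flow node and records where it landed. */
    synchronized void append(String nodeId, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        long start = log.size();
        log.write(bytes, 0, bytes.length);
        long end = log.size();
        List<Range> ranges = rangesByNode.computeIfAbsent(nodeId, k -> new ArrayList<>());
        int last = ranges.size() - 1;
        if (last >= 0 && ranges.get(last).end == start) {
            // Same node wrote again with nothing in between: extend the range.
            ranges.set(last, new Range(ranges.get(last).start, end));
        } else {
            ranges.add(new Range(start, end));
        }
    }

    /** Reassembles the output of a single node from its recorded ranges. */
    synchronized String logFor(String nodeId) {
        byte[] all = log.toByteArray();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Range r : rangesByNode.getOrDefault(nodeId, Collections.emptyList())) {
            out.write(all, (int) r.start, (int) (r.end - r.start));
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Number of ranges currently stored for a node. */
    synchronized int rangeCount(String nodeId) {
        return rangesByNode.getOrDefault(nodeId, Collections.emptyList()).size();
    }

    /** The raw, possibly interleaved, log of the whole build. */
    synchronized String fullLog() {
        return new String(log.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With two parallel branches writing as nodes "3" and "4", fullLog() returns the interleaved text as written, while logFor("3") returns only that branch's output in order. A real implementation would presumably read ranges from the file lazily rather than copying the whole log, and would need to persist the rangesets alongside the flow node data.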