Status: Resolved
Since the output stream of the log is wrapped by a remote proxy in StreamBuildListener, every log message on the slave gets transferred to the master. That's a lot of throughput, and it requires a lot of storage for verbose builds. This slows down our builds. There are workarounds, but they don't address the core scalability issue.
I propose that, just as artifacts were abstracted (see VirtualFile, ArtifactManager), task/build listeners be made extensible, so that a build can use something other than a StreamBuildListener. Further, the log should no longer be file-based; the listener provides the input and output streams. This should be easy to serialize from the "remote logging service" perspective: just provide the service configuration, and the slave can make an API call to publish to the log, while any authorized user of the log can subscribe. The master can either subscribe directly and pipe the log through the UI, or another service that is authorized to subscribe to the logging service can provide the UI externally.
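To make the proposal above more concrete, here is a minimal sketch of what a pluggable log abstraction could look like. It uses only the JDK; `BuildLogStore`, `InMemoryLogStore` and `LogStoreDemo` are hypothetical names invented for illustration and are not part of the Jenkins API. The in-memory store merely stands in for a remote logging service.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical extension point (an assumption, not an existing Jenkins type):
// the log is defined by the streams it hands out, not by a file on the master.
interface BuildLogStore {
    OutputStream openForWrite() throws IOException; // publisher side (e.g. the slave)
    InputStream openForRead() throws IOException;   // subscriber side (e.g. the master UI)
}

// In-memory stand-in for a remote logging service.
final class InMemoryLogStore implements BuildLogStore {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    @Override
    public OutputStream openForWrite() {
        return buffer;
    }

    @Override
    public InputStream openForRead() {
        // Snapshot of everything published so far.
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}

final class LogStoreDemo {
    // Publishes one line through the store, then reads the whole log back.
    static String roundTrip(BuildLogStore store, String line) {
        try {
            try (OutputStream out = store.openForWrite()) {
                out.write(line.getBytes(StandardCharsets.UTF_8));
            }
            try (InputStream in = store.openForRead()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `LogStoreDemo.roundTrip(new InMemoryLogStore(), "some line\n")` returns the published text. A real implementation would replace the byte buffer with calls to whatever logging service is configured.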
One of the tougher things about this change is that it will not be compatible with any BuildWrapper and ConsoleLogFilter plugins. Those plugins decorate the OutputStream on master referred to by the slave's proxy stream. Since the plugins are on the master, it forces the slave to send all log lines to the master for decoration. This pattern tightly couples the logging concept with a master-based file. In order to maintain compatibility for those users for whom this is not an issue, I recommend a "high availability mode" feature flag. This flag will enable certain features that improve availability and scalability, and disable those extension points (BuildWrapper, ConsoleLogFilter) which are incompatible with this new flow.
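The decoration pattern described above can be illustrated with a small JDK-only sketch. `MaskingOutputStream` and `MaskingDemo` are hypothetical names, written in the spirit of a log filter that wraps an `OutputStream`; they are not taken from any plugin. The point is that the bytes must pass through the process where the wrapper is constructed, so a wrapper that only exists on the master forces every log line to travel there.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical decorator: buffers one line at a time and masks a secret in it.
final class MaskingOutputStream extends FilterOutputStream {
    private final String secret;
    private final ByteArrayOutputStream line = new ByteArrayOutputStream();

    MaskingOutputStream(OutputStream delegate, String secret) {
        super(delegate);
        if (secret.isEmpty()) {
            throw new IllegalArgumentException("secret must not be empty");
        }
        this.secret = secret;
    }

    @Override
    public void write(int b) throws IOException {
        // FilterOutputStream routes the array-based write methods through here.
        line.write(b);
        if (b == '\n') {
            flushLine();
        }
    }

    private void flushLine() throws IOException {
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        line.reset();
        out.write(text.replace(secret, "****").getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void close() throws IOException {
        if (line.size() > 0) {
            flushLine(); // emit a trailing line that has no newline
        }
        super.close();
    }
}

final class MaskingDemo {
    // Runs the input through the decorator and returns what reached the sink.
    static String mask(String input, String secret) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new MaskingOutputStream(sink, secret)) {
            out.write(input.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Nothing in this decorator depends on the master itself. If such a wrapper could be instantiated on the slave, the same masking could happen before the bytes leave the build machine.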
JENKINS-38313 External Build Log storage for Jenkins
- In Progress
This is not a permanent change that affects everybody. I wanted to use a feature flag so that people can turn it on or off as needed. Log annotators aren't a hard requirement for everyone, and if I want zero log annotators, I should have that choice. I want this to be an optional feature. Anybody who doesn't opt in will see Jenkins run the same as always. Folks can keep their absolutely necessary, unsparkly Mask Passwords Plugin.
I don't think the proposed implementation makes sense in any official Jenkins release, as turning off a significant part of its functionality behind a hidden feature switch isn't a great approach. Designing it to be temporary makes it even worse…
If you want to implement it in your fork, feel free – we may even be able to reuse parts of it.
A better approach would be a new interface/super type that build wrappers implement to indicate that they support running on the slave, with the log routed through the master whenever any build wrapper (or whoever else may be involved) does not support this. With added logging of which wrappers prevent direct log transfer to your storage, you could easily remove the culprits from your instance and get the same result.
This should be both backward compatible and future proof.
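The capability check suggested above could be sketched roughly as follows. All names here (`LogDecorator`, `RemoteCapable`, `RoutePlan`, `LogRoutePlanner`, and the two sample decorators) are hypothetical and exist only for illustration; none of them is an existing Jenkins type. The planner chooses direct transfer only when every decorator opts in, and otherwise reports which ones block it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for anything that decorates the build log.
interface LogDecorator {
    String name();
}

// Hypothetical marker: the decorator declares it can run on the slave.
interface RemoteCapable {
}

// A legacy decorator that only works on the master.
final class MasterOnlyDecorator implements LogDecorator {
    private final String name;
    MasterOnlyDecorator(String name) { this.name = name; }
    @Override public String name() { return name; }
}

// A decorator that opts in to running on the slave.
final class RemoteSafeDecorator implements LogDecorator, RemoteCapable {
    private final String name;
    RemoteSafeDecorator(String name) { this.name = name; }
    @Override public String name() { return name; }
}

// Result of the check: direct transfer, or the list of decorators preventing it.
final class RoutePlan {
    final boolean direct;
    final List<String> blockers;

    RoutePlan(boolean direct, List<String> blockers) {
        this.direct = direct;
        this.blockers = Collections.unmodifiableList(blockers);
    }
}

final class LogRoutePlanner {
    // Direct transfer only if every decorator is RemoteCapable;
    // otherwise fall back to routing through the master and name the culprits.
    static RoutePlan plan(LogDecorator... decorators) {
        List<String> blockers = new ArrayList<>();
        for (LogDecorator d : decorators) {
            if (!(d instanceof RemoteCapable)) {
                blockers.add(d.name());
            }
        }
        return new RoutePlan(blockers.isEmpty(), blockers);
    }
}
```

With a plan like this, an administrator could log `blockers` at build start and see exactly which wrappers keep the log flowing through the master, while instances that install only opted-in decorators would get direct transfer without any feature flag.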
If this was discussed in more depth in the tech track of the summit yesterday, jglick, please share the details.
Not in scope for 2.0. Some Jenkins devs including myself have already been discussing this and related scalability overhauls that could happen over the next couple of years, and will be discussing this in depth in a couple of weeks.
The specific start of a proposal here will probably not fly, since there are some log annotators which are hard requirements rather than “sparkle” (such as password masking), but we will look for minimally incompatible ways forward.