I would also propose a "multi-master" <-> "multi-agent" usage based on tags.
Here is my use case:
- our infrastructure runs many Jenkins masters (one per project), to ease administration, permission management, customization, provisioning, decommissioning...
- but we cannot afford to provision/reserve one agent per project, typically when a mac-mini runs xcode/appium on macOS. A virtual machine or docker may be an option (for windows, but not really available for macOS), BUT it may lead to resource conflicts if many masters trigger jobs at the same time without being aware of the other agents running concurrently on the same system/resources.
With "multi-master" <-> "multi-agent" support over Kafka remoting, we could consolidate resources without trouble by using a two-phase reservation protocol.
Here is my proposal:
- one master sends an "availability" request to all agents registered in Kafka, asking for capabilities based on "tags" (windows, dotnet47, macos, xcode, appium...)
- concerned agents (available and capable) answer the master and consider themselves "no longer available" for a short "timeout period"
- the master sends a "reservation" request (broadcast) naming the selected agent to all masters and all "concerned" agents; the concerned agents that were not selected switch back to "available" without waiting for the end of the timeout period
- the master can now trigger the job on the reserved agent
- when the job is over, the master sends a "free" request to the agent and notifies all masters, so that they may retry their pending requests
- if a master receives no answer to its "availability" request, it retries periodically or immediately after being notified of a "free" request
- an agent sets itself back to free/available after an "idle timeout" if the requesting master stops submitting jobs without sending a "free" request, or after the "timeout period" if it answered an availability request and no reservation followed
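To make the steps above more concrete, here is a rough sketch of the agent side as a small state machine. It is only an illustration of the idea, not an implementation for the plugin: the class and method names, the intermediate "held" state and the timeout values are all my own assumptions, and the Kafka transport is left out entirely (timestamps are passed in by hand).

```python
import enum
from dataclasses import dataclass
from typing import Optional


class State(enum.Enum):
    AVAILABLE = "available"
    HELD = "held"          # answered an availability request, waiting for a reservation
    RESERVED = "reserved"  # reserved by exactly one master


@dataclass
class Agent:
    # All names and default values here are hypothetical.
    name: str
    tags: frozenset
    hold_timeout: float = 5.0    # the short "timeout period"
    idle_timeout: float = 60.0   # the "idle timeout"
    state: State = State.AVAILABLE
    owner: Optional[str] = None
    deadline: float = 0.0

    def expire(self, now):
        """Fall back to AVAILABLE once the hold or idle deadline has passed."""
        if self.state is not State.AVAILABLE and now >= self.deadline:
            self.state, self.owner = State.AVAILABLE, None

    def on_availability(self, master, wanted, now):
        """Phase 1: answer only if available and capable, then hold briefly."""
        self.expire(now)
        if self.state is not State.AVAILABLE or not set(wanted) <= self.tags:
            return False
        self.state, self.owner = State.HELD, master
        self.deadline = now + self.hold_timeout
        return True

    def on_reservation(self, master, selected, now):
        """Phase 2: the selected agent is reserved, other held agents are released."""
        self.expire(now)
        if self.state is not State.HELD or self.owner != master:
            return
        if selected == self.name:
            self.state = State.RESERVED
            self.deadline = now + self.idle_timeout
        else:
            self.state, self.owner = State.AVAILABLE, None

    def on_job(self, master, now):
        """Accept a job from the owning master and refresh the idle deadline."""
        self.expire(now)
        if self.state is State.RESERVED and self.owner == master:
            self.deadline = now + self.idle_timeout
            return True
        return False

    def on_free(self, master, now):
        """Explicit release by the owning master."""
        self.expire(now)
        if self.state is State.RESERVED and self.owner == master:
            self.state, self.owner = State.AVAILABLE, None


if __name__ == "__main__":
    mac = Agent("mac-mini-1", frozenset({"macos", "xcode", "appium"}))
    print(mac.on_availability("master-a", {"macos", "xcode"}, now=0.0))
    print(mac.on_availability("master-b", {"macos"}, now=1.0))
    mac.on_reservation("master-a", "mac-mini-1", now=2.0)
    print(mac.state)
    mac.on_free("master-a", now=30.0)
    print(mac.state)
```

In this sketch a second master asking while the agent is held or reserved simply gets no answer, which matches the "retry periodically or after a free notification" rule on the master side.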
Such a design should allow masters and agents to be interconnected in a safe way. For security, messages would contain only agent names and tags/capabilities.
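As an illustration of how small such messages could be, here is a possible payload sketch. The JSON encoding, the field names and the extra "kind" field (needed to tell the three request types apart) are my assumptions, not part of any existing format; the resulting bytes would simply be the value of a Kafka record.

```python
import json

# Hypothetical message kinds and field names, for illustration only.
ALLOWED_KINDS = {"availability", "reservation", "free"}
ALLOWED_FIELDS = {"kind", "agent", "tags"}


def encode(kind, agent=None, tags=()):
    """Build a payload carrying only a kind, an agent name and tags."""
    if kind not in ALLOWED_KINDS:
        raise ValueError("unknown message kind: %r" % (kind,))
    body = {"kind": kind, "agent": agent, "tags": sorted(tags)}
    return json.dumps(body).encode("utf-8")


def decode(raw):
    """Parse a payload and reject anything beyond the allowed fields."""
    msg = json.loads(raw.decode("utf-8"))
    if not isinstance(msg, dict) or set(msg) != ALLOWED_FIELDS:
        raise ValueError("unexpected message fields")
    if msg["kind"] not in ALLOWED_KINDS:
        raise ValueError("unknown message kind")
    if msg["agent"] is not None and not isinstance(msg["agent"], str):
        raise ValueError("agent must be a name or null")
    tags = msg["tags"]
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")
    return msg
```

Rejecting any unknown field on the receiving side keeps job content, credentials or scripts out of the shared topic, even if a misconfigured master tries to send them.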
What do you think about this design proposal?