So, recently at my organization we've started to use the Kubernetes plugin for Jenkins. One of the issues we've come across is persisting data between builds inside of pods.
For example, if build #1 of branch "my-branch" in some multibranch pipeline/job "java-service-pipeline" checks out the repository "java-service", that checkout does not persist to build #2. So every single build, on every branch, of every pipeline repeats work like that which could be cached. This is especially problematic for us because we use a monorepo, so a single commit may trigger 12 multibranch pipeline checkouts.
The initial solution I used: a workspaceVolume of type persistentVolumeClaimWorkspaceVolume, with claimName set to the same value as the one used by the master Jenkins instance ("jenkins" by default), as described here: https://access.redhat.com/solutions/3461461. With this, we were able to share/persist workspaces between the master and the agents spun up inside of pods.
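For reference, that setup looks roughly like this in declarative pipeline syntax (a sketch; the claim name "jenkins" and the stage contents are just the illustrative values from above):

```groovy
pipeline {
  agent {
    kubernetes {
      // Mount the master's PVC as the agent pod's workspace so checkouts
      // persist between builds (per the Red Hat solution linked above).
      // 'jenkins' is the claim used by the master instance by default.
      workspaceVolume persistentVolumeClaimWorkspaceVolume(claimName: 'jenkins', readOnly: false)
    }
  }
  stages {
    stage('Checkout') {
      steps {
        // Re-uses the existing checkout when the workspace persisted.
        checkout scm
      }
    }
  }
}
```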
However, this solution doesn't scale. Since we handle so many concurrent builds (possibly 50-100+, with more to come), we had to add new node groups to our cluster, to which pods are randomly assigned; the key point is that none of these node groups is the one the master instance runs on. This is a problem because a PVC backed by block storage (access mode ReadWriteOnce) cannot be mounted in read/write mode by pods spread across multiple nodes (https://stackoverflow.com/questions/46887118/volume-claim-on-gke-multi-attach-error-for-volume-volume-is-already-exclusivel).
I'm proposing that a new workspaceVolume type be added and supported, wherein the underlying PVC lives as long as the corresponding job/branch workspace lives. For example, using the same repository "java-service", branch "my-branch", and pipeline/job "java-service-pipeline" names from above, examining the workspace this creates on the Jenkins master would show something like a directory named java-service-pipeline_my-branch under the master's workspace root.
If we were to use the name of that workspace to create a PVC for each job/branch combination, and hook into the deletion of those jobs/branches (for when branches are turned into PRs, or when PRs are merged), we could clear out the PVCs associated with them.
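To sketch what the proposal might look like from the pipeline side: the type name jobBranchPersistentVolumeClaimWorkspaceVolume and its parameters below are entirely made up for illustration; the idea is just that the plugin derives the PVC name from the job and branch rather than taking a fixed claimName.

```groovy
pipeline {
  agent {
    kubernetes {
      // Hypothetical workspaceVolume type: the plugin would derive a PVC
      // name from the job/branch (e.g. "java-service-pipeline-my-branch"),
      // create it on first use, and delete it when the job/branch is deleted.
      workspaceVolume jobBranchPersistentVolumeClaimWorkspaceVolume(
        storageClassName: 'gp2',  // illustrative
        requestsSize: '10Gi',     // illustrative
        accessModes: 'ReadWriteOnce'
      )
    }
  }
  stages {
    stage('Checkout') {
      steps {
        checkout scm
      }
    }
  }
}
```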
The one caveat that I see with this: if two builds for a specific job/branch were to fire close enough together, and your node groups were to fill up fast enough, a pod for that job/branch could still end up stuck waiting for another to finish before the workspace PVC can be mounted to it. But that seems more like a taint/toleration responsibility that falls on the user.
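For completeness, one way a user could take on that responsibility today: pin a pipeline's pods to a dedicated node group via a nodeSelector and matching toleration in the pod spec, so pods contending for the same ReadWriteOnce PVC at least land on nodes where it can attach. The label/taint values here are illustrative, not anything the plugin provides:

```groovy
pipeline {
  agent {
    kubernetes {
      // Keep pods for this job on one node group so a ReadWriteOnce
      // workspace PVC can attach; label and taint names are made up.
      yaml '''
        apiVersion: v1
        kind: Pod
        spec:
          nodeSelector:
            workload: jenkins-java-service
          tolerations:
            - key: dedicated
              operator: Equal
              value: jenkins-java-service
              effect: NoSchedule
      '''
    }
  }
  stages {
    stage('Checkout') {
      steps {
        checkout scm
      }
    }
  }
}
```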