We are experiencing extremely degraded performance on build slaves when multiple build jobs run at the same time using the Kubernetes plugin. The degradation seems to be caused by high I/O on the disk or volume mounted into the slave pods when more than one slave pod runs on the same host. There does not seem to be a way to specify that each slave pod must run on a different host node, so we typically end up with multiple slave pods on the same node. Since each pod mounts the same device (either a volume on the host or a persistent volume), the high I/O on the backend storage causes extreme slowdowns: jobs that usually take minutes take hours. It would be nice if persistent volume claims could be configured so that each slave pod dynamically provisions its own persistent volume, which is what typically happens in Kubernetes itself. Otherwise, a mechanism to give each pod its own storage, or to configure affinity / anti-affinity for slave pods so that only a single slave runs per node, would help us past this issue.
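For illustration, the two options requested above might look roughly like the pod spec fragment below, built here as a Python dictionary and printed as JSON. This is only a sketch: the `role: jenkins-agent` label, the `workspace` volume name, and the `10Gi` size are assumptions made up for the example, and whether the plugin can pass these fields through to the pods it creates is exactly what is in question. The field names themselves (`podAntiAffinity`, `topologyKey`, and the generic ephemeral volume's `volumeClaimTemplate`) are standard Kubernetes pod spec fields.

```python
import json


def agent_pod_spec_fragment(label_key="role", label_value="jenkins-agent", storage="10Gi"):
    """Build a hypothetical pod spec fragment for one build slave pod.

    The label, volume name, and storage size are illustrative assumptions,
    not values taken from the plugin.
    """
    return {
        # Anti-affinity: refuse to schedule onto a node that already runs
        # a pod carrying the same label, so at most one slave runs per node.
        "affinity": {
            "podAntiAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "labelSelector": {"matchLabels": {label_key: label_value}},
                        "topologyKey": "kubernetes.io/hostname",
                    }
                ]
            }
        },
        # Per-pod storage: a generic ephemeral volume makes Kubernetes create
        # a dedicated persistent volume claim for each pod instead of sharing one.
        "volumes": [
            {
                "name": "workspace",
                "ephemeral": {
                    "volumeClaimTemplate": {
                        "spec": {
                            "accessModes": ["ReadWriteOnce"],
                            "resources": {"requests": {"storage": storage}},
                        }
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(agent_pod_spec_fragment(), indent=2))
```

Either half alone would address the problem: anti-affinity spreads the pods across nodes, while a per-pod claim keeps pods on the same node from sharing one backing volume.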
In addition, for completeness: I have tried configuring several unique pod templates, each mapped to its own storage via its own persistent volume claim (i.e. worker1, worker2, worker3 mapped to worker1-volume-claim, worker2-volume-claim, worker3-volume-claim), but this configuration does not seem to work as expected. The worker2 and worker3 pod templates never seem to be used to generate slave pods, even though I have configured each pod template to generate only one slave node. When more than one job is run, the additional job will either queue and wait for worker1 to finish its build, or will generate a second slave from the worker1 template. In the latter case we hit the same performance issue as before, because both pods mount the same storage.
This issue is really impacting our ability to run multiple build jobs at the same time.