JENKINS-29342: To use Docker Workflow nicely on Kubernetes, we should turn a Docker Workflow into a Pod/RC

      When using Kubernetes and the Jenkins Docker Workflow plugin, it's possible to use Docker-in-Docker (DinD) in a slave, or to try to share the local Docker daemon.

      Ideally, though, when using Jenkins and Kubernetes together (e.g. with Atomic / OpenShift / OpenStack / Google GKE / vanilla Kubernetes), we'd let Kubernetes take care of provisioning all the Docker containers: pulling images and restarting any failed pods if the machine running a Jenkins workflow has issues (or the pod dies).

      To do that nicely on Kubernetes, we'd need to turn each Docker Workflow script into a Pod, with one Docker container to run the main Groovy workflow process; then for each

      docker.image("foo") {}

      block we'd add a container to the pod.

      e.g. this workflow

      docker.image("maven") {
         // some stuff
      }
      docker.image("nodejs") {
         // some stuff
      }
      

      would be turned into a Pod with these containers:

      • workflow
      • maven
      • nodejs

      The workflow container could then talk directly to the other Docker containers over localhost, since the Pod would know all the ports of each container, and it'd be easy to share the build volume between the containers.

      One thing to be careful of is that right now in Kubernetes a Pod definition is static; so rather than imperatively iterating through the Groovy DSL for the workflow, we'd need a kind of 'compile' stage where we evaluate all the `docker.image` blocks. Once we know them, we can generate a Pod with those Docker images baked in, then start it. Once the pod starts, all the containers would be provisioned together on the same host (and atomically destroyed at the end of the build).
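      For illustration, here's the pod such a 'compile' stage might generate for the maven/nodejs example above, sketched as a Groovy map that mirrors the Kubernetes v1 Pod JSON (the pod name, the workflow-runner image and the mount path are made up for the example):

      // illustrative sketch only; the pod name and workflow-runner image are hypothetical
      def sharedMount = [name: 'workspace', mountPath: '/workspace']
      def pod = [
          apiVersion: 'v1',
          kind      : 'Pod',
          metadata  : [name: 'docker-workflow-build-1'],
          spec      : [
              // an emptyDir volume lives as long as the pod and is shared by all containers
              volumes   : [[name: 'workspace', emptyDir: [:]]],
              containers: [
                  [name: 'workflow', image: 'jenkins-workflow-runner', volumeMounts: [sharedMount]],
                  [name: 'maven',    image: 'maven',                   volumeMounts: [sharedMount]],
                  [name: 'nodejs',   image: 'nodejs',                  volumeMounts: [sharedMount]],
              ],
          ],
      ]
      // serialise for the Kubernetes REST API, e.g.:
      println groovy.json.JsonOutput.prettyPrint(groovy.json.JsonOutput.toJson(pod))

      Because all three containers are in one pod they land on the same host, share the emptyDir workspace, and get localhost networking between each other for free.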


          Jesse Glick added a comment -

          I do not fully follow what you are requesting, but it sounds like something that would require a new DSL feature, if not a new plugin; ndeloof had a withKubernetes step somewhere, but I do not know where.


          James Strachan added a comment -

          jglick: the general issue is the combination of one Docker Workflow script starting multiple Docker containers, with those containers expecting to share a volume between them (so they can share the same workflow state).

          In Kubernetes you can do that by using a persistent (distributed) shared disk, running each Docker container as a separate Pod, and mounting this shared disk in each. The downside is that it's rather complex to do in Kubernetes and highly cloud-specific (whether using GKE's shared volumes or Gluster / Ceph et al), and I'm sure there could be all kinds of icky concurrency/distributed issues with disks not quite syncing correctly or things happening out of order. Since the workflow is parallel, any container could be reading/writing in any order.

          Another approach is to run one Pod containing its own local Docker daemon and run all the Docker containers inside that single pod via that daemon (so it's Docker-in-Docker, or DinD). The main issue there is that it feels a bit like a hack, and none of the Kubernetes tooling can see these child Docker containers; so none of the tools for watching logs or doing 'docker exec', or even the docker CLI, would work easily.

          The other approach in Kubernetes is to put all the containers you're going to need together in a single pod; all those Docker containers get placed on the same host and can trivially share local disk volumes with each other. Turning a Jenkins Docker Workflow script into a Pod (one container to run the workflow plus another container for each docker.image() statement) would mean the entire workflow runs on a single host and can use shared disk easily (which also gets persisted if the pod dies and gets restarted elsewhere). It would also mean all the Kubernetes tooling could look inside the workflow, directly into any container: you could watch/grep the logs of any of the containers, run bash inside them, and see what they're up to.

          This last approach feels like the ideal way of combining Jenkins Docker Workflow with Kubernetes: it provides a simple way to do slaves (just run a k8s pod) and gives developers great tooling for looking inside builds that are not behaving as expected (just use Kubernetes directly!).

          The only downside is that we'd need a way for the master, when it's about to start a Jenkins Workflow instance, to analyse the workflow DSL and figure out all the Docker images that are going to be required; then a Pod can be built (which is basically a list of containers: their images, any CLI arguments, env vars, etc.) and the workflow can run as before. So a little bit of ninja Jenkins Workflow DSL work would be required to figure out the required Docker images before really running the workflow; but if we could do that, we'd have an amazing way to combine Jenkins Docker Workflows with Kubernetes Pods.

          One day (could be years off, mind) there might be a way of dynamically adding containers to a running Pod instance in Kubernetes, which would make this really trivial to do in Jenkins Docker Workflow: we'd just start a Jenkins Workflow instance off in a Pod and add new containers to it as required. But that doesn't look like happening any time soon, so for now we'd have to add all the containers into the pod up front.

          Until then, the Docker-in-Docker hack seems the tactical solution. I'm wondering how hard it'd be to hack up something that can work out, without invoking any of the steps in the DSL other than docker.image(), what all the Docker images in a Jenkins workflow DSL are? (I should know more about this than I do; I created Groovy over a decade ago, but my Groovy DSL ninja skills aren't what they used to be, I'm afraid.)
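          For the static subset (literal image names, as in the example above), a naive 'collect-only' pass is easy to sketch in plain Groovy: bind docker to a recorder that notes each image name and never invokes the attached block. This is just a sketch (the ImageCollector class is made up, not part of any plugin), and it misses any image name computed dynamically, e.g. docker.image("maven:" + version), which is presumably why the general case is so hard:

          // plain Groovy; run the script once in a "collect-only" mode
          class ImageCollector {
              List<String> images = []
              // matches the docker.image("foo") { ... } shape; the body is never invoked
              def image(String name, Closure body = null) {
                  images << name
                  return this
              }
              // tolerate the docker.image("foo").inside { ... } shape too
              def inside(Closure body = null) { return this }
          }

          def collector = new ImageCollector()
          def binding = new Binding(docker: collector)
          new GroovyShell(binding).evaluate('''
              docker.image("maven")  { /* not run in this pass */ }
              docker.image("nodejs") { /* not run in this pass */ }
          ''')
          assert collector.images == ["maven", "nodejs"]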


          James Strachan added a comment -

          BTW here's our POC of trying to use Jenkins Docker Workflow on Kubernetes with slaves:
          https://github.com/iocanel/jenkins-poc

          So far DinD was the most promising, but it had issues:
          https://github.com/iocanel/jenkins-poc/blob/master/slave-docker-in-docker/README.md


          James Strachan added a comment -

          A bit more background on this issue:
          https://github.com/fabric8io/fabric8/issues/4340


          Jesse Glick added a comment -

          1 container to run the workflow

          Does not make sense; Workflow scripts always run on the Jenkins master. Perhaps you are talking about using one container for the Jenkins slave agent corresponding to node.

          another container for each docker.image() statement

          docker.image(…).inside I suppose you mean.

          analyse the jenkins workflow DSL to figure out all the docker images that are going to be required

          Impossible. The only way this could work is if Kubernetes supports dynamic additions to a pod.


          James Strachan added a comment - - edited

          Yeah, to really use Pods properly on Kubernetes we'd need the master running the workflow to analyse the flow, figure out the pod/containers, then create the 'slave' pod and wait for it to finish (and maybe restart it if it dies). If you squint, this is almost like using "docker workflow pods" as a kind of slave in Jenkins (conceptually at least).

          It looks like Kubernetes isn't gonna support dynamic additions to a pod any time soon.

          FWIW we've just had some success using Docker Workflow with DinD inside Swarm slaves, using Jenkernetes on Kubernetes, which at least looks like it might work:
          https://github.com/iocanel/jenkins-poc/tree/master/swarm

          It's just a shame that all the Docker containers inside the Swarm slave pod are invisible from a Kubernetes tooling perspective (web UI / CLI), since they are all inside a Docker-in-Docker daemon.


          Jesse Glick added a comment -

          need the master running the workflow to analyse the flow

          Impossible.


          James Strachan added a comment -

          OK, I guess DinD it is then! Thanks for listening.


          James Strachan added a comment -

          Doesn't seem possible.


          Jesse Glick added a comment -

          It looks like Kubernetes isn't gonna support dynamic additions to a pod any time soon

          Then you cannot use Kubernetes for this purpose, unless you introduce some non-pod-based network filesystem, which is probably unwanted.

          Swarm (without Kubernetes) should work, in principle; it requires changes in Jenkins plugins (docker + docker-workflow) so that --volumes-from is specified when running containers for Image.inside, to ensure that the slave workspace is shared. Not currently planning this work, but I have given it some thought.
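          To make the mechanism concrete: docker run --volumes-from <container> mounts all volumes of an existing container into a new one. A minimal plain-Groovy sketch (outside Jenkins; the slave container id and the use of cat as a keep-alive command are illustrative assumptions) of the kind of invocation Image.inside would need:

          // hypothetical id of the container running the Jenkins slave agent
          def slaveContainer = 'jenkins-slave-abc123'
          // start the build image with the slave's volumes (workspace included) mounted;
          // 'cat' just keeps the container alive so build steps can be exec'd into it
          def cmd = ['docker', 'run', '-d', '--volumes-from', slaveContainer, 'maven', 'cat']
          println cmd.execute().text.trim()   // prints the new container's id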


          James Strachan added a comment -

          We've just got Jenkernetes working, which uses a pod per slave with Swarm:
          https://github.com/GoogleCloudPlatform/jenkernetes/

          We then use Docker-in-Docker (DinD) in each pod container so that each slave pod can create other Docker containers (all inside the same single slave pod; these containers are invisible to Kubernetes). So far it's working quite well: https://github.com/iocanel/jenkins-poc/tree/master/swarm

          After much head scratching trying to figure out how to get Docker Workflow, Kubernetes, the Jenkins Kubernetes plugin and/or the Jenkins Docker plugin working together, we might have found a nice permutation!

          Many thanks!


          Jesse Glick added a comment -

          Yeah, DinD is one approach to this kind of problem. Does not seem like the right long-term approach, but good enough for now.


          James Strachan added a comment -

          Agreed.


            Assignee: ndeloof (Nicolas De Loof)
            Reporter: jstrachan (James Strachan)
            Votes: 0
            Watchers: 2
