JENKINS-59158

Support running Job DSL scripts in parallel in pipeline with DISABLE or DELETE action


      One use case for running multiple Job DSL build steps would be to accumulate jobs defined in multiple repositories. Consider a pipeline that does:

        stage ("repo a") {
            checkout "our/repo-a"
            jobDsl "jobs/**/*.groovy"
        }
        stage ("repo b") {
            checkout "our/repo-b"
            jobDsl "jobs/**/*.groovy"
        }
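
      The missing piece is the unreferenced-job action: the request is for the above to also work when each step sets it, i.e. something like this (removedJobAction is the step's existing parameter):

        jobDsl targets: "jobs/**/*.groovy",
               removedJobAction: "DISABLE" // or "DELETE"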
      

      I run multiple Job DSL build steps in order to speed up the Job DSL configuration seed job. I have about 520 active jobs on my Jenkins instance, and they are all managed in a single repository. They are, thankfully, somewhat organized by project codename. My Job DSL pipeline reads something like:

      List<String> folders = ['project-a', 'project-b', 'project-c', ...] // roughly 10-20 projects
      
      def startupTasks = [:]
      
      startupTasks["Folder Setup"] = {
        node("job_dsl") { stage("Folder Setup) {
          checkout scm
          sh "generate_some_files_read_by_job_dsl_code"
          jobDsl targets: "jobs/folders.groovy"
        } }
      }
      
      startupTasks["Self-Check"] = {
        // Checks that all .groovy files in jobs/** are consumed by exactly one jobDsl build step
      }
      
      parallel startupTasks
      
      def allProjectJobs = [:]
      for (int i = 0; i < folders.size(); i++) { // size(), not .length: folders is a List
          String projname = folders[i] // declared per iteration so each closure captures its own project
          allProjectJobs["Project $projname"] = {
              node("job_dsl") { stage("Project $projname") {
                  checkout scm
                  // Other versions of this have used stash&unstash, see commentary below
                  sh "generate_some_files_read_by_job_dsl_code"
                  // Classes in src/ are used to implement templates that set up each project similarly.
                  jobDsl targets: "jobs/$projname/*.groovy", additionalClasspath: "src/"
              } }
          }
      }
      parallel allProjectJobs
      

      I run the Job DSL steps in parallel because executing this much Job DSL code to generate this many jobs is slow.
      I have 81 files in my jobs/ folder in total; some representative runtimes I see for each group of files, measured in the Job DSL build step only, are:

      files in project   total filesize (bytes)   runtime
      16                 54863                    2m 55s
      14                 28715                    1m 55s
      10                 21716                    2m 5s
       9                 13768                    1m 55s
       6                 12051                    1m 45s
       5                  7262                    1m 15s
       4                 20754                    1m 0s
       4                  6394                    1m 20s
       3                  2425                    1m 15s
       3                  1592                    1m 0s
       3                  1569                    1m 0s
       2                  1871                    1m 0s
       1                   607                    0m 45s
       1                   275                    0m 45s
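
      (Numbers like these can also be collected in-pipeline. A minimal sketch in scripted pipeline; the timedJobDsl helper is my own, not a plugin step:)

      // Hypothetical wrapper that times a single jobDsl invocation.
      def timedJobDsl(String label, Map args) {
          long start = System.currentTimeMillis()
          jobDsl(args)
          echo "${label}: jobDsl took ${(System.currentTimeMillis() - start) / 1000}s"
      }

      timedJobDsl("project-a", [targets: "jobs/project-a/*.groovy", additionalClasspath: "src/"])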

      Running these in parallel instead of in series, or all in one build step, gives us better visibility and makes Job DSL issues easier to debug (an exception can surface in each branch, and all of them can be fixed in one change), and it reduces the execution time from ~20 minutes to ~5 minutes. I have repeatedly told my team that we need to set up a unit test environment for Job DSL, but as we are not Java developers, this is a bit of a task. It will probably fall to me: I am pretty much the only developer on my team who has used Java (thanks to a college class), but I have little familiarity with modern Java tooling, and since we are a hardware R&D lab working on ASIC design, the same is true of my colleagues.
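
      (For what it's worth, a minimal smoke test seems possible with the classes that job-dsl-core itself ships; this is a sketch assuming the org.jenkins-ci.plugins:job-dsl-core artifact is on the test classpath and the file paths match my layout:)

      import javaposse.jobdsl.dsl.DslScriptLoader
      import javaposse.jobdsl.dsl.MemoryJobManagement

      def jm = new MemoryJobManagement()
      // Stub out files that scripts load with readFileFromWorkspace (path is made up):
      jm.availableFiles['generated/config.txt'] = 'fake contents'
      new File('jobs').eachFileRecurse { f ->
          if (f.name.endsWith('.groovy')) {
              new DslScriptLoader(jm).runScript(f.text) // throws on DSL errors
          }
      }
      assert !jm.savedConfigs.isEmpty() // at least one job config was generated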

      I suspect that my Job DSL runtimes would be faster if I used a node with a ping of less than 100ms to the Jenkins master for running Job DSL, but for *various and sundry reasons*, it is the way it is. (Does the Job DSL fetch a lot of classes via the remoting channel to run? I am not sure how to debug this performance issue.)

      In any case, to run the Job DSL in parallel like this with the unreferenced job action set to 'DISABLE', I had to separate each project into its own freestyle project. The change to the pipeline was to create a downstream freestyle job for each sub-folder of jobs/ and to build that job instead of running the jobDsl step directly. As a result, I cannot use stash/unstash to share the generated files that my Job DSL templates read (e.g. with readFileFromWorkspace), and I have to pay the time cost of regenerating them in each workspace used by the pipeline.
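
      Schematically, each parallel branch went from running jobDsl directly to something like this ("dsl-seed-$projname" is a made-up name for the downstream freestyle job):

      allProjectJobs["Project $projname"] = {
          stage("Project $projname") {
              // The freestyle job checks out the repo, regenerates the input files,
              // and runs a Job DSL build step on jobs/$projname/*.groovy with the
              // unreferenced job action set to DISABLE.
              build job: "dsl-seed-$projname", wait: true
          }
      }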

      I think the workaround of splitting the Job DSL scripts across multiple separate jobs is generally acceptable, but it has the following drawbacks:

      • Moving a mis-classified job generated by a script in one group to another group is a little tricky and requires some careful sequencing.
      • Generating the files consumed by the Job DSL scripts only once requires an artifact repository or another storage location, because the pipeline stash step can't be used in this situation (see the sketch below).
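
      For the second point, one option (assuming the Copy Artifact plugin; job and path names are made up) is to publish the generated files once and copy them into each seed job. Shown here as pipeline steps; freestyle jobs have an equivalent Copy Artifact build step:

      // Upstream: generate the files once and archive them.
      sh "generate_some_files_read_by_job_dsl_code"
      archiveArtifacts artifacts: "generated/**"

      // Each downstream seed job copies them instead of regenerating:
      copyArtifacts projectName: "dsl-seed-parent", selector: lastSuccessful(), filter: "generated/**"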

            jamietanna Jamie Tanna
            daniel_c_686 Daniel Carrington
            Votes: 1
            Watchers: 4