One use case for running multiple Job DSL build steps would be to accumulate jobs defined in multiple repositories. Consider a pipeline that does:
stage ("repo a") {
checkout "our/repo-a"
jobDsl "jobs*.groovy"
}
stage ("repo b") {
checkout "our/repo-b"
jobDsl "jobs*.groovy"
}
I run multiple Job DSL build steps in order to speed up the Job DSL configuration seed job. I have about 520 active jobs on my Jenkins instance, and they are all managed in a single repository. They are, thankfully, somewhat organized by project codename. My Job DSL pipeline reads something like:
List<String> folders = ['project-a', 'project-b', 'project-c', ...]

def startupTasks = [:]
startupTasks["Folder Setup"] = {
    node("job_dsl") {
        stage("Folder Setup") {
            checkout scm
            sh "generate_some_files_read_by_job_dsl_code"
            jobDsl targets: "jobs/folders.groovy"
        }
    }
}
startupTasks["Self-Check"] = {
    // (contents elided)
}
parallel startupTasks

def allProjectJobs = [:]
for (int i = 0; i < folders.size(); i++) {
    // bind the loop value to a fresh local so each closure captures its own copy
    String projname = folders[i]
    allProjectJobs["Project $projname"] = {
        node("job_dsl") {
            stage("Project $projname") {
                checkout scm
                sh "generate_some_files_read_by_job_dsl_code"
                jobDsl targets: "jobs/$projname/*.groovy", additionalClasspath: "src/"
            }
        }
    }
}
parallel allProjectJobs
I run these Job DSL steps in parallel because Job DSL is slow when it has to process enough code to generate this many jobs.
I have a total of 81 files in my jobs/ folder. Some representative runtimes that I see for each group of files, measured in the Job DSL build step only, are:
N files in project | total filesize (bytes) | runtime
16 | 54863 | 2m 55s
14 | 28715 | 1m 55s
10 | 21716 | 2m 5s
9  | 13768 | 1m 55s
6  | 12051 | 1m 45s
5  | 7262  | 1m 15s
4  | 20754 | 1m 0s
4  | 6394  | 1m 20s
3  | 2425  | 1m 15s
3  | 1592  | 1m 0s
3  | 1569  | 1m 0s
2  | 1871  | 1m 0s
1  | 607   | 0m 45s
1  | 275   | 0m 45s
Running these in parallel, instead of in series or all in one build step, gives us better visibility and makes Job DSL issues easier to debug (we can hit an exception in each branch and resolve them all in one change rather than one at a time), and it reduces the execution time from ~20 minutes to ~5 minutes. I have repeatedly told my team that we need to set up a unit test environment for Job DSL, but as we are not Java developers, this is a bit of a task. It will probably fall to me: I am pretty much the only developer on my team who has used Java at all (thanks to some college class experience!), but as we are a hardware R&D lab working on ASIC design, none of us has much familiarity with modern Java tooling.
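For the record, what I have in mind is something like the sketch below. It assumes the job-dsl-core artifact is on the test classpath and uses its MemoryJobManagement to run each script outside of a running Jenkins; the stubbed file path and contents are placeholders for whatever generate_some_files_read_by_job_dsl_code produces.

import javaposse.jobdsl.dsl.DslScriptLoader
import javaposse.jobdsl.dsl.MemoryJobManagement

def jm = new MemoryJobManagement()
// Stub the files that scripts read via readFileFromWorkspace; this path
// and these contents are placeholders, not our real generated files.
jm.availableFiles['generated/placeholder.txt'] = 'stub contents'

// Run every Job DSL script under jobs/ against the in-memory job management.
new File('jobs').eachFileRecurse { f ->
    if (f.name.endsWith('.groovy')) {
        new DslScriptLoader(jm).runScript(f.text)
    }
}
// Each generated job's config XML lands in savedConfigs, keyed by job name.
jm.savedConfigs.keySet().sort().each { println "generated: $it" }

The additionalClasspath: "src/" part would need separate handling (e.g. putting src/ on the test classpath), which is one of the reasons this keeps getting deferred.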
I suspect that my Job DSL runtimes would be faster if the node running Job DSL had a ping of less than 100ms to the Jenkins master, but for *various and sundry reasons*, it is the way it is. (Does Job DSL fetch a lot of classes via the remoting channel at runtime? I am not sure how to debug this performance issue.)
In any case, to run the Job DSL in parallel like this with the unreferenced job action set to 'DISABLE', I had to separate each project into its own freestyle project. The change to the pipeline was to create a downstream freestyle seed job for each sub-folder of jobs/ and to build that job instead of running the jobDsl step directly. As a result, I cannot use stash/unstash to share the files that my Job DSL templates read with e.g. readFileFromWorkspace, and I have to pay the time cost of regenerating them in each workspace the pipeline uses.
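Concretely, each parallel branch body changed to roughly the following, where "dsl-seeds/$projname" is a made-up name for the per-project freestyle seed job:

allProjectJobs["Project $projname"] = {
    stage("Project $projname") {
        // Trigger the per-project freestyle seed job; it checks out the
        // repository, regenerates the input files in its own workspace,
        // and runs its own Job DSL build step with the DISABLE action.
        build job: "dsl-seeds/$projname", wait: true
    }
}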
I have used the Job DSL plugin in production for over a year with the DISABLE action in place, and I hit a similar issue when using multiple Job DSL build steps in a single pipeline. The fix for that issue changed the plugin's behavior when multiple Job DSL build steps run in parallel. Is that (parallel Job DSL build steps) a use case that interests anyone else?
(The new behavior of the Job DSL plugin ended up disabling many of my jobs. I spent a day with a mostly broken environment while I tracked this down.)
Based on the discussion here, it seems like the correct thing to do is to use the IGNORE action on every Job DSL build step except the very last one, which must not run in parallel with any other Job DSL build step. Looking at the available hook points in hudson.model.Run and hudson.model.Job, I don't see a great way to solve this there. We could, however, register a new hudson.model.listeners.RunListener that disables or deletes unreferenced jobs after a build completes successfully (a build that ran Job DSL build steps configured to DISABLE or DELETE). Has that idea already been considered?
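To make that concrete, here is a rough and untested sketch of the listener; the class name is invented, and the bookkeeping of which jobs a run generated is hand-waved in a comment:

import hudson.Extension
import hudson.model.Result
import hudson.model.Run
import hudson.model.TaskListener
import hudson.model.listeners.RunListener

@Extension
class UnreferencedJobCleanupListener extends RunListener<Run> {
    @Override
    void onCompleted(Run run, TaskListener listener) {
        if (run.getResult() != Result.SUCCESS) {
            return
        }
        // Hand-waved: collect the jobs generated by all Job DSL build steps
        // in this run, then apply the DISABLE or DELETE action once to any
        // job left unreferenced, instead of once per build step.
    }
}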