I see some issues with this ticket:
- I'm not very familiar with the project/component terminology used in Jenkins, but I believe the issue is not about the "parameterized-trigger-plugin" component, but rather about the "build step" used to launch other jobs from within another job.
- The issue is not related to having only 1 executor, but rather to having no available executors when too many master/orchestrator jobs that run sub-jobs are launched at once, causing a deadlock.
I propose updating the Jira ticket to reflect the issue as: 'Deadlock/lockup when using "launching builds on other jobs" with "Block until the triggered jobs finish their builds" and no executors are available'
From what I understand, this issue occurs in the following scenario:
- Node/slave with only 2 (or 1) executors available
- 1 upstream job finishes successfully
- 2 downstream (orchestrator) jobs are triggered automatically and immediately by the upstream job when it finishes successfully
- Each downstream job launches other sub-jobs (dependent jobs) that each require their own executor
Since the sub-jobs in the last bullet point require extra executors and none are available, each master/orchestrator job keeps holding its executor while waiting for sub-jobs that can never start, so the jobs deadlock.
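The scenario above can be sketched as a toy model. This is only an illustration of the reasoning, not Jenkins code: the function name `deadlocks` is hypothetical, and it assumes every orchestrator blocks on at least one sub-job and that all orchestrators are scheduled before any sub-job.

```python
def deadlocks(executors: int, orchestrators: int) -> bool:
    """Toy model: True if blocking orchestrators starve their own sub-jobs.

    Assumptions (not taken from Jenkins itself): each orchestrator holds
    one executor while it waits, each has at least one sub-job that needs
    its own executor, and orchestrators are scheduled before sub-jobs.
    """
    # Orchestrators grab executors first, up to the number available.
    running = min(executors, orchestrators)
    free = executors - running
    # With no free executor, no sub-job can start, so no running
    # orchestrator can ever finish and release its executor.
    return running > 0 and free == 0


if __name__ == "__main__":
    for executors, orchestrators in [(2, 2), (1, 1), (3, 2), (2, 1)]:
        print(executors, orchestrators, deadlocks(executors, orchestrators))
```

In this model the problem appears whenever the orchestrators alone fill every executor, which matches the point that the executor count itself (1 or 2) is not the root cause.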
There are several ways to resolve this issue:
- (my preference): The launched dependent jobs do not occupy extra executors if run on the same node/slave.
- Distribute the execution of the downstream jobs so that they do not collide and there are available executors for their sub-jobs.
A high-level implementation of the above could be:
- For the first option, this could be handled automatically by the Jenkins orchestrator. Alternatively, a new flag in the build step could enable it.
- For the second option, this could be achieved by adding a flag to the trigger-upstream configuration in the downstream job, with a parameter similar to the cron H flag. In other words, instead of executing the downstream jobs immediately, their start times could be distributed over a set period of time (e.g. within an hour).
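As a rough sketch of the second option, the delay for each downstream job could be derived from a hash of its name, similar in spirit to how the cron H token spreads jobs out. Everything here is hypothetical (the function `spread_delay_seconds`, the `window_seconds` parameter, and the job names); it is not an existing Jenkins setting or API.

```python
import hashlib


def spread_delay_seconds(job_name: str, window_seconds: int = 3600) -> int:
    """Return a stable per-job start delay in [0, window_seconds).

    Hypothetical helper: the same job name always maps to the same
    delay, so different downstream jobs tend to start at different times.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Use a stable hash; Python's built-in hash() varies between runs.
    digest = hashlib.md5(job_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % window_seconds


if __name__ == "__main__":
    # Hypothetical downstream job names.
    for name in ["orchestrator-a", "orchestrator-b"]:
        print(name, spread_delay_seconds(name))
```

Note that spreading start times only lowers the chance of a collision; two jobs can still hash to nearby delays, so this would not rule out the deadlock on its own.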
Also following this issue, and I agree with Vasily: if the child job asks for a node that the parent is already running on, it should be allowed access to that node. Without this, our complex pipeline is unable to archive artifacts in the clean manner it was designed for. The alternative is to seek a solution to archive from nodes