Jenkins / JENKINS-19728

Much needed dependency management between jobs

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major
    • Component: core

      It seems the basic architecture of Jenkins is such that jobs are individual units of work, meant to be related in very trivial ways with other jobs for the mere purpose of synchronizing the execution of related jobs.

      What seems to be sorely lacking is a robust dependency management scheme that allows jobs to be treated as actual inter-related entities with specific functional and behavioral requirements. Because of this limitation there are numerous plugins and extensions that attempt to work around this issue, such as the Join, Conditional Build Step, Build Blocker, and Locks and Latches plugins.

      Even the "Advanced" job options for "blocking" jobs when upstream and downstream jobs are running is further indication of this lack of dependency management. If JobA depends on JobB and thus triggers the downstream job upon completion, I can't imagine ever wanting the two to run at the same time - ever. The fact that this behavior is optional is quite illuminating.

      This limitation becomes even more apparent when you have large, complex job sequences to orchestrate, with non-linear interdependencies between them. There are countless questions on forums and sites discussing workarounds, often leveraging the features of several related plugins hooked together to "partially" solve these dependency issues, when it seems the problem would be best solved in the Jenkins core itself.

      The one underlying issue that cuts across all of these topics, and affects nearly all of the plugins I've tried as workarounds, is that jobs which are inter-related in different ways are treated as independent from one another by default, rather than having their dependencies enforced.

      Take for example the Join plugin. It provides a very basic ability to define non-linear relationships between jobs, allowing a diamond-pattern relationship between them. So Job A can trigger Jobs B and C, and once those two jobs complete successfully Job D gets triggered. Sounds fine and dandy until you realize that you can quite easily trigger Job B to run and, once complete, it will happily trigger Job D even if Jobs A and C are broken. Similarly, even if all four jobs have the "block" when upstream and downstream jobs "advanced" options set, Job D can still be executed in parallel with Jobs B and C.
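      For contrast, the desired behavior is easy to state when the whole diamond lives in one script. A minimal sketch, where the build_* functions are hypothetical stand-ins for the four jobs (this is an illustration, not an existing Jenkins feature):

```shell
#!/bin/sh
set -e                      # abort the whole sequence on any failure

# Hypothetical stand-ins for Jobs A-D; real jobs would invoke their build tools.
build_a() { echo "A done"; }
build_b() { echo "B done"; }
build_c() { echo "C done"; }
build_d() { echo "D done"; }

build_a                     # D can never be reached unless A succeeded
build_b & pid_b=$!          # B and C run in parallel, as with the Join plugin
build_c & pid_c=$!
wait "$pid_b"               # with set -e, a failed branch aborts before D
wait "$pid_c"
build_d                     # reached only when A, B, and C all passed
```

      Here the "block while upstream is running/failing" behavior is not optional: it falls out of the control flow.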

      Now, some may say that these bugs probably lie not with the Jenkins core but rather with these plugins, and at first glance I would tend to agree. However, these limitations are so common and pervasive across nearly all job-management plugins I have tried that it is hard to deny there is some core feature missing from this tool.

      Maybe there is some magic bullet that resolves these issues and I'm simply missing it, but I have been administering a Jenkins build farm with 10 PCs and nearly 500 jobs for several months now. I've tried dozens if not hundreds of plugins to orchestrate non-trivial dependency management between jobs which, at best, results in a complex sequencing of many such plugins and, at worst, has met with utter failure.

      Thoughts
      Perhaps an easy solution would be to provide some kind of a "global" option in Manage Jenkins section that forces all jobs that trigger other jobs to act as if they are actual dependencies of one another rather than just dumb triggers. Then upstream jobs that are running or failing would prevent downstream jobs from running, even when these dependencies follow a complex, non-linear relationship and regardless of which plugins are used to orchestrate these relationships.

      Alternatively, maybe what we need is a new job type, call it "component job" or something. When instantiated it would have options that allow complex dependency management between jobs to be handled automatically.

      Whatever the solution, I strongly feel that this is a very important feature that is badly needed in Jenkins and would help make the tool much more practical for large scale CI needs.

          [JENKINS-19728] Much needed dependency management between jobs

          Kevin Phillips added a comment -

          I just wanted to make one further clarification in response to Jesse's comment above. Correct me if I'm mistaken, but I think in your comment you may be confusing the concept of "code dependencies" or "build dependencies" with "operational dependencies". What I mean to say is that, while it is true that as developers we probably tend to model Jenkins' job dependencies on our code modules' dependencies, there is not always a 1:1 relationship in that mapping. Often we'll need to have operations executed as part of the automation process that have nothing to do with compilation but are still required by business processes.

          For example, if the code for Module A depends on the code of Module B and we use two separate Jenkins jobs to compile each of them, then it's pretty clear what the dependencies must be (i.e. this is where make and other such tools come into play). However, suppose we have three "phases" of a build process: compile, test, package, each managed by a separate job. Who is to say how those three jobs should relate to or depend on one another? Should packaging be dependent on tests? That depends on the context and the policies that govern your release process. This, in my opinion, is where tools like Jenkins really need to shine. Sure, the code dependencies need to be taken into account, and granted most of my earlier examples tend to favor them, but they are by no means the only source of dependencies your system needs to model.
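          To make the point concrete, here is a minimal sketch of one such policy choice, with hypothetical placeholder functions for the three phases; a different release process might legitimately order these differently:

```shell
#!/bin/sh
# Hypothetical placeholders for the three phases; real jobs would call the
# compiler, test runner, and packager.
compile()     { echo "compiled"; }
run_tests()   { echo "tested"; }
package_app() { echo "packaged"; }

# One possible policy: package only if compile and tests both pass.
# This ordering is a business decision, not something derivable from
# the code's module dependencies.
compile && run_tests && package_app
```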

          Consequently, I don't think you can ever get away from having to manually model your release process in the tool of your choice, Jenkins or otherwise. At best you can extract parts of that model from tools and build scripts, but you'll never quite get everything you need from there - at least not when you work at scale.


          Kevin Phillips added a comment - edited

          Not when using Workflow. One job with one Groovy script which can work however you like, with no additional plugins.

          This does sound promising. From what I understand this plugin isn't yet available on the Jenkins plugin 'store', correct? Do you have any thoughts as to when it will be "production ready"? I definitely would like to give it a try when it is deemed "stable".

          The biggest issue I'd have with it would be having to take the time to learn Groovy and the plugin and then to write a script to handle some of these seemingly trivial use cases, but it's one of those things where if there are no other options at our disposal then we may have no other choice.

          The other thing we'd need to do is test the plugin in-house to make sure that adopting a new plugin such as this wouldn't have an adverse effect on the dozens of other plugins we're currently using. As I mentioned before, we have experienced numerous inter-relationship problems between plugins in the past.

          I do not think anything like this will or should become part of “basic Jenkins infrastructure”.

          I'm not familiar with the internals of the tool itself, nor am I a maintainer or even an active contributor to the project (yet), so obviously I can't speak to whether such features will be incorporated into the core. However, given the importance and benefits of supporting correct dependency management, I think it is pretty clear that these features should be incorporated into the core. The architecture would need to model the concept of dependencies from the bottom up in order for it to be robustly supported - across plugins, across configurations, etc.

          The dependency & triggering logic built into Jenkins core is already far too complex. That is why we created Workflow—the existing system did not scale up to more sophisticated scenarios.

          I suspected this was the case. It sounds like some of that code needs to be refactored to compensate for the added complexity.

          I probably should say that I do understand that what I am proposing here would likely be invasive and would require a lot of work, but I believe that doing so would be a game changer for Jenkins and would further encourage its adoption in the corporate world, where such things are of critical importance. For example, if the core architecture supported dependency management and these features were exposed in the Jenkins UI via easy-to-understand interfaces, then even non-developers could get involved with automated process management. Exposing this feature via a scripting environment, while very flexible and powerful I'm sure, discourages non-developers from using it.


          Jesse Glick added a comment -

          I don't think you can ever get away from having to manually model your release process in the tool of your choice, Jenkins or otherwise. At best you can extract parts of that model from tools and build scripts, but you'll never quite get everything you need from there

          Agreed, and I was not suggesting otherwise. Just saying that there are cases where you have a large number of modules with a completely consistent, homogeneous model—each has a static dependency list on other modules, and each accepts a predefined “build & test” command which is considered a prerequisite for building downstream modules. For this scenario, it is helpful to have some kind of tool which either automatically scans for dependencies, or accepts a DSL with a manually managed yet concise description of dependencies, and implements the minimal build sequence (with topological sorting, or automatic parallelization, etc.). For example, if you are using Maven with a reactor build (one big repository with lots of submodules with SNAPSHOT dependencies), and can determine which modules have changes according to the file list in the changelog of the current build, you can pass that module list as --projects B,K,P,Q --also-make-dependents --threads 2.0C and get an easy parallelized, minimal build.
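          As an illustration of the command shape described here (the module names B, K, P, Q are the placeholder examples from the comment; in practice the list would be computed from the build's changelog), a minimal sketch:

```shell
#!/bin/sh
# Sketch of the Maven reactor invocation described above. The changed-module
# list is hard-coded here; a real job would derive it from the changelog.
CHANGED_MODULES="B,K,P,Q"

# --also-make-dependents rebuilds everything downstream of the listed modules;
# --threads 2.0C runs two Maven threads per CPU core.
MVN_ARGS="--projects $CHANGED_MODULES --also-make-dependents --threads 2.0C"
echo "mvn $MVN_ARGS install"
```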

          There are of course other scenarios where every dependency is idiosyncratic enough that you have to model the whole behavior from scratch. And there is often some kind of setup stage and/or final deployment stage that falls outside a fixed dependency model. Neither poses any problem for Workflow.


          Jesse Glick added a comment -

          From what I understand this plugin isn't yet available on the Jenkins plugin 'store', correct?

          There are beta releases available on the experimental update center. Please see its project page for details, or use jenkinsci-dev for questions.

          Do you have any thoughts as to when it will be "production ready"?

          1.0 is expected very soon, for what that’s worth. I cannot promise this would solve all of your requirements (even if and when a changelog operator is implemented), but it is the only plausible candidate, which is why Kohsuke & I have been spending so much time on it. Your use cases are exactly in line with what we envisioned as the interesting problems to be solved—the things that just could not be done as separate jobs with triggers without losing your mind.


          Kevin Phillips added a comment -

          Thanks for clarifying.

          Ironically, I have been reading a lot about Maven lately since our company does have a small Java development team that uses it, and I'm trying to evaluate whether any of those tools may be usable by our native development teams. So far it's not looking good. I do have to say that on some level I am, as a mainly native C++ developer, jealous of the tools available to Java developers, most notably Maven. They provide a lot of features that are sadly missing or, at best, very difficult to find in native toolsets.


          Kevin Phillips added a comment -

          the things that just could not be done as separate jobs with triggers without losing your mind.

          Very well put! Unfortunately I think I lost my mind on this stuff at least a year ago or more.


          Donald Morton added a comment -

          Kevin, have you looked at Gradle for your C++ dependency management?


          Kevin Phillips added a comment -

          I have heard of Gradle but I have never given it much attention. Since you mentioned it I looked into it more, and it does look promising. I will definitely give it a test run to see how well it holds up. I'm not yet convinced that it would preclude the need for Jenkins to support robust dependency management as well, but perhaps it could help bridge the gap at least.

          Thanks for the suggestion.


          James Ascroft-Leigh added a comment -

          I have a graph like this on my dashboard:

          It is based on each job having three types of dependency:

          • Zero or more Mercurial repositories from Bitbucket. It uses the Bitbucket REST API to determine the changeset of the head of the branch that the job is configured to use.
          • Zero or more artifacts from other jobs. It uses the fingerprints from the build.xml for the last successful build of the upstream job and compares that to the fingerprints from the build.xml of the last completed build of the downstream job.
          • The Jenkins config.xml for the job itself. Periodically I store the sha1sum of this XML file so that changes can be detected.

          If any of these change then the job is considered to be pending and scheduled (in dependency order) to run. I also re-run any job that is:

          • Failing and hasn't run for 24 hours. This means I retry failed jobs frequently.
          • Passing and hasn't run for 1 month. This means that I can be fairly confident that I detect jobs that break due to external influences.

          The only time I configure Jenkins to trigger a job is if I want it to run on a timer such as hourly or daily.
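          The config.xml change check described in the list above can be sketched as follows; the directory layout and state file are assumptions made for illustration, not Jenkins conventions:

```shell
#!/bin/sh
# Store a sha1 of the job's config.xml and mark the job pending when it changes.
mkdir -p jobs/myjob state
CONFIG="jobs/myjob/config.xml"
STATE="state/myjob.sha1"

# Stand-in config so the sketch is self-contained; Jenkins writes the real one.
printf '<project><description>demo</description></project>\n' > "$CONFIG"

new_sum=$(sha1sum "$CONFIG" | cut -d' ' -f1)
old_sum=$(cat "$STATE" 2>/dev/null || true)

if [ "$new_sum" != "$old_sum" ]; then
    echo "myjob: pending (config changed)"
    echo "$new_sum" > "$STATE"      # remember the hash for the next poll
else
    echo "myjob: up to date"
fi
```

          The same compare-stored-hash pattern extends to the other two dependency types (repository changesets and artifact fingerprints).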


          Francis Therien added a comment -

          3 years later, Pipeline seems to be the future of Jenkins. Is there any update on this issue? Most use cases brought up by leedega seem as relevant as ever in the Pipeline world, and I don't see an easy solution, besides building our own dependency resolver, which feels like reinventing the wheel.


            Assignee: Unassigned
            Reporter: leedega (Kevin Phillips)
            Votes: 11
            Watchers: 14