Jenkins / JENKINS-19728

Much needed dependency management between jobs

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major
    • Component: core

      It seems the basic architecture of Jenkins is such that jobs are individual units of work, related to other jobs in only trivial ways, for the mere purpose of synchronizing their execution.

      What seems to be sorely lacking is a robust dependency management scheme that allows jobs to be treated as actual inter-related entities with specific functional and behavioral requirements. Because of this limitation there are numerous plugins and extensions that attempt to work around this issue, such as the Join, Conditional Build Step, Build Blocker, and Locks and Latches plugins.

      Even the "Advanced" job options for "blocking" jobs when upstream and downstream jobs are running are a further indication of this lack of dependency management. If JobA depends on JobB, and JobB thus triggers its downstream job upon completion, I can't imagine ever wanting the two to run at the same time - ever. The fact that this behavior is optional is quite illuminating.

      This limitation becomes even more pronounced when you have large, complex job sequences to orchestrate, with non-linear interdependencies between them. There are countless questions on forums and sites discussing workarounds, often leveraging the features of several related plugins hooked together to "partially" solve these dependency issues, when it seems the problem would be best solved in the Jenkins core functionality.

      The one underlying issue that cuts across all of these topics, and affects nearly all plugins that I've tried which help work around this limitation, is that jobs which are inter-related in different ways are expected to be independent from one another by default, rather than the dependency enforcement being mandatory.

      Take for example the Join plugin. It provides a very basic ability to define non-linear relationships between jobs, allowing a diamond-pattern relationship between them. So JobA can trigger jobs B and C, and then once these two jobs complete successfully Job D gets triggered. Sounds fine and dandy until you realize that you can quite easily trigger job B to run and, once complete, it will happily trigger Job D even if Job A and C are broken. Similarly, even if all 4 jobs have the "block" when upstream and downstream jobs "advanced" options set, JobD can still be executed in parallel with Jobs B and C.

      Now, some may say that these bugs are probably not with the Jenkins core but rather with these plugins, and at first glance I would tend to agree. However, these limitations are so common and pervasive across nearly all job-management related plugins I have tried that it is hard to deny there is some core feature missing from this tool.

      Maybe there is some magic bullet I'm missing that helps resolve these issues, but I have been administering a Jenkins build farm with 10 PCs and nearly 500 jobs for several months now, and I've tried dozens if not hundreds of plugins to try to orchestrate non-trivial dependency management between jobs. At best this results in a complex sequencing of many such plugins; at worst it has met with utter failure.

      Thoughts:
      Perhaps an easy solution would be to provide some kind of "global" option in the Manage Jenkins section that forces all jobs that trigger other jobs to act as if they are actual dependencies of one another rather than just dumb triggers. Then upstream jobs that are running or failing would prevent downstream jobs from running, even when these dependencies follow a complex, non-linear relationship and regardless of which plugins are used to orchestrate these relationships.

      Alternatively, maybe what we need is a new job type, call it "component job" or something. When instantiated it would have options that allow complex dependency management between jobs to be handled automatically.

      Whatever the solution, I strongly feel that this is a very important feature that is badly needed in Jenkins and would help make the tool much more practical for large scale CI needs.


          Kevin Phillips created issue -

          Kevin Phillips added a comment - edited

          Example Use Case:
          Let's examine the simple use case of a diamond-shaped dependency between 4 jobs, as mentioned in the description. You might first start by trying the Join plugin, configuring JobD as a downstream dependency of JobA, and Jobs B and C as the "joining" jobs. Once complete you may look to a dependency graph (e.g. via the Dependency Graph plugin) to confirm the relation does in fact show the correct orientation - which it does. You then set each job to block when upstream and downstream dependencies are running, so JobA won't run while JobD is running, and so on.

          But then you try to see how well the 4 jobs orchestrate with one another. Let's make JobC fail, for example. If you run JobA and it in turn triggers Jobs B and C, C will fail and that prevents JobD from triggering. However, afterwards, if you force-build JobD it happily tries to go about its business even though dependent job C is broken. Next, let's try forcing JobB. Given that both Jobs B and C are upstream dependencies of JobD you would intuitively expect JobD to trigger once JobB completes... however it does not.

          So, for obvious reasons the Join plugin falls short of robust dependency management. So let's try the Build Flow plugin. You create your 4 jobs as discussed, but instead of using the Join plugin you use the DSL to script a "join" operation, something like:

          build("jobA")
          parallel(
              { build("jobB") },
              { build("jobC") }
          )
          build("jobD")

          Once again, you run the build flow and all looks fine and dandy, until you try to orchestrate each job in isolation. Running JobB directly has no effect on JobD. In fact, the Build Flow plugin doesn't seem to interface at all with the Jenkins job dependency system, because you can manually force all 4 jobs to run in parallel if you trigger them directly. So the Build Flow plugin, again, doesn't provide the necessary results for this simple use case.

          Needless to say I have yet to find a solution for this use case.


          Example Use Case:
          Suppose you have two jobs: A and B. Suppose you want job A to trigger job B but only under certain conditions. Maybe job B is some lengthy unit testing functionality that you only want to run at night, and job A is the compilation operation that builds the code and unit tests to be run - so job B clearly "depends" on job A.

          You quickly realize again that you need a plugin for this. You may start with the Conditional Build Step plugin and have JobA run JobB as a build step. Since these two jobs are dependent on one another you may set the build operation in JobA to block while JobB is executing. This works fine when running JobA; however, once again, since this plugin does not respect the dependencies between jobs, Jenkins will happily allow JobB to be manually triggered while JobA is broken, and it will be equally happy to run JobA even if JobB has been triggered in some other way (e.g. from an SCM commit). There are other dependency management problems with this solution as well that I won't get into here - I think you get the point.

          Needless to say this option is out. So next you try to find some kind of post-build trigger to do this - but unfortunately there is none. The only workaround to this limitation that I've found is to use yet another plugin, Flexible Publish, and combine that with the Conditional Build Step. However once you try this solution it is quickly apparent that it too suffers from a similar set of ailments. For example, if you have the Dependency Graph plugin installed, the generated graph doesn't even show JobB as a downstream of A, let alone having Jenkins correctly respect the dependency between the two jobs. Jenkins will still happily allow concurrent execution of both jobs regardless of the "blocking" for upstream/downstream settings.

          Again, yet another trivial dependency management pattern that I have yet to find a solution for.


          Example Use Case
          Suppose you have three jobs: A->B->C, where C depends on B which in turn depends on A, and each job has the 'blocking' options set for upstream and downstream jobs. Suppose C is building, during which time someone commits a change to projects A and B at the same time for a single change. Now, with SCM polling enabled it is possible that job B may pick up the commit to its source project before job A. In this case you quickly realize that Jenkins has a FIFO scheduling policy, so Job B gets scheduled to run before Job A. This is true even if both jobs are triggered and queued while job C is running.

          What results is that once job C is complete, job B will execute and fail because the associated change to job A has not yet been picked up. Then Job A runs and builds successfully, after which job B is triggered a second time and succeeds now that it has the required output from job A.

          So, back to the plugins we go. There are several plugins that purport to circumvent this limitation, including Dependency Queue and Priority Sorter. Luckily these plugins do work pretty well; however, they tend to be fragile. For example, the Priority Sorter plugin relies on having an effective priority-numbering scheme in place and used consistently between all jobs. This is easy to manage with 10 jobs, but not so much with 500. Conversely, the Dependency Queue plugin seems to rely on the triggering relations between jobs, so if you are forced to use a plugin to relate jobs that doesn't respect this trigger relationship, then it too will fail to properly schedule the jobs based on dependency.


          My basic point here is that dependency management is hard, and perhaps the original Jenkins authors knew this and as a consequence have largely left it up to plugin makers to fill the void. But I strongly believe that dependency management is a core requirement of a good CI system, and I cannot see how one could effectively outsource such features to third parties. To make matters worse, you need to employ a dozen or more different plugins to achieve any semblance of a complex dependency management system, and thus each of those plugins must inter-operate with the others in certain key ways for the dependencies to work correctly.

          Even still, there are quite a few problematic issues that I have yet to find workarounds for, like the "blocking when a dependent build is broken" issue described here. In my opinion the core Jenkins developers need to strongly consider adding a robust dependency management framework to the Jenkins core, perhaps providing a pluggable API that plugin developers can then use to enhance these features, focusing on specific sub-sets of dependency management and isolated use cases.


          steve sether added a comment -

          I completely agree. I started using Jenkins on Friday. On Friday my goal was to get Jenkins working with 1 project. That one project has 2 dependencies on other projects, one of which is dependent on the other - so, a simple linear relationship. It's shocking to me that on the first day of using Jenkins I already needed to extend its functionality just to have one project depend on another.

          Navigating plugins can be a daunting task, especially when you're brand new to a tool. Which ones are good? Which ones are way too complicated? Which one might break Jenkins? Who knows?

          At the very least Jenkins needs something out of the box to support dependencies - some base functionality - since I just can't imagine that organizations aren't at some point going to need a project that depends on other projects. I wound up installing Copy Artifact, and that works fine. Why can't it, or some other plugin made for dependencies, be included in Jenkins?

          The OP is right, though. There are far more complex dependency use cases that need to be addressed. Dependencies are simply a widespread need that should come in the box.

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-19727
          Jesse Glick made changes -
          Labels New: workflow

          Jesse Glick added a comment -

          Have you tried implementing these kinds of use cases using the Workflow system? It does nothing magical in terms of investigating dependencies; it just runs the things you told it to, in the order you told it, under the conditions you specified. Nonetheless it has some aspects which allow it to model the kinds of scenarios you discuss better than the Build Flow DSL plugin. In particular:

          • While there is a build step to call out to “legacy” jobs, you need not use this at all, and can have all steps defined in one flow. This means that there is no chance of some subsection of the script being invoked independently and out of context—unless you made the flow parameterized and wrote its script to expect to skip some stages sometimes.
          • You can have multiple SCM checkouts in one flow, with a single polling/changelog function.
          • archive and unarchive let you pass artifacts between workspaces without using the Copy Artifact plugin. Or, if different stages can all run in one workspace, you need not do any copying at all.

          In common with Build Flow:

          • You can freely pass information from one part of the flow to another, as local variables.
          • You can use parallel to run some things concurrently, with an implicit join.
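          For a concrete sense of what this could look like, the diamond scenario from the description might be sketched as a single Workflow script. This is only an illustrative sketch based on the early Workflow step names (node, stage, parallel, sh); the build-*.sh scripts are hypothetical placeholders, not part of any actual setup:

          // All four stages live in one flow, so no stage can be triggered
          // out of context the way the individual Join-plugin jobs can.
          node {
              stage 'A'
              sh './build-a.sh'

              stage 'B and C'
              parallel(
                  b: { sh './build-b.sh' },
                  c: { sh './build-c.sh' }
              )   // implicit join: D is reached only if both branches succeed

              stage 'D'
              sh './build-d.sh'
          }

          Because a failure in any step aborts the whole flow, JobD here can never run while A, B, or C is broken - the property the Join-plugin setup could not guarantee.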


          Kevin Phillips added a comment -

          We had tried the Build Flow plugin, which does meet some of our needs, but one thing in particular it didn't seem to handle was the ability to run sub-sets of the dependency tree depending on which job has changes made to it. For example, if job B depends on job A and someone commits a change to job B we don't want job A to be built. This is necessary to improve build efficiency in our current infrastructure because building (and, by extension, testing) each module in our dependency tree is very time consuming, even for "no-op" builds.

          When you mention the Jenkins "workflow system", I assume you are referring to this plugin project I found on GitHub (to which I believe you may be a contributor or maintainer). I have to say I hadn't heard of this plugin until you mentioned it, but it does sound promising. If it supports this sort of partial-build operation I just mentioned, I may be interested in trying it. Specifically, let me know if two dependent jobs can be configured with separate, independent SCM URLs so they may be triggered independently, while still allowing build orchestration based on the job dependencies. Even better, if you can point me to an example of how this may be done using that plugin, I will let you know whether it works for our needs.
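          For reference, the "separate SCM URL per dependency" arrangement might look something like the following Workflow sketch. The repository URLs and build commands are hypothetical placeholders, and this is an untested illustration rather than a confirmed feature of the plugin:

          // Two independent checkouts inside one flow: A and B keep their own
          // SCM URLs while the flow still enforces the A-before-B ordering.
          node {
              stage 'A'
              dir('a') {
                  git url: 'https://example.com/scm/project-a.git'
                  sh './build.sh'
              }

              stage 'B'
              dir('b') {
                  git url: 'https://example.com/scm/project-b.git'
                  sh './build.sh'   // consumes A's output
              }
          }

          Skipping the unchanged half of the tree would presumably still require parameterizing the flow, as noted earlier in this thread.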


            Assignee: Unassigned
            Reporter: Kevin Phillips (leedega)
            Votes: 11
            Watchers: 14