Synchronous SCM polling in dependencies order

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major
    • other
    • None
    • Platform: All, OS: All

      Add a feature that runs SCM polling synchronously, visiting projects in
      dependency order.

      See
      http://www.nabble.com/Commit-spanning-multiple-interdependant-projects-tf3870814.html#a10966678

          [JENKINS-806] Synchronous SCM polling in dependencies order

          jbq added a comment -

          Created an attachment (id=88)
          Proposed patch

          jbq added a comment -

          Notes on the patch:

          • SCMTrigger cron spec is ignored when sync polling is set to true
          • Requires an API change: Trigger.run() becomes public
          • Creates a Thread to avoid blocking Cron, but that needs to be improved

          jbq added a comment -

          Created an attachment (id=89)
          Updated patch with better algorithm for computing list of projects in the order of dependencies. Still need to implement threading to poll for changes in the background.
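The patch itself is not reproduced in this thread, so the following is only a minimal sketch of what "projects in the order of dependencies" could mean: a topological sort in which every upstream project comes before the projects that depend on it. It uses Python's standard `graphlib` module rather than the patch's own algorithm, and the project names and the `upstreams` mapping are hypothetical.

```python
from graphlib import TopologicalSorter


def polling_order(upstreams):
    """Return projects so that every upstream precedes its downstreams.

    `upstreams` maps each project name to the projects it depends on
    (hypothetical data; Jenkins keeps this in its own dependency graph).
    A dependency cycle raises graphlib.CycleError.
    """
    return list(TopologicalSorter(upstreams).static_order())


# C depends on A and B, B depends on A, so the only valid order is A, B, C.
order = polling_order({"A": [], "B": ["A"], "C": ["A", "B"]})
print(order)
```

Polling in this order means an upstream change is always seen before the downstream project is examined.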

          jbq added a comment -

          Created an attachment (id=90)
          Updated patch

          jbq added a comment -

          In the latest patch:

          • synchronousPolling setting in SCMTrigger instead of Hudson
          • use SCMTrigger executor to poll for changes
          • don't process SCMTrigger twice

          Still left to do:

          • Integrate with UI: allow setting synchronousPolling and specifying a
            global crontab spec

          jbq added a comment -

          Assigning to me

          bwestrich added a comment -

          Created an attachment (id=143)
          patch that implements synchronous polling when #executors=1

          bwestrich added a comment -

          add myself as a cc

          bwestrich added a comment -

          reassigned this issue to me per this note from Jean Baptiste:

          > Dear Brian,
          >
          > First, feel free to assign the issue to yourself. I can't afford to
          > work on this anymore as I'm not using Hudson in my current job.

          bwestrich added a comment -

          Created an attachment (id=145)
          patch that implements synchronous polling when #executors=1 (version 2)

          bwestrich added a comment -

          reassigning...

          bwestrich added a comment -

          fix incorrect reassignment; meant to actually reassign issue 1313 to Mike, not
          this issue.

          mdonohue added a comment -

          JENKINS-1938 largely solves this problem, and it's already released.

          Joe Hansche added a comment - - edited

          Sorry for the resurrection, but still seeing this problem 3 years later, thought I'd try to stir the pot again...

          > JENKINS-1938 largely solves this problem, and it's already released.

          I don't believe JENKINS-1938 solves the problem identified here – I think the problems are different, and JENKINS-1938 only solves half of it. From the link in the original description (nabble is down currently, but http://osdir.com/ml/java.hudson.user/2007-06/msg00047.html), the problem is related to this setup:

          • Project A
            • SCM Polling: */5
            • Trigger downstream project: Project B
          • Project B
            • SCM Polling: */5
            • Upstream project: Project A
            • Copies build artifacts from Project A

          The problem occurs when you commit changes to both Projects A and B within a short period of time (within the polling period). Because SCM polling itself has no dependency graph to determine when each project should be polled (just the crontab-style polling schedule), there's no guarantee which project will see the new commits and start building first.

          If Project A is polled first, you can enable the option you mention in JENKINS-1938 and therefore Project B will be blocked until Project A is finished, at which point all is well. But if Project B is polled first, and triggers a build with the new code in the B repository, it will be built against an out-of-date artifact from Project A, and will most likely fail. Then Project A will be triggered because of the SCM change, which will re-trigger Project B. This time Project B will succeed because it has the appropriate artifact version from Project A. It's the initial failure of Project B that should be avoided if possible.
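To make the ordering problem concrete, here is a toy simulation of the two-project setup above. It is a deliberately simplified model, not Jenkins behaviour: it assumes a build starts as soon as it is queued and unblocked, that a build always checks out the latest commit, and that the blocking option only holds a job back while an upstream build is queued. All names are hypothetical.

```python
UPSTREAM = {"A": [], "B": ["A"]}    # B copies artifacts from A
DOWNSTREAM = {"A": ["B"], "B": []}  # A triggers B when it finishes


def simulate(poll_order, block_on_upstream=True):
    """Return the build log as (project, built_against_stale_upstream) pairs."""
    pending = {"A", "B"}  # projects with a commit that has not been built yet
    queue, log = [], []

    def run_queue():
        progressed = True
        while progressed:
            progressed = False
            for project in list(queue):
                if block_on_upstream and any(u in queue for u in UPSTREAM[project]):
                    continue  # held back while an upstream build is queued
                queue.remove(project)
                stale = any(u in pending for u in UPSTREAM[project])
                pending.discard(project)  # the build checks out the latest commit
                log.append((project, stale))
                for down in DOWNSTREAM[project]:
                    if down not in queue:
                        queue.append(down)
                progressed = True

    for project in poll_order:
        if project in pending and project not in queue:
            queue.append(project)  # polling found a change
        run_queue()  # assumption: builds start before the next poll happens
    return log


print(simulate(["A", "B"]))
print(simulate(["B", "A"]))
```

In this model, polling A first yields one clean build of each project, while polling B first yields a stale build of B, then A, then a second build of B. Turning the blocking option on or off does not change the B-first outcome, because A is not yet in the queue when B starts.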

          One solution might be this: when Project B is preparing to poll SCM, it checks its dependency tree first, and if any upstream projects also have SCM polling enabled, it waits until those upstream projects have finished polling before it polls its own SCM. At that point, if the upstream job required a build, it would already be in the queue, and with the upstream/downstream blocking option both jobs can be pending in the build queue and should complete properly. I admit I haven't looked at the SCM polling logic in Jenkins code yet, so I don't know if that is even feasible. But it certainly isn't fixed by the upstream block option alone.

          EDIT:
          Perhaps another alternative could be an option inside the "Poll SCM" section for "Poll upstream projects for SCM changes first", and if that's selected have it perform synchronous SCM polling for that project's dependency tree from top-down. To avoid/mitigate duplicate polling, you could even disable the "Poll SCM" option of upstream projects, and rely on the downstream project(s) to trigger the upstream polling – however, that could lead to an unintuitive scenario where disabling a downstream project could prevent upstream projects from polling at all... But if you have projects in this scenario, presumably you would know and understand that, as it is kind of an edge case scenario. Or maybe you don't care that the upstream project may be polled more often than expected.
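The "poll upstream projects first" idea can be sketched as a depth-first walk that polls each upstream before the project itself and remembers what it has already polled in the current cycle, which addresses the duplicate-polling concern. This is only an illustration under assumptions: `upstreams` and the `poll` callback are hypothetical stand-ins, not Jenkins APIs.

```python
def poll_upstream_first(project, upstreams, poll, polled=None):
    """Poll every upstream of `project` before `project`, each at most once.

    `upstreams` maps a project to the projects it depends on and `poll` is
    any callable that performs the SCM poll (both hypothetical).
    """
    if polled is None:
        polled = set()
    if project in polled:
        return polled
    polled.add(project)  # mark early so a dependency cycle cannot recurse forever
    for up in upstreams.get(project, ()):
        poll_upstream_first(up, upstreams, poll, polled)
    poll(project)
    return polled


calls = []
poll_upstream_first("C", {"C": ["B", "A"], "B": ["A"], "A": []}, calls.append)
print(calls)
```

Passing the same `polled` set to the calls for several downstream projects would keep a shared upstream from being polled more than once per cycle.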

          bwestrich added a comment -

          jhansche:

          I agree with your analysis, and I see this as still being an issue as well.

          Lately I've used the following approach which is a workaround hack that has some limitations but also an unexpected benefit: I write a (shell) script that polls all hudson jobs in order of dependency (using the remote api). Then I create a new job that runs this script every minute.
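The script itself was not posted. A rough Python equivalent of the idea might look like the sketch below. It assumes that each job exposes a `polling` URL under `/job/<name>/` that accepts a POST, and it ignores authentication and CSRF crumbs, which a real installation may require. The server URL and job names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def polling_url(base_url, job_name):
    """Build the URL assumed to trigger an SCM poll for one job."""
    return f"{base_url.rstrip('/')}/job/{quote(job_name)}/polling"


def http_post(url):
    """Send an empty POST and return the HTTP status code."""
    with urlopen(Request(url, data=b"", method="POST")) as response:
        return response.status


def poll_in_order(base_url, jobs_in_dependency_order, post=http_post):
    """Poll the jobs one after another, upstream jobs first."""
    return [(job, post(polling_url(base_url, job)))
            for job in jobs_in_dependency_order]


# Dry run with a fake `post` so nothing is sent over the network.
requested = []
poll_in_order("http://jenkins.example/", ["lib core", "app"],
              post=lambda url: requested.append(url) or 200)
print(requested)
```

The job list has to be supplied in dependency order by whoever maintains the script, which is part of what makes this a workaround rather than a fix.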

          Other than being a hack, the other thing that doesn't work well about this is when 2 related jobs have changes, both the upstream and downstream job get added to the queue. So (as you described as well) the downstream job gets built twice, and the first time it may fail. I tried the advanced options that prevent downstream jobs from building until upstream ones are done (under 'Advanced Project Options'), and they seem to not solve this – the downstream job doesn't build right away but it still gets immediately added to the queue, causing the same problem. I wonder if there would be any downside to changing these options so that instead of preventing builds from building they would actually prevent them from even being added to the queue? I had sent Kohsuke a note asking if he thought this would be a good change, but it was right during the Hudson/Jenkins shift so his bandwidth was limited at the time.

          The unexpected benefit of the above external script approach is that it's easy to turn off svn polling if you need to (e.g. to stop builds during a release, or due to Jenkins maintenance). You simply disable the one hudson job that runs the above script. Off the top of my head, I just realized it might be nice to have a plugin that did that type of thing (if I had the time to write it) .....

          Thanks for stirring the pot on this interesting issue.....

          bwestrich added a comment -

          Just posted a note on this topic to the Jenkins user group to help keep the discussion going...

          https://groups.google.com/d/topic/jenkinsci-users/puY_WCs6olc/discussion

          Freimut Hennies added a comment -

          There is an additional dimension to this problem, which appears in our context with a large number of dependent jobs: An upstream job A is triggered by a first check-in to the SCM. The build is okay and it triggers some other jobs (B, C, ...). Job B starts. In the meantime another developer checks in code for job A and job C. As job C is in the queue, it starts soon afterwards and checks out the latest revision of job C, which leads to a failing build. Job A is triggered later, when the next SCM polling time is reached, and then everything builds correctly.

          We solved this issue with a Groovy script, which polls the SCM frequently for the latest revision number. If it has changed, the script determines which jobs belong to this revision and puts all these jobs into the build queue with the revision number as a parameter. Now each job builds with the same revision and no false positives occur.
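The Groovy script was not published in this thread. The sketch below only illustrates the planning step it describes, in Python and under assumptions: a single repository-wide revision number, a hypothetical mapping from each job to the path prefix it builds, and a build parameter arbitrarily named `REVISION`.

```python
def affected_jobs(changed_paths, job_paths, dependency_order):
    """Return the jobs touched by a revision, upstream jobs first."""
    affected = {
        job for job, prefix in job_paths.items()
        if any(path.startswith(prefix) for path in changed_paths)
    }
    return [job for job in dependency_order if job in affected]


def plan_builds(revision, last_seen, changed_paths, job_paths, dependency_order):
    """Return (job, parameters) pairs that pin every build to one revision."""
    if revision == last_seen:
        return []  # nothing new since the previous poll
    jobs = affected_jobs(changed_paths, job_paths, dependency_order)
    return [(job, {"REVISION": revision}) for job in jobs]


job_paths = {"A": "libs/a/", "B": "apps/b/", "C": "apps/c/"}  # hypothetical layout
plan = plan_builds(1201, 1200, ["libs/a/Foo.java", "apps/c/Bar.java"],
                   job_paths, ["A", "B", "C"])
print(plan)
```

Because every queued build carries the same revision, a job that starts late still checks out the revision its upstream was built from.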

          Luc De Graef added a comment -

          Freimut, is there a possibility to publish that Groovy script? It would provide a very handy workaround.

          Sean Reque added a comment -

          I've been using Jenkins for a week and I've already run into this feature gap.

          Presumably there's some groovy API where I can query multiple repositories in a specific order and then trigger those builds in the same order? Is there documentation somewhere on how to use this API?

          Sean Reque added a comment -

          For anyone who is interested, you can use a system Groovy script, provided by the Groovy plugin, to simulate this functionality so that you can poll and build projects in any order you specify. The following imports get you most of what is available in the Groovy console:

          import hudson.*
          import hudson.model.*

          Some useful methods include:

          // get a project by name; returns null if the project is not found
          def project = jenkins.model.Jenkins.instance.getItem(projectName)

          // run the project synchronously (get() blocks until the build ends) and read the result
          def result = project.scheduleBuild2(0, new hudson.triggers.SCMTrigger.SCMTriggerCause("description here")).get().getResult()

          In my case I was using a custom polling script, so I did not investigate how to programmatically do an SCM poll.

          bwestrich added a comment -

          Nice idea for how to build jobs in a prespecified order! Another option that will not only build the jobs but also do SCM polling is to write a script (shell with wget, for example) that uses the Jenkins REST API to poll the jobs in order. This polling also builds the job if there are new SCM changes.

          Jesse Glick added a comment - - edited

          Unless I am missing something this still has no UI. I would suggest this be removed from core and implemented as a plugin, using a new trigger type.

            bwestrich
            jbq
            Votes: 14
            Watchers: 12