Jenkins / JENKINS-55512

Safe shutdown/restart should not block completion of complex jobs (that spawn child jobs)


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: from ages ago up to the current Jenkins 2.156

      Jenkins supports "safe" modes for shutting itself down (safeExit) or restarting (safeRestart). These appear to be used by the plugin updater, by the thinBackup plugin (which uses only the quietDown part), and by operations that can be requested via a Jenkins URL (also linked from the Manage Jenkins interface), to name a few use cases.

      In this mode Jenkins waits for currently running jobs to complete and prevents new jobs from progressing from scheduled to running. The assumption is that the running jobs will complete, the server will become quiet, and it can then be restarted administratively without seriously interrupting its users. This is documented in more detail at, for example, https://support.cloudbees.com/hc/en-us/articles/216118748-How-to-Start-Stop-or-Restart-your-Instance-

      This is problematic, however, for complex jobs such as MultiPhase or entangled pipelines, where a wrapper job calls a number of other jobs over time as its payload, each implementing certain tests or other operations. Once safe shutdown mode is enabled, these child jobs cannot start, so the parent job stalls indefinitely waiting for their results. The Jenkins master is left dysfunctional: it is not restarted (e.g. for an overnight upgrade) and it does not run any new builds.
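      To illustrate the stall, here is a toy model in Java. It is not Jenkins code: the class `QuietDownStallDemo` and all of its methods are made up for this sketch, and the scheduler is reduced to a single `tick()` that starts queued builds (unless quieting down) and finishes any running build that is not waiting on an unfinished child.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical, none is a Jenkins API.
final class QuietDownStallDemo {
    private final Deque<String> queued = new ArrayDeque<>();
    private final Set<String> running = new LinkedHashSet<>();
    private final Set<String> finished = new HashSet<>();
    // parent build -> child build whose result it is waiting for
    private final Map<String, String> waitingOn = new HashMap<>();
    private boolean quietingDown;

    void startNow(String job) { running.add(job); }

    void quietDown() { quietingDown = true; }

    // The parent schedules a child and blocks until the child has finished.
    void triggerChildAndWait(String parent, String child) {
        waitingOn.put(parent, child);
        queued.add(child);
    }

    // One scheduler pass.
    void tick() {
        // While quieting down, nothing moves from "scheduled" to "running".
        if (!quietingDown) {
            while (!queued.isEmpty()) {
                running.add(queued.poll());
            }
        }
        // A running build finishes unless it waits on an unfinished child.
        List<String> done = new ArrayList<>();
        for (String job : running) {
            String child = waitingOn.get(job);
            if (child == null || finished.contains(child)) {
                done.add(job);
            }
        }
        running.removeAll(done);
        finished.addAll(done);
    }

    // Runs the wrapper/child scenario and reports whether the wrapper ever finishes.
    static boolean wrapperFinishes(boolean quietDownFirst, int ticks) {
        QuietDownStallDemo jenkins = new QuietDownStallDemo();
        jenkins.startNow("wrapper");
        if (quietDownFirst) {
            jenkins.quietDown();
        }
        jenkins.triggerChildAndWait("wrapper", "child-test");
        for (int i = 0; i < ticks; i++) {
            jenkins.tick();
        }
        return jenkins.finished.contains("wrapper");
    }

    public static void main(String[] args) {
        System.out.println("normal operation: " + wrapperFinishes(false, 5));
        System.out.println("after quietDown:  " + wrapperFinishes(true, 1000));
    }
}
```

      In this model the wrapper finishes within two ticks in normal operation, but never finishes once quietDown is set before the child is scheduled, no matter how many ticks pass. The set of running builds never becomes empty, so the restart never happens.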

      The expected behavior is that the currently running wrapper jobs AND any children (and their children) they spawn are allowed to start and are awaited to completion (which boils down to "completion of all running jobs", as before), after which the safe restart/shutdown takes place normally.

      I believe this could be done with a simple check of the build cause in the code that blocks execution of scheduled builds while safe shutdown is enabled. The check would allow builds triggered by another job, but would still disallow builds triggered by SCM changes, polling, indexing, manual triggering (or perhaps allow that one via an option?), etc. This seems like the place (or a good starting point): https://github.com/jenkinsci/jenkins/blob/d5eeefd7beb8a00910f8d579ef14124a8be1914c/core/src/main/java/hudson/model/Queue.java#L1786
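      A minimal sketch of what such a cause check might look like, written as standalone Java rather than against Jenkins core. Everything here (`QuietDownCauseFilter`, `CauseKind`, `QueuedItem`, `mayStart`, the `allowManual` switch) is a hypothetical stand-in. In the real code the information would presumably come from the queue item's causes (e.g. `Cause.UpstreamCause`) and from `Jenkins.isQuietingDown()`. The extra requirement that the triggering build is still running is my own assumption, added so that only descendants of builds in flight get through.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for illustration; these are not the hudson.model classes.
final class QuietDownCauseFilter {
    enum CauseKind { UPSTREAM, SCM_CHANGE, POLLING, INDEXING, MANUAL }

    static final class QueuedItem {
        final String job;
        final CauseKind cause;
        final String upstreamBuild; // triggering build id, null unless cause == UPSTREAM

        QueuedItem(String job, CauseKind cause, String upstreamBuild) {
            this.job = job;
            this.cause = cause;
            this.upstreamBuild = upstreamBuild;
        }
    }

    // Decides whether a scheduled item may move to "running".
    static boolean mayStart(QueuedItem item, boolean quietingDown,
                            Set<String> runningBuilds, boolean allowManual) {
        if (!quietingDown) {
            return true; // normal operation: no extra restriction
        }
        switch (item.cause) {
            case UPSTREAM:
                // Assumption: only let children of still-running builds through.
                // Grandchildren qualify too, since their parent is running by then.
                return item.upstreamBuild != null
                        && runningBuilds.contains(item.upstreamBuild);
            case MANUAL:
                return allowManual; // the optional override mentioned above
            default:
                return false; // SCM changes, polling, indexing stay blocked
        }
    }

    public static void main(String[] args) {
        Set<String> running = new HashSet<>(Arrays.asList("wrapper#12"));
        QueuedItem child = new QueuedItem("child-test", CauseKind.UPSTREAM, "wrapper#12");
        QueuedItem scm = new QueuedItem("other-job", CauseKind.SCM_CHANGE, null);
        System.out.println("child may start: " + mayStart(child, true, running, false));
        System.out.println("scm may start:   " + mayStart(scm, true, running, false));
    }
}
```

      With this rule the child of a running wrapper is let through while quieting down, and an SCM-triggered build stays in the queue until the restart has happened.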

            jimklimov Jim Klimov
            Votes: 1
            Watchers: 6
