New Feature
Resolution: Fixed
Major
None
Hi,
I have a workflow that needs user input at some stage.
However, frequent commits trigger builds so often that several builds of the same workflow end up sitting idle in the list, awaiting user input.
I would expect my workflow to discard older builds once a newer build has reached the same stage.
But I cannot achieve that.
If I use the workflow like:
    stage title: 'DevBuild', concurrency: 1
    // do build
    stage title: 'Integration', concurrency: 1
    input message: 'Proceed?'
    // do integration
then the very first build enters the 'Integration' stage and pauses for user input. The other, newer builds may supersede each other, but there is always a last build that gets to the 'Integration' stage and waits until the very first build completes (because it cannot enter the Integration stage).
In this case I would like the newer build to discard the older build that is waiting for user input, so that the user can decide on the newer build.
Is it possible to implement such a scenario?
Thank you
- is related to
  - JENKINS-32829 Older builds allowed to wait for a throttled stage which a newer build is in (Closed)
  - JENKINS-29892 Block of stages functioning as a concurrency unit (Resolved)
  - JENKINS-30757 Use stage as a resource lock (Resolved)
- links to
[JENKINS-27039] Option for input or stage step to cancel older executions
Try
    stage title: 'DevBuild', concurrency: 1
    // do build
    stage title: 'Waiting to integrate'
    input message: 'Proceed?'
    stage title: 'Integration', concurrency: 1
    // do integration
I have not tried it yet, but I think this should allow later builds to get into the Waiting to integrate stage, so that you can agree to integrate the last one, killing the earlier ones.
(The “CD” demo does not throttle the earlier stages, since it is not necessary in general, so builds only wait to enter Production when they are already approved. This is the simpler approach, but perhaps your DevBuild stage really needs to be single-threaded.)
Hi Jesse,
that does not work. I have created exactly this workflow:
    stage name: 'DevBuild', concurrency: 1
    echo "doing dev build"
    stage name: 'Waiting to integrate'
    input message: 'Proceed?'
    stage name: 'Integration', concurrency: 1
    echo "doing integration"
Then I started 3 builds one after another. All three are now waiting for user input in stage "Waiting to integrate".
I would expect that two earlier builds get discarded once the last one reaches the user input. Otherwise I may get many builds waiting for user interaction.
So you are asking for an option to input to kill older step executions with the same ID. I suppose this could be useful. Independent of the stage step in that case.
It is an open question what exact form the option could take. I can think of four possible behaviors, not all of which are necessarily valuable enough to bother implementing and exposing in the UI:
1. Current behavior: each input step is independent.
2. Kill any older execution when a new input step starts. Thus for a given flow and input id (~ message, by default) there may only be one open prompt at a time.
3. Kill any older executions (possibly several) when a newer input step is approved or rejected. Thus the user could choose to approve a build which is not currently the latest, retaining the option to approve subsequent builds which are already waiting for approval.
4. Optionally kill older executions upon approval/rejection, but allow the approving/rejecting user to skip that on a case-by-case basis.
The third behavior seems the most useful to me. The second behavior suffers from the problem that a user may spend some time manually vetting a build, only to see it be discarded moments before clicking approval, just because a newer one happened to come in. The fourth behavior seems like it is asking the approver to make a choice that should have been left to the flow designer.
Also “older” vs. “newer” is ambiguous here. The simplest interpretation is by start time, but this could mean that a prompt in an earlier build is considered “newer” than a prompt in a later build, which is probably undesirable from the perspective of a pipeline where later builds are expected to supersede earlier ones. A more useful interpretation is a comparison by build number; this would complicate implementation of the second behavior slightly, since upon entering the step you would not only need to check for other builds which need to be canceled, you would need to check if this build should be canceled.
Raising priority since I think this request is useful even in the advertised “CD” demo.
I would vote for the second solution.
Right now we really have a case in our build system where we compile about 20 projects, and the number will increase in the near future.
So if every project shows up several times (as it does now) as an idle build in the dashboard view, that will be a mess. It might make sense to keep multiple builds waiting for user input if the job appeared only once in the dashboard, so that we know there is something to do with that job.
But still, once we have approved some build for release, we will most likely not want to release an older build any more, so it will be a pain to abort all those older builds manually.
Another reason to automatically kill older builds is frequent commits to the build branch. Even within a day we may have the build triggered 10-20 times. When a build gets to a later stage in the pipeline and has already passed all the quality criteria checked in earlier stages, the newer build is arguably "better" and there is no reason to focus on an older run.
The possibility of choosing which of multiple builds to proceed with might be considered a flexible feature, but for an automated build system it looks more like a drawback, since a human has to spend effort analysing those choices.
While thinking about this issue I found another use case that might be better moved to a separate issue: the input step should be able to proceed with defaults automatically if the user has not reacted within some period of time (of course, in practice that should be a large value, such as a few hours, and it should be optional). Well, you would really allow this kind of automation only if you are super confident in the automated QA in your build workflow.
Timeout for the input step would definitely be a separate issue. You could probably do this awkwardly today:
    def x
    try {
        timeout(60) {
            try {
                input 'Look good?'
            } catch (InterruptedException _x) {
                x = _x // rejected
            }
        }
    } catch (InterruptedException _) {
        // timeout, proceed
    }
    if (x != null) {
        throw x
    }
I am wondering if I am focusing wrongly on input, when there can be other steps that wait for external agents to allow the build to proceed. (waitForCond, for example, lets you do this generally.) Arguably this feature really belongs in stage itself. For example:
    stage name: 'DevBuild' /* optionally add: concurrency: 1 */
    echo "doing dev build"
    stage name: 'Waiting to integrate', concurrency: 1, eager: true
    input message: 'Proceed?'
    stage name: 'Integration', concurrency: 1
    echo "doing integration"
Here the eager on the second stage would change the handling of concurrency somewhat. Normally when the stage is at full capacity, a new build coming in will wait to enter it (possibly canceling a somewhat older build already waiting to enter it). With this proposed option, stage would always allow the new build to proceed immediately, but the oldest build running in the stage would be interrupted (meaning any running input would stop waiting, any running sh would be asked to stop, and so on). That would effectively implement my “second behavior” for input (the one you seem to want), but in a more generic way.
I think something like the “third behavior” is also possible with a modified stage semantics (not using concurrency): instead of interrupting older builds upon entering the stage, they would be interrupted upon leaving it. In other words, builds could move through the stage in linear order, but if an older build fails to make it through the stage before a newer build has left it, it would be canceled:
    stage name: 'DevBuild' /* optionally add: concurrency: 1 */
    echo "doing dev build"
    stage name: 'Waiting to integrate', linear: true
    input message: 'Proceed?'
    stage name: 'Integration', concurrency: 1
    echo "doing integration"
I can imagine this could be useful in various scenarios, not just with input.
Hi Jesse,
yes, the option "eager" would be fine. It will probably satisfy my current requirements.
The description of how it will handle "input" is OK. But I do not completely understand what will happen to other kinds of steps in this stage, like "sh". What does "would be asked to stop" mean? It would be completely fine if the engine just let the currently running step complete and skipped all the remaining steps, and of course marked the build as aborted or cancelled.
But in general, I think, the idea of this parameter in stage is absolutely fine.
For me, yet another option would be useful.
I have a stage with "concurrency: 1" that I know isn't a bottleneck and I'd like a "patient: true" type option that would queue up all incoming builds without dropping any.
I think the option would be generally useful but my current problem is with my git step that the build triggers on, if several commits turn up in close order it's possible for some builds to be aborted. The source base is big so I don't want a separate workspace for each parallel build hence "concurrency: 1".
What does "would be asked to stop" mean?
Basically the same thing as if you clicked the red stop button in the UI. Shell steps would be aborted, etc.
I'd like a "patient: true" type option
Probably the abovementioned linear option would suffice, unless you really want to let them run out of order. For that use case this is not really a “stage”, it is just a resource lock, which is probably better handled as a separate step.
Thanks for your thoughtful comment, I'll ditch my request and instead simply +1 your 'linear' proposal.
The reason our builds would be out of order is that our build machines vary widely in performance, but the added complexity of builds occurring out of order is more than I bargained for. Instead the newer build would have to be held until the older build ended, I can understand you not wanting to go that way though...
Well the “resource lock step” is actually likely to be easier to implement than changes to stage, I am just pointing out that you might not want it to work that way.
Are there any plans for this feature? I too have issues with multiple "input" steps and would be very happy to see either "eager" or "linear" solution implemented
I will add my use case for this as well (probably related, or even already stated, by others):
In a simple 2-stage workflow, stage 1 may see many builds while the build in stage 2 is several builds behind the latest stage 1 build and is in a waiting (input) state. Stage 1 may continue to get new builds that never reach stage 2. Now someone decides that the build is ready to be processed through stage 2, but what if that includes rebuilding the app? While best practice may say it shouldn't, in cases such as the Artifactory Release Management plugin that is exactly what happens. When you run the release process, it takes whatever is in the latest SNAPSHOT build and then rebuilds the same source code with the SNAPSHOT removed. This isn't a problem if it's the same code base, but in my example it may not be: if someone says Proceed on the old stage 2 build, it could process something entirely different from what went through that particular build's stage 1 if there have been commits since then, as there quite often are. We would like to use workflow to control our Artifactory release process, and this would help.
While we are thinking about this, would it make sense to allow for the final status of the canceled step to reflect the status of the last executed stage? For example, Assume I have a two stage workflow with stage 1 being CI and stage 2 being production build. There will be many stage 1 builds that do not make stage 2, but they could still be successful builds.
There are more cases to consider, especially when the `input` step is not used and builds proceed without human interaction. Think of a Pipeline definition where the time taken by its stages differs from one build to another (for whatever reason). Newer builds could finish (the full build, not just the stage) earlier than older builds, so your continuous deployment configuration would deploy an old state even though a newer one exists.
Probably what is needed is a way to force builds to go through the stage in order: if a build reaches the stage when a newer one is inside or has already passed through, it has to be aborted. Perhaps this is worth moving into a new, specific step that manages all the concurrency concerns, leaving `stage` as a labeling step. This feature, in conjunction with the `linear` behavior described in previous comments (JENKINS-32829), would remove the need to manually cancel any older build (which was the initial motivation of this issue).
A vague thought: we could have a step called, say, milestone (checkpoint would be natural but it is already taken), taking a natural number argument. Unlike stage names, which have no order beyond what is discovered as you run, builds would be required to pass through milestones in increasing order (perhaps allowing some builds to skip inapplicable milestones?), and an older build would not be allowed to pass a given milestone after a newer build hit it. Details TBD.
For purposes of the stage view and other UI, we could keep stage as is, or just deprecate it altogether and go with a block-scoped step as is proposed in JENKINS-26107.
Rough sketch:
    label('Building') {
        node {
            checkout scm
            sh 'make'
            stash 'stuff'
        }
    }
    milestone 1
    label('Testing') {
        node {
            unstash 'stuff'
            sh 'make test'
        }
    }
    milestone 2
    /* optionally:
    input 'OK to publish?'
    milestone 3
    */
    label('Publishing') {
        node {
            unstash 'stuff'
            sh 'scp stuff …'
        }
    }
jglick Thanks for your thoughts, this looks good. From my perspective the aim of this step (milestone) is to define a reliable Continuous Delivery and Deployment pipeline, where delivered/deployed code is guaranteed to be up to date, so managing concurrency within this new step is needed. So a second configuration field would be required: concurrency, which is similar to stage concurrency but concerns only the number of builds running the milestone (whether builds enter the milestone or not is managed separately).
As a collateral benefit we have a fix for this issue, as it is guaranteed that only the latest build will be waiting for input when using milestone concurrency: 1, so there will be no need to cancel any old build manually.
Feature-wise very interesting, but label already has another meaning in Jenkins (and it is a very general word). Is there anything else it could be called? It is great that it is block scoped. Milestones being numbered implies an ordinal structure, and they are not block scoped. If you are making a human lay out the order of something with numbers, doesn't that mean it should be a list of blocks of actions? (I mean, I love BASIC too, and line numbering.. but...)
michaelneale As I pointed out in a comment in the PR, once JENKINS-30269 is fixed and the concurrency options are removed from milestone, this step is something that a build does not enter but passes, so having a block scope does not make much sense, IMO.
How does a number match when you have flows including parts of other flows (i.e. code reuse)? I think the number here is broken.
What is 1 in some flow may be the 20th thing in another.
So if you then start using, say, 100, 101, 200, 201 in another flow, what happens when you have an uber pipeline that combines 6 or so other pipelines that are reused elsewhere with a smaller or entirely different flow?
It may be that the flow you are importing is not even maintained by your team and you have no control over whatever numbers it uses.
as a collateral benefit we have a fix for this issue, as it is guaranteed that only the latest build will be waiting for input when using milestone concurrency: 1, so there will be no need to cancel any old build manually.
That is very dangerous: if you cancel the currently running build in the milestone then you may never deploy anything to production, as you are constantly killing the build under test in Jesse's "testing" example above (if that had concurrency 1 because you are hardware limited).
How does a number match when you have flows including parts of other flows
That is very dangerous: if you cancel the currently running build in the milestone then you may never deploy anything to production
Both aspects have been discussed and fixed in the PR. The current status is: no milestone numbers (they are calculated automatically as the flow goes), and a newer build cannot cancel an older build that has already passed the milestone the newer build is at.
Is there any effort to make a proposal such as this work for Pull Requests? For example, cancel older builds for a specific PR, if more pushed commits triggered a new build? But not cancel any builds for a different PR even if it was older?
Why is this new, refiled pull request not linked to this ticket: https://github.com/jenkinsci/pipeline-milestone-step-plugin/pull/1
I'm a bit confused in which state this feature request is.
Why is this new, refiled pull request not linked to this ticket
It is. See the links section in this issue.
I'm a bit confused in which state this feature request is.
Once released, the milestone step will cover the feature requested here.
cancel older builds for a specific PR, if more pushed commits triggered a new build? But not cancel any builds for a different PR
Automatic if you are using a multibranch project.
Code changed in jenkins
User: Antonio Muniz
Path:
.gitignore
pom.xml
src/main/java/org/jenkinsci/plugins/pipeline/milestone/CancelledCause.java
src/main/java/org/jenkinsci/plugins/pipeline/milestone/Milestone.java
src/main/java/org/jenkinsci/plugins/pipeline/milestone/MilestoneAction.java
src/main/java/org/jenkinsci/plugins/pipeline/milestone/MilestoneStep.java
src/main/java/org/jenkinsci/plugins/pipeline/milestone/MilestoneStepExecution.java
src/main/resources/org/jenkinsci/plugins/pipeline/milestone/MilestoneStep/config.jelly
src/main/resources/org/jenkinsci/plugins/pipeline/milestone/MilestoneStep/help-label.html
src/main/resources/org/jenkinsci/plugins/pipeline/milestone/MilestoneStep/help-ordinal.html
src/main/resources/org/jenkinsci/plugins/pipeline/milestone/MilestoneStep/help.html
src/test/java/org/jenkinsci/plugins/pipeline/milestone/MilestoneStepTest.java
http://jenkins-ci.org/commit/pipeline-milestone-step-plugin/802b46166d6dc80bc211c0d279b0ab2c81cdaea4
Log:
Merge pull request #1 from amuniz/JENKINS-27039
JENKINS-27039 New step: milestone
Compare: https://github.com/jenkinsci/pipeline-milestone-step-plugin/compare/24aa55cd556c...802b46166d6d
cancel older builds for a specific PR, if more pushed commits triggered a new build? But not cancel any builds for a different PR
Automatic if you are using a multibranch project.
jglick do you mean running PR build in multibranch should abort if new build for same PR is scheduled? I ask because it does not work for me so I wonder if I understand you right.
uvizhe I believe you have to use 'milestone' for this to work - once you have it in place, it will cancel older builds, there is no special treatment of PRs though.
michaelneale milestone does not help with this; my builds still run concurrently and none is aborted. For milestone to work, a newer build would somehow have to run faster than an older one in order to pass one of the milestones first. How is that possible at all? The only case I see is when an older build is stuck for some reason, so a new one outruns it and thereby aborts it.
UPD: I understood we speak of different scenarios. In my case there's no input steps, just stages.
uvizhe ah, I understand you now. I am not sure of an out-of-the-box way to do that. Maybe some plugin contributes that functionality, and it does seem like a nice-to-have feature. It would be possible to script it so that when a job starts, if it knows it is a PR, it looks in the history for older running builds and cancels them, but that may be a tad extreme. But possible. Crazier things have been done.
It would be nice if this was just a config option of course, I agree. I never care about older runs of a PR fork or branch.
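The scripted approach mentioned above might look roughly like the following. This is only a sketch for the Jenkins script console, not a tested solution: the job name is a hypothetical placeholder, and it assumes the builds are Pipeline or freestyle runs, which expose `doStop()`.

```groovy
// Sketch only, for the Jenkins script console: the job name is hypothetical.
import hudson.model.Job
import jenkins.model.Jenkins

def job = Jenkins.instance.getItemByFullName('my-folder/my-pr-job', Job)
if (job == null) {
    println 'Job not found'
    return
}

def newest = job.lastBuild

// Collect every build of this job that is still running and older than the newest one.
def stale = job.builds.findAll { it.isBuilding() && it.number < newest.number }

stale.each { build ->
    println "Stopping ${build.fullDisplayName}"
    // Same effect as clicking the red stop button for that build.
    build.doStop()
}
```

Running something like this from inside a Pipeline would additionally need script approval, since it reaches into Jenkins internals.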
Re-opening, as neither I nor anyone I know has observed this behavior. This may be a bug rather than a feature request.
To reproduce: open a PR, let it start building, push an update to the PR, note that there will be 2 concurrent pipeline runs.
Unless I'm missing something, I don't believe this feature/plugin addresses the original request?
Imagine my pipeline looks like:
...
milestone 100
input "Should I proceed?"
milestone 200
...
Commit A for my master branch comes along, the pipeline runs, and I hang waiting on the input.
Commit B comes up 30 minutes later, runs through the pipeline, and is now hung waiting on the input.
With a busy master branch, this ends up wasting a lot of nodes as each input step sits on the node it's running on.
Ideally, I would mark the input with a milestone tag (or something similar) such that when commit B gets to the input, commit A is aborted automatically.
do you mean running PR build in multibranch should abort if new build for same PR is scheduled?
Not unless you use milestone. What I meant was that
not cancel any builds for a different PR
is irrelevant for multibranch folders, since each PR is its own job, and milestone only pays attention to builds of the current job.
such that when commit B gets to the input, commit A is aborted automatically
This is another possible option for milestone, not currently supported. The initial implementation covers the case that A, B, C, …, M are waiting; when you approve (or reject) A, B…L will be stopped in favor of M, but we are guaranteed to make progress.
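The initial behavior described here can be pictured with a minimal sketch, assuming the milestone step from the pipeline-milestone-step plugin is installed. The stage names, the ordinals 1 and 2, and the `make` commands are illustrative and not taken from any pipeline in this thread; the comments reflect the plugin's behavior as its documentation describes it.

```groovy
// Sketch only: stage names, ordinals and shell commands are hypothetical.
stage('Build') {
    node {
        checkout scm
        sh 'make'
    }
}

// Every build records that it reached this point.
milestone 1

// Several builds may sit here at once, each with its own open prompt.
input message: 'Proceed?'

// When an approved build passes this milestone, older builds that passed
// milestone 1 but not this one are aborted. A newer build that is still
// waiting keeps its prompt and can be approved later.
milestone 2

stage('Integration') {
    node {
        sh 'make integrate'
    }
}
```

Note that nothing here aborts an older build merely because a newer one has reached its prompt, which is the unsupported option discussed above.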
this ends up wasting a lot of nodes as each input step sits on the node it's running on
Then do not run input inside a node block. See the docs for stash.
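A minimal sketch of that advice, assuming a scripted Pipeline; the stash name, the file pattern and the shell commands are made up for illustration:

```groovy
// Sketch only: the stash name, file pattern and shell commands are hypothetical.
node {
    checkout scm
    sh 'make'
    // Save the build output so the agent's executor can be released.
    stash name: 'stuff', includes: 'build/**'
}

// No node block here, so the build does not hold an agent executor while it waits.
input message: 'OK to publish?'

node {
    // This may run on a different agent, so restore the saved files first.
    unstash 'stuff'
    sh 'make publish'
}
```

The cost is that whatever the later node block needs must be stashed explicitly, because the original workspace is not guaranteed to be available.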
This arguably makes the input step dangerous for declarative pipeline users...is there a way to make this experience better for users out of the box?
chrisleck yes there has been some thought on that (cc hrmpw abayer) - some alternative way of expressing input. However, there is no getting around if you want to wait for input and not eat a node, you need to do "work" to ensure you stash and unstash things, which can never really be transparent, as magic isn't real. I wish it was.
Looking at the stock demo (I think this is it)
https://github.com/jenkinsci/workflow-aggregator-plugin/blob/60d8ea3/demo/repo/Jenkinsfile
there seems to be a problem for this use case.
The input step in the Staging stage is holding onto the 'staging-server' lock, which means later builds won't proceed until someone tells the input step to Proceed, or cancels the build. (EDIT: I confirmed this behaviour using the docker-based demo.)
In some cases this is what you want, but if you want the input step to cancel older executions (as suggested by the issue title and description), this won't happen. I suppose you would have to do something like this: https://github.com/cloudbees/jenkins-scripts/blob/master/cancel-builds-same-job.groovy
(Also, I think the milestone 3 should be inside a lock, otherwise I suspect a race condition could lead to older builds being deployed into production after newer builds.)
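One possible reading of these two observations, as a hedged sketch rather than a fix to the demo itself: it assumes the `lock` step from the Lockable Resources plugin, reuses the 'staging-server' resource name mentioned above, and invents 'production-server' and the `deploy.sh` script purely for illustration.

```groovy
// Sketch only: 'production-server' and deploy.sh are hypothetical.
stage('Staging') {
    lock(resource: 'staging-server', inversePrecedence: true) {
        node {
            sh './deploy.sh staging'
        }
    }
    // The prompt is outside the lock, so a waiting build no longer blocks
    // later builds from deploying to staging.
    input message: 'Does staging look good?'
}

stage('Production') {
    lock(resource: 'production-server', inversePrecedence: true) {
        // The milestone is checked while holding the lock, so an older build
        // that acquires the lock after a newer build has passed is aborted
        // here instead of deploying.
        milestone()
        node {
            sh './deploy.sh production'
        }
    }
}
```

The trade-off of moving the prompt outside the lock is that a newer build may redeploy staging while someone is still reviewing the older one.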
seanf is correct: with current tools the best you can do is have multiple commits waiting for input, then use milestones so that if the latest is approved the previous ones will fail... but only if their own input prompts are interacted with afterwards.
A lock will prevent a newer build from entering an input step, and a milestone can only take effect before or after an input step, but multiple input steps can all be reached and sit idle at the same time, requiring input. Can we make input work like a milestone, failing previous ones? And why is this marked Resolved when the case in the original post was not handled?
Alternatively, every input has an 'id' attribute; can I somehow say to cancel all previous inputs with the same id as my current input?