Sam Van Oort: I already tested a retry step and a retry option on the stages where I poll Jira, but this also seems to produce tons of pipeline steps (I can see a lot of XML files under the build's workflow directory).
Unfortunately, I found that my strategy with sh loops is a dead end, because the sh step requires an agent (with node, workspace and executor). Since we might have tens or even hundreds of concurrent builds sitting in a Jira-polling stage for several days, those stages should not use agents.
I had a look at the Cheetah documentation you linked to, and it looks really interesting. I think it might solve my problem IF the durability option could be set at a per-stage level: then I could make my Jira-polling stages PERFORMANCE_OPTIMIZED, which I assume would avoid the StackOverflow problem for while/retry loops. For now I will test the option set at pipeline level (although we need durability for some of the stages).
Btw, I noticed the line "You can force a Pipeline to persist data by pausing it." on the Scaling Pipelines page. Can the pipeline be paused/resumed from within itself (using steps?), sort of like a transaction commit?
Problem is here: we're doing a recursive call to get the logs - https://github.com/jenkinsci/workflow-job-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java#L478
This needs to be rewritten as a while(...) loop that tracks the current search candidate inside the loop. However, I understand that Jesse Glick is doing some significant rewrites of log storage at the moment, so I'll hold off on touching the code to avoid collisions.
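To illustrate the kind of rewrite meant here, a minimal sketch of turning a recursive ancestor search into a while loop. This is not the actual WorkflowRun code: the Node class, its parent field, and both method names are hypothetical stand-ins, assuming the lookup walks a singly linked chain until some candidate matches.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for a chain of nodes; NOT the real workflow-job-plugin types.
final class Node {
    final int id;
    final Node parent; // null at the start of the chain

    Node(int id, Node parent) {
        this.id = id;
        this.parent = parent;
    }
}

final class IterativeSearch {
    // Recursive form: one stack frame per ancestor, so a very long chain
    // can end in a StackOverflowError.
    static Node findRecursive(Node candidate, Predicate<Node> matches) {
        if (candidate == null) {
            return null;
        }
        if (matches.test(candidate)) {
            return candidate;
        }
        return findRecursive(candidate.parent, matches);
    }

    // Iterative form: the current search candidate is a local variable
    // updated inside the loop, so stack depth stays constant.
    static Node findIterative(Node start, Predicate<Node> matches) {
        Node candidate = start;
        while (candidate != null) {
            if (matches.test(candidate)) {
                return candidate;
            }
            candidate = candidate.parent;
        }
        return null;
    }

    public static void main(String[] args) {
        // Build a long chain, as a pipeline with very many steps might produce.
        Node tip = null;
        for (int i = 0; i < 200_000; i++) {
            tip = new Node(i, tip);
        }
        Node found = findIterative(tip, n -> n.id == 0);
        System.out.println(found != null ? "found node " + found.id : "no match");
    }
}
```

Both methods return the same result on short chains; only the loop version is safe when the chain grows with the number of steps in the build.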
Side note, this is explicitly a "don't do that" in my recent Jenkins World talk :-P