-
Bug
-
Resolution: Fixed
-
Major
-
Jenkins v2.89.2
Underlying issue
Reactor run method (https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L115) only catches TunnelException.
If another exception occurs in the task run method,
- done is set to true
- fatal is not set
- but pending variable is not decremented
This makes Reactor execute method wait for ever (https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L266) as pending will never reach 0.
How can task.run throw something else than a TunnelException?
task run method (https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L207) catches exceptions from the runTask method and wraps them in a TunnelException.
But listener methods are not protected by try/catch blocks (except onTaskCompleted for technical reasons)
So if an exception occurs in any of the onTaskStarted or onTaskFailed, it won't be wrapped in a TunnelException, making the problem arise.
For milestones, another type of run method is used (https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L176), calling the listener's onAttained method, unprotected by try/catch.
Even if it can make sense not to protect listener methods, I am wondering if this is a good idea to have different behaviors depending on the listener method as one of them is in fact protected?
Where does a listener throw an exception?
InitReactorRunner provides a listener implementation calling InitReactorRunner's onInitMilestoneAttained method, when onAttained is called on a milestone of type InitMilestone.
onInitMilestoneAttained implementation in Jenkins class (https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/model/Jenkins.java#L1092) can throw if ExtensionList.lookup(ExtensionFinder.class).getComponents(); throws.
Exceptions can occur through a static call to a plugin class in an exception point if this same plugin will not start for a dependency reason
- example of this use case can be found here: https://github.com/jenkinsci/build-failure-analyzer-plugin/pull/80/commits/343a9b21c60a411b6cc7e48b05a9936e6f0c937b#diff-475044ef0a13791f000702f5f3e09874L83, which throws a AssertionException when plugin is null in the static field initialization.
- Symptom (in unit tests) can be seen there: https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fclaim-plugin/detail/PR-39/1/ => InjectedTests never run. Note that this was not the case with Jenkins 2.7.3.
Note that even if this seems an advanced scenario, other implementations of _ReactorListener_s could be provided through https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/InitReactorRunner.java#L63, and they might provide implementations which throw exception.
How to correct?
I would have proposed a fix, but I'm not sure of the direction to take:
- enhance the Reactor component
- by protecting listeners calls ?
- by adding a way for listener methods to mark the task as fatal for milestones?
- work around the limitations on listeners error-handling in Jenkins core code
- by try catching, logging, and swallowing the exception (not sure of the consequences) ?
- or by adding back to the reactor a task designed to throw the exception in case of errors (will occur at some unknown point in the future, but at least will stop Jenkins startup) ?
- just fix all plugins ?
- some other way?
[JENKINS-48725] Jenkins startup workflow can deadlock if exceptions occur during extensions lookup
Description |
Original:
*Underlying issue* _Reactor_ _run_ method ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L115]) only catches _TunnelException_. If another exception occurs in the task run method, * _done_ is set to true * _fatal_ is not set * but _pending_ variable +is not decremented+ This makes _Reactor_ _execute_ method wait for ever ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L266)] as +_pending_ will never reach 0+. ---- *How can task.run throw something else than a TunnelException?* _task_ _run_ method ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L207)] catches exceptions from the _runTask_ method and wraps them in a _TunnelException_. But _listener_ methods are not protected by try/catch blocks (except _onTaskCompleted_ for technical reasons) So if an exception occurs in any of the _onTaskStarted_ or _onTaskFailed_, it won't be wrapped in a _TunnelException_, making the problem arise. For milestones, another type of _run_ method is used ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L176)], calling the listener's _onAttained_ method, unprotected by try/catch. (?) Even if it can make sense not to protect listener methods, I am wondering if this is a good idea to have different behaviors depending on the listener method as one of them is in fact protected? ---- *Where does a listener throw an exception?* _InitReactorRunner_ provides a listener implementation calling _InitReactorRunner_'s _onInitMilestoneAttained_ method, when _onAttained_ is called on a milestone of type _InitMilestone_. [https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/InitReactorRunner.java#L83] _onInitMilestoneAttained_ implementation in Jenkins class ([https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/model/Jenkins.java#L1092)] can throw if _ExtensionList.lookup(ExtensionFinder.class).getComponents();_ throws. Exceptions can occur through a static call to a plugin class in an exception point if this same plugin will not start for a dependency reason * example of this use case can be found here: [https://github.com/jenkinsci/build-failure-analyzer-plugin/pull/80/commits/343a9b21c60a411b6cc7e48b05a9936e6f0c937b#diff-475044ef0a13791f000702f5f3e09874L83|https://github.com/jenkinsci/build-failure-analyzer-plugin/pull/80/commits/343a9b21c60a411b6cc7e48b05a9936e6f0c937b#diff-475044ef0a13791f000702f5f3e09874L83)], which throws a AssertionException when plugin is null in the static field initialization. * Symptom (in unit tests) can be seen there: [https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fclaim-plugin/detail/PR-39/1/] => InjectedTests never run. Note that this was not the case with Jenkins 2.7.3. (?) Note that even if this seems an advanced scenario, other implementations of _ReactorListener_s could be provided through [https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/InitReactorRunner.java#L63,] and they might be unprotected also. ---- *How to correct?* I would have proposed a fix, but I'm not sure of the direction to take: * enhance the Reactor component ** by protecting listeners calls ? ** by adding a way for listener methods to mark the task as fatal for milestones? * work around the limitations on listeners error-handling in Jenkins core code ** by try catching, logging, and swallowing the exception (not sure of the consequences) ? ** or by adding back to the reactor a task designed to throw the exception in case of errors (will occur at some unknown point in the future, but at least will stop Jenkins startup) ? * just fix all plugins ? * some other way? |
New:
*Underlying issue* _Reactor_ _run_ method ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L115]) only catches _TunnelException_. If another exception occurs in the task run method, * _done_ is set to true * _fatal_ is not set * but _pending_ variable +is not decremented+ This makes _Reactor_ _execute_ method wait for ever ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L266)] as +_pending_ will never reach 0+. ---- *How can task.run throw something else than a TunnelException?* _task_ _run_ method ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L207)] catches exceptions from the _runTask_ method and wraps them in a _TunnelException_. But _listener_ methods are not protected by try/catch blocks (except _onTaskCompleted_ for technical reasons) So if an exception occurs in any of the _onTaskStarted_ or _onTaskFailed_, it won't be wrapped in a _TunnelException_, making the problem arise. For milestones, another type of _run_ method is used ([https://github.com/jenkinsci/lib-task-reactor/blob/master/src/main/java/org/jvnet/hudson/reactor/Reactor.java#L176)], calling the listener's _onAttained_ method, unprotected by try/catch. (?) Even if it can make sense not to protect listener methods, I am wondering if this is a good idea to have different behaviors depending on the listener method as one of them is in fact protected? ---- *Where does a listener throw an exception?* _InitReactorRunner_ provides a listener implementation calling _InitReactorRunner_'s _onInitMilestoneAttained_ method, when _onAttained_ is called on a milestone of type _InitMilestone_. [https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/InitReactorRunner.java#L83] _onInitMilestoneAttained_ implementation in Jenkins class ([https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/model/Jenkins.java#L1092)] can throw if _ExtensionList.lookup(ExtensionFinder.class).getComponents();_ throws. Exceptions can occur through a static call to a plugin class in an exception point if this same plugin will not start for a dependency reason * example of this use case can be found here: [https://github.com/jenkinsci/build-failure-analyzer-plugin/pull/80/commits/343a9b21c60a411b6cc7e48b05a9936e6f0c937b#diff-475044ef0a13791f000702f5f3e09874L83|https://github.com/jenkinsci/build-failure-analyzer-plugin/pull/80/commits/343a9b21c60a411b6cc7e48b05a9936e6f0c937b#diff-475044ef0a13791f000702f5f3e09874L83)], which throws a AssertionException when plugin is null in the static field initialization. * Symptom (in unit tests) can be seen there: [https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fclaim-plugin/detail/PR-39/1/] => InjectedTests never run. Note that this was not the case with Jenkins 2.7.3. (?) Note that even if this seems an advanced scenario, other implementations of _ReactorListener_s could be provided through [https://github.com/jenkinsci/jenkins/blob/a82a47a8f2a90727d873c637c99aa0d087500dfe/core/src/main/java/jenkins/InitReactorRunner.java#L63,] and they might provide implementations which throw exception. ---- *How to correct?* I would have proposed a fix, but I'm not sure of the direction to take: * enhance the Reactor component ** by protecting listeners calls ? ** by adding a way for listener methods to mark the task as fatal for milestones? * work around the limitations on listeners error-handling in Jenkins core code ** by try catching, logging, and swallowing the exception (not sure of the consequences) ? ** or by adding back to the reactor a task designed to throw the exception in case of errors (will occur at some unknown point in the future, but at least will stop Jenkins startup) ? * just fix all plugins ? * some other way? |
Issue Type | Original: Improvement [ 4 ] | New: Bug [ 1 ] |
Labels | Original: exception robustness | New: exception lib-task-reactor robustness |
Assignee | New: Arnaud TAMAILLON [ greybird ] |
Status | Original: Open [ 1 ] | New: In Progress [ 3 ] |
The issue description looks correct to me. I think the best way would be to fix the Task Reactor Component (at least by wrapping exceptions in listeners).
I have some changes pending there (component refresh + extra API enhancements for SECURITY-667). If you submit a PR, please ping me there