Bug
Resolution: Fixed
Major
None
mansion-client seems to be removing nodes shortly after they go offline. Common stack traces include:
https://gist.github.com/recampbell/fc711322922a99a7b3da
and more frequently:
13:43:24 FATAL: null
13:43:24 java.lang.NullPointerException
13:43:24 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:549)
13:43:24 at hudson.model.Run.execute(Run.java:1665)
13:43:24 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
13:43:24 at hudson.model.ResourceController.execute(ResourceController.java:88)
13:43:24 at hudson.model.Executor.run(Executor.java:246)
The basic problem seems to be that hudson.model.Executor#isIdle only checks whether Executor#executable is null, and that field is assigned in a synchronized block in Executor#run:217. A caller that sees isIdle return true therefore has no guarantee that the executor is still idle by the time it acts on that result.
Fundamentally, we can't check for idleness and atomically ensure that the idle state still holds when the node is removed. Any cloud plugin likely suffers from this problem, but mansion-cloud is especially exposed because it adds and removes slaves so quickly.
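To make the race concrete, here is a minimal, self-contained sketch of the check-then-act problem and one way to close it. Everything in it is hypothetical: `IdleCheckSketch`, `Slot`, `tryAssign` and `tryRetire` are illustration names, not Jenkins or mansion-cloud APIs, and this is not necessarily how the issue was actually fixed. The idea is only that "is it idle?" and "remove it" have to be a single atomic transition, so whichever side loses the race gets a clear failure instead of a NullPointerException later.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for an executor slot; NOT the hudson.model.Executor class. */
public class IdleCheckSketch {
    enum State { IDLE, BUSY, RETIRED }

    static final class Slot {
        private final AtomicReference<State> state = new AtomicReference<>(State.IDLE);

        /** Racy style: only a snapshot, possibly stale by the time the caller acts on it. */
        boolean isIdle() {
            return state.get() == State.IDLE;
        }

        /** Scheduler side: claim the slot for a build only if it is still idle. */
        boolean tryAssign() {
            return state.compareAndSet(State.IDLE, State.BUSY);
        }

        /** Cloud side: retire the slot only if it is still idle, in one atomic step. */
        boolean tryRetire() {
            return state.compareAndSet(State.IDLE, State.RETIRED);
        }
    }

    public static void main(String[] args) {
        Slot slot = new Slot();

        boolean lookedIdle = slot.isIdle();   // the cloud checks for idleness...
        boolean assigned = slot.tryAssign();  // ...a build is assigned in between...
        boolean retired = slot.tryRetire();   // ...so the atomic retire must refuse.

        System.out.println("looked idle: " + lookedIdle);
        System.out.println("build assigned: " + assigned);
        System.out.println("retire succeeded: " + retired);
    }
}
```

In this sketch a cloud implementation would call `tryRetire()` and remove the node only when it returns true, rather than calling `isIdle()` first and removing the node afterwards.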
Related to CloudBees ZD-19508
is related to: JENKINS-23764 MansionComputer removed before build can run (Resolved)