- New Feature
- Resolution: Unresolved
- Minor
This is a feature idea that I'm willing to implement but I'd like to hear maintainers' thoughts on this first.
Case:
I manage slaves with Puppet. Bringing them up is easy: configure, then run java. Shutting down (say, for a reboot or VM teardown) is not so easy, because I'm very likely to kill a running job. So I have a bash loop that counts the java processes, not including the swarm process itself. If the count is 0, I can shut down. But that's crude and unreliable: not all subprocesses will be java, and a new job can start in the time it takes to kill the swarm instance.
The proper way is of course to interact with the master: mark the node offline, wait for it to go idle, then reboot. But this requires the swarm nodes to have extensive knowledge of the master, which seems to contradict the purpose of swarm (it's managed from the slave side without any master interaction, and thus should be able to dynamically come and go).
Things I'm considering:
- java -jar swarm.jar SOME COMMAND
- HTTP port in the slave JVM, i.e. "curl http://localhost:1234/control/SOME_COMMAND"
- Signals, i.e. "kill -11 SWARM PID"
SOME COMMAND may be "go offline", "shutdown", "block till idle" etc., but may also be something that returns status, e.g. "is idle?", "is offline?" (obviously not for the Signals approach).
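To make the HTTP option more concrete, here is a rough sketch of what a local control endpoint could look like, using only the JDK's built-in `com.sun.net.httpserver.HttpServer`. Everything in it is an assumption for illustration: the class name `SwarmControlSketch`, the command names `go_offline`, `is_offline` and `is_idle`, the port 1234, and the `jobStarted`/`jobFinished` hooks do not exist in the swarm client. In particular, `go_offline` here only flips a local flag; a real implementation would still have to tell the master to stop scheduling jobs on the node.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical control endpoint; none of these names exist in the swarm client. */
public class SwarmControlSketch {
    // Local state only; a real version would have to get this from the slave agent.
    private final AtomicBoolean offline = new AtomicBoolean(false);
    private final AtomicInteger runningJobs = new AtomicInteger(0);

    // Assumed hooks that the slave would call around each job.
    public void jobStarted() { runningJobs.incrementAndGet(); }
    public void jobFinished() { runningJobs.decrementAndGet(); }

    /** Maps a command name to a plain-text reply; unknown commands return null. */
    public String handle(String command) {
        switch (command) {
            case "go_offline":
                offline.set(true); // placeholder: no master interaction happens here
                return "ok";
            case "is_offline":
                return String.valueOf(offline.get());
            case "is_idle":
                return String.valueOf(runningJobs.get() == 0);
            default:
                return null;
        }
    }

    /** Starts an HTTP server bound to the loopback interface only. */
    public HttpServer start(int port) throws IOException {
        InetSocketAddress address =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
        HttpServer server = HttpServer.create(address, 0);
        server.createContext("/control/", exchange -> {
            String path = exchange.getRequestURI().getPath();
            String reply = handle(path.substring("/control/".length()));
            int status = (reply == null) ? 404 : 200;
            byte[] body = (reply == null ? "unknown command" : reply)
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        new SwarmControlSketch().start(1234);
    }
}
```

With this running, "curl http://localhost:1234/control/is_idle" would print "true" or "false", which a Puppet-driven shutdown script could poll after sending "go_offline". Binding to loopback keeps the port off the network, but any local user could still send commands, which is one of the problems with this approach.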
There are definitely some problems with these solutions, so I'm curious what others think. It's also possible that I'm overlooking a simpler way.