Status: Closed
When a connection to a slave is unexpectedly broken during a build, an Xvfb process started by the plugin is not killed. After the slave is reconnected, subsequent builds on it fail immediately because the plugin tries to start a new Xvfb process on the same display as the old one.
It would be helpful if the plugin could kill leftover Xvfb processes before trying to start a new one. There could be a configuration option to kill all Xvfb processes. More robustly, it could probably find just the problematic one by its command line.
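To illustrate the "find it by command line" idea, here is a minimal Python sketch. It is not the plugin's code. It assumes a Unix slave where `ps -eo pid,args` is available and where Xvfb was started with the display as a separate argument (for example `Xvfb :99 ...`). The function names `find_xvfb_pids` and `kill_stale_xvfb` and the sample `ps` output are made up for this example.

```python
import os
import signal
import subprocess


def find_xvfb_pids(ps_output: str, display: int) -> list[int]:
    """Return PIDs of Xvfb processes whose command line names the given display.

    `ps_output` is expected to look like the output of `ps -eo pid,args`.
    """
    target = f":{display}"
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # header line or something unparseable
        pid, argv = int(fields[0]), fields[1:]
        if os.path.basename(argv[0]) != "Xvfb":
            continue  # some other program, e.g. `grep Xvfb`
        # Compare whole arguments so that display 9 does not match ":99".
        if target in argv[1:]:
            pids.append(pid)
    return pids


def kill_stale_xvfb(display: int) -> list[int]:
    """Send SIGTERM to every Xvfb found on `display`; return the PIDs targeted."""
    ps = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    pids = find_xvfb_pids(ps.stdout, display)
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # process exited between listing and killing
    return pids


# Hypothetical `ps -eo pid,args` output used to exercise the parser.
SAMPLE_PS = """  PID COMMAND
 1201 /usr/bin/Xvfb :99 -screen 0 1024x768x24
 1340 Xvfb :9 -screen 0 1024x768x24
 1502 grep Xvfb :99
"""
```

With the sample above, `find_xvfb_pids(SAMPLE_PS, 99)` selects only the process on display 99 and leaves the one on display 9 alone. Matching on the display argument is what makes this narrower than a blanket "kill all Xvfb" option.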
Thanks for reporting this; I would probably never have come across it. I've changed the way the plugin handles Xvfb shutdown: it now detects a communication failure on shutdown and keeps a list of Xvfb processes that need to be killed when the offending node comes back online. So when the node is back online, the plugin will attempt to kill the zombie Xvfb process and will delete the temporary frame buffer directory.
This does not help in case of a master failure, because the list is not persisted, but if your master fails I guess you have greater problems than zombie Xvfb processes.
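The bookkeeping described here can be sketched roughly as follows. This is an illustration in Python, not the plugin's actual implementation. The names `Zombie` and `ZombieRegistry` and the injected `kill` and `remove_dir` callables are hypothetical, and a `ConnectionError` stands in for whatever error the real remoting channel raises when the slave is unreachable.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Zombie:
    """An Xvfb instance that could not be shut down (hypothetical record)."""
    display: int
    fb_dir: str  # temporary frame buffer directory


class ZombieRegistry:
    """In-memory list of Xvfb processes to clean up once a node is back.

    Nothing is persisted, so the list is lost if the master itself goes down.
    """

    def __init__(
        self,
        kill: Callable[[str, int], None],
        remove_dir: Callable[[str, str], None],
    ) -> None:
        self._pending: dict[str, list[Zombie]] = defaultdict(list)
        self._kill = kill              # assumed to raise ConnectionError if the node is unreachable
        self._remove_dir = remove_dir  # same assumption

    def _cleanup(self, node: str, zombie: Zombie) -> None:
        try:
            self._kill(node, zombie.display)
        except ProcessLookupError:
            pass  # already gone; killing is best effort
        self._remove_dir(node, zombie.fb_dir)

    def shutdown(self, node: str, zombie: Zombie) -> None:
        """Normal end-of-build shutdown; remember the process if the node is unreachable."""
        try:
            self._cleanup(node, zombie)
        except ConnectionError:
            self._pending[node].append(zombie)

    def on_node_online(self, node: str) -> None:
        """Called when `node` reconnects: retry cleanup for everything remembered."""
        for zombie in self._pending.pop(node, []):
            try:
                self._cleanup(node, zombie)
            except ConnectionError:
                self._pending[node].append(zombie)  # still unreachable, keep for next time

    def pending(self, node: str) -> list[Zombie]:
        return list(self._pending.get(node, []))
```

In this sketch, a failed `shutdown("slave-1", Zombie(99, "/tmp/xvfb-99"))` leaves one pending entry, and a later `on_node_online("slave-1")` retries the kill and the directory removal and clears the entry.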
I've tested this by killing the ssh connection between the master and the slave. I would be very grateful if you could test it in your setup as well.
You'll need to use the experimental update site (http://updates.jenkins-ci.org/experimental/update-center.json), configured in the Advanced tab of the Plugin manager; the released version is 1.0.9-beta-2.
I was able to install 1.0.9-beta-2 manually and it works exactly as you describe. Thanks for the quick fix. BTW, I doubt it's specific to this plugin, but I have not yet been able to force Jenkins to update its plugin list. I changed the update URL to the experimental one, but neither clicking the "Check Now" button nor restarting Jenkins caused it to see Xvfb 1.0.9-beta-2. Jenkins definitely updates the list itself periodically but having to wait some unknown amount of time after changing the URL isn't very practical. Do you have any idea why it behaves like this?
Code changed in jenkins
User: Zoran Regvart
JENKINS-20758 Xvfb processes remain after slave disconnect