Bug
Resolution: Fixed
Minor
None
Ubuntu Linux 10.04
When Jenkins restarts itself (for instance, after plugin updates), the process name changes from java to exe. As a result, the /etc/init.d/jenkins stop script fails to find the main Jenkins process, and process monitoring (Nagios etc.) becomes more complicated.
I have observed this on a number of Jenkins/Hudson versions but have only just got around to investigating it. I think I've got a good handle on what is going on (and a suggested fix, although no patch).
In normal operation (without builds in progress) I can see two Jenkins processes, "daemon" and "java". pgrep finds those too.
richm@royalcounty:~$ ps -ef | grep jenkins
jenkins 9782 1 0 13:29 ? 00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -jar /usr/share/jenkins/jenkins.war --webroot=/var/run/jenkins/war --httpPort=8082 --ajp13Port=-1
jenkins 9783 9782 99 13:29 ? 00:00:08 /usr/bin/java -jar /usr/share/jenkins/jenkins.war --webroot=/var/run/jenkins/war --httpPort=8082 --ajp13Port=-1
richm 10340 21318 0 13:29 pts/0 00:00:00 grep jenkins
richm@royalcounty:~$ pgrep -u jenkins java
9783
richm@royalcounty:~$ pgrep -u jenkins exe
richm@royalcounty:~$
Now if I force Jenkins to restart itself (for instance, by upgrading a plugin), ps still shows Jenkins with the same process ID, but pgrep now finds it under the new name exe:
richm@royalcounty:~$ ps -ef | grep jenkins
jenkins 9782 1 0 13:29 ? 00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -jar /usr/share/jenkins/jenkins.war --webroot=/var/run/jenkins/war --httpPort=8082 --ajp13Port=-1
jenkins 9783 9782 99 13:39 ? 00:00:54 /usr/bin/java -jar /usr/share/jenkins/jenkins.war --webroot=/var/run/jenkins/war --httpPort=8082 --ajp13Port=-1
richm 26138 21318 0 13:39 pts/0 00:00:00 grep jenkins
richm@royalcounty:~$ pgrep -u jenkins java
richm@royalcounty:~$ pgrep -u jenkins exe
9783
richm@royalcounty:~$
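The difference between the two views above can be checked directly: ps -ef prints the argument vector, while pgrep without -f matches the kernel's short process name. The following is a minimal sketch, not Jenkins code, that reads both from /proc. The class name ProcNames and its methods are made up for illustration, and it assumes a Linux /proc layout with comm and cmdline files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ProcNames {

    /** Short process name kept by the kernel; pgrep matches this by default. */
    static String comm(Path procDir) throws IOException {
        byte[] raw = Files.readAllBytes(procDir.resolve("comm"));
        return new String(raw, StandardCharsets.UTF_8).trim();
    }

    /** Argument vector, stored NUL-separated; this is what ps -ef prints. */
    static List<String> cmdline(Path procDir) throws IOException {
        byte[] raw = Files.readAllBytes(procDir.resolve("cmdline"));
        List<String> args = new ArrayList<>();
        for (String arg : new String(raw, StandardCharsets.UTF_8).split("\0")) {
            if (!arg.isEmpty()) { // note: this also drops genuinely empty arguments
                args.add(arg);
            }
        }
        return args;
    }

    public static void main(String[] args) throws IOException {
        // Pass a pid as the first argument, or inspect the current JVM.
        Path procDir = Paths.get("/proc", args.length > 0 ? args[0] : "self");
        System.out.println("comm    = " + comm(procDir));
        System.out.println("cmdline = " + cmdline(procDir));
    }
}
```

Run against the restarted Jenkins pid, I would expect comm to read exe while cmdline still starts with /usr/bin/java, which matches the ps and pgrep output above.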
I am pretty sure that the problem lies in UnixLifecycle.java, in the call to LIBC.execv:
// exec to self
LIBC.execv(
    Daemon.getCurrentExecutable(),
    new StringArray(args.toArray(new String[args.size()])));
The Daemon.getCurrentExecutable() call returns:
String name = "/proc/" + pid + "/exe";
On Linux this is a symbolic link to the real java executable. If the /proc/<pid>/exe path is passed to execv, the process name is set to exe. The execv call should instead be performed using the target of the symbolic link, which would keep the name found by pgrep as java.
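As a rough sketch of the suggested fix (not a patch against akuma or Jenkins), the symlink could be resolved before the path is handed to execv. The class ExecutableResolver and the method resolveExecutable are hypothetical names. The sketch uses java.nio.file, which needs Java 7 or later; on Java 6, File.getCanonicalPath() would be the nearest equivalent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only; not part of akuma or Jenkins.
final class ExecutableResolver {

    /**
     * Returns the real path behind a symlink such as /proc/<pid>/exe,
     * or the input unchanged if it is not a symbolic link.
     */
    static String resolveExecutable(String procExePath) throws IOException {
        Path link = Paths.get(procExePath);
        if (Files.isSymbolicLink(link)) {
            // toRealPath() follows symlinks, e.g. to .../jre/bin/java
            return link.toRealPath().toString();
        }
        return procExePath;
    }

    public static void main(String[] args) throws IOException {
        // Prints the resolved path of the current JVM binary on Linux.
        System.out.println(resolveExecutable("/proc/self/exe"));
    }
}
```

With something like this in place, the first argument to LIBC.execv would be resolveExecutable(Daemon.getCurrentExecutable()) rather than the raw /proc path, so the basename seen by the kernel would be java again. Whether that belongs in Daemon or in UnixLifecycle is the open question below.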
richm@royalcounty:~$ sudo ls -l /proc/9783/exe
lrwxrwxrwx 1 jenkins nogroup 0 2011-03-31 13:39 /proc/9783/exe -> /usr/lib/jvm/java-6-openjdk/jre/bin/java
Note that this also breaks the /etc/init.d/jenkins stop command, because it can no longer find the java process owned by user jenkins.
It isn't clear to me whether the correct fix lies in the Daemon class (from akuma-1.4.jar) or whether it would be appropriate to look up the symlink target in UnixLifecycle itself.