[JENKINS-4611] hudson.proc.RemoteProc.kill() does not work - Jenkins Jira

Type: Bug
Resolution: Fixed
Priority: Critical
Component/s: core
Labels: None
Environment: Platform: All, OS: All

A few days ago we had a Mercurial server outage, with the result that all Hg
processes running at the time hung. (For technical reasons relating to network
config, the connections do not time out - they just hang forever.)

For those jobs running on master, the Hg polling was killed after an hour,
thanks to the fix for issue #4461.

But for those jobs running on a slave,
SCMTrigger.DescriptorImpl.queue.inProgress shows them still active, even though
their polling log claims they were killed after an hour. A thread dump on master
confirms this:

"SCM polling for hudson.model.FreeStyleProject@164e3e2[apitest]" prio=10 tid=0xa0e0a400 nid=0x746e in Object.wait() [0xf77ff000..0xf77ff554]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at hudson.remoting.Request$1.get(Request.java:185)
    - locked <0x69424868> (a hudson.remoting.UserRequest)
    at hudson.remoting.Request$1.get(Request.java:165)
    at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
    at hudson.Proc$RemoteProc.join(Proc.java:290)
    at hudson.plugins.mercurial.MercurialSCM.joinWithTimeout(MercurialSCM.java:233)
    at hudson.plugins.mercurial.MercurialSCM.pollChanges(MercurialSCM.java:192)
    at hudson.model.AbstractProject.pollSCMChanges(AbstractProject.java:1032)
    at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:317)
    at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:344)
    at hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:114)

It seems that even though proc.kill() was called in another thread, proc.join()
is still waiting.

Looking at the implementation, it is no wonder kill() does not work: the
Future returned by Request.callAsync has a cancel() that just returns false
and does nothing!

Shouldn't it call channel.send(new Cancel(id)) or abort(...) or something like this?
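
To illustrate the suspicion, here is a minimal, self-contained sketch of the difference between a Future whose cancel() is a no-op and one that actually cancels. It does not use the real hudson.remoting classes: NoOpCancelFuture, CancellingFuture, the sendCancel callback (a stand-in for something like channel.send(new Cancel(id))), and waiterReleasedByCancel are all hypothetical names invented for this example.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelSketch {

    /** Mimics the reported behaviour: cancel() returns false and wakes nobody. */
    static class NoOpCancelFuture<V> implements Future<V> {
        private final CompletableFuture<V> delegate = new CompletableFuture<>();
        @Override public boolean cancel(boolean mayInterrupt) { return false; }
        @Override public boolean isCancelled() { return false; }
        @Override public boolean isDone() { return delegate.isDone(); }
        @Override public V get() throws InterruptedException, ExecutionException {
            return delegate.get();
        }
        @Override public V get(long t, TimeUnit u)
                throws InterruptedException, ExecutionException, TimeoutException {
            return delegate.get(t, u);
        }
    }

    /** Hypothetical fix: cancel() notifies the remote side and releases waiters. */
    static class CancellingFuture<V> implements Future<V> {
        private final CompletableFuture<V> delegate = new CompletableFuture<>();
        // Assumption: stands in for sending a cancel command over the channel.
        private final Runnable sendCancel;
        CancellingFuture(Runnable sendCancel) { this.sendCancel = sendCancel; }
        @Override public boolean cancel(boolean mayInterrupt) {
            if (delegate.isDone()) return false;
            sendCancel.run();
            return delegate.cancel(mayInterrupt);
        }
        @Override public boolean isCancelled() { return delegate.isCancelled(); }
        @Override public boolean isDone() { return delegate.isDone(); }
        @Override public V get() throws InterruptedException, ExecutionException {
            return delegate.get();
        }
        @Override public V get(long t, TimeUnit u)
                throws InterruptedException, ExecutionException, TimeoutException {
            return delegate.get(t, u);
        }
    }

    /** Returns true if a thread blocked in get() was released by cancel(). */
    static boolean waiterReleasedByCancel(Future<?> f) throws InterruptedException {
        AtomicBoolean released = new AtomicBoolean(false);
        Thread waiter = new Thread(() -> {
            try {
                f.get(); // plays the role of proc.join()
            } catch (CancellationException e) {
                released.set(true);
            } catch (Exception e) {
                // not expected in this sketch
            }
        });
        waiter.setDaemon(true); // a stuck waiter must not keep the JVM alive
        waiter.start();
        Thread.sleep(100);      // give the waiter time to block
        f.cancel(true);         // plays the role of proc.kill() from another thread
        waiter.join(500);
        return released.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no-op cancel releases waiter: "
                + waiterReleasedByCancel(new NoOpCancelFuture<>()));
        System.out.println("real cancel releases waiter:  "
                + waiterReleasedByCancel(new CancellingFuture<>(() -> { })));
    }
}
```

With the no-op variant the waiting thread stays parked in get(), which matches the thread dump above. In the second variant the waiter is released with a CancellationException as soon as cancel() is called.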

Issue Links: is blocking JENKINS-4461 "Hg polling can hang indefinitely" (Closed)

Assignee: Unassigned
Reporter: Jesse Glick
Votes: 0
Watchers: 1
Created: 2009-10-06 09:32
Updated: 2011-02-10 19:07
Resolved: 2009-11-06 10:34
