JENKINS-4611

hudson.proc.RemoteProc.kill() does not work

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: core
    • Labels: None
    • Environment: Platform: All, OS: All

      A few days ago we had a Mercurial server outage, with the result that all Hg
      processes running at the time hung. (For technical reasons relating to network
      config, the connections do not time out - they just hang forever.)

      For those jobs running on master, the Hg polling was killed after an hour due to
      issue #4461.

      But for those jobs running on a slave,
      SCMTrigger.DescriptorImpl.queue.inProgress shows them still active, even though
      their polling log claims they were killed after an hour. A thread dump on master
      confirms this:

      "SCM polling for hudson.model.FreeStyleProject@164e3e2[apitest]" prio=10 tid=0xa0e0a400 nid=0x746e in Object.wait() [0xf77ff000..0xf77ff554]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:485)
              at hudson.remoting.Request$1.get(Request.java:185)
              - locked <0x69424868> (a hudson.remoting.UserRequest)
              at hudson.remoting.Request$1.get(Request.java:165)
              at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
              at hudson.Proc$RemoteProc.join(Proc.java:290)
              at hudson.plugins.mercurial.MercurialSCM.joinWithTimeout(MercurialSCM.java:233)
              at hudson.plugins.mercurial.MercurialSCM.pollChanges(MercurialSCM.java:192)
              at hudson.model.AbstractProject.pollSCMChanges(AbstractProject.java:1032)
              at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:317)
              at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:344)
              at hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:114)

      It seems that even though proc.kill() was called in another thread, proc.join()
      is still waiting.

      Looking at the implementation, it is no wonder kill() does not work:
      Request.callAsynch's Future.cancel just returns false and does nothing!

      Shouldn't it call channel.send(new Cancel(id)) or abort(...) or something like this?
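
The effect described above can be reproduced without any Hudson code. The following is only a minimal sketch under stated assumptions: `CancelSketch`, `PendingCall` and `joinerReleased` are hypothetical names invented for illustration, and nothing here is the actual hudson.remoting implementation. It contrasts a Future whose cancel() just returns false with one that marks itself cancelled and wakes any thread blocked in get().

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CancelSketch {

    /** Hypothetical stand-in for a pending remote call; it never completes on its own. */
    static class PendingCall implements Future<Integer> {
        private final boolean cancelWorks;
        private boolean done;
        private boolean cancelled;

        PendingCall(boolean cancelWorks) {
            this.cancelWorks = cancelWorks;
        }

        @Override
        public synchronized boolean cancel(boolean mayInterruptIfRunning) {
            if (!cancelWorks || done) {
                return false; // the behaviour the report complains about: nothing happens
            }
            cancelled = true;
            done = true;
            notifyAll(); // wake every thread blocked in get()
            return true;
        }

        @Override
        public synchronized boolean isCancelled() {
            return cancelled;
        }

        @Override
        public synchronized boolean isDone() {
            return done;
        }

        @Override
        public synchronized Integer get() throws InterruptedException {
            while (!done) {
                wait(); // same kind of Object.wait() seen in the thread dump
            }
            throw new CancellationException(); // in this sketch, done implies cancelled
        }

        @Override
        public synchronized Integer get(long timeout, TimeUnit unit)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            while (!done) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    throw new TimeoutException();
                }
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            }
            throw new CancellationException();
        }
    }

    /** Returns true if a thread blocked in get() has finished within waitMillis of cancel(). */
    static boolean joinerReleased(boolean cancelWorks, long waitMillis)
            throws InterruptedException {
        PendingCall call = new PendingCall(cancelWorks);
        Thread joiner = new Thread(() -> {
            try {
                call.get(); // plays the role of proc.join()
            } catch (CancellationException | InterruptedException e) {
                // released: the waiter is no longer stuck
            }
        });
        joiner.setDaemon(true); // a stuck waiter must not keep the JVM alive
        joiner.start();
        call.cancel(true); // plays the role of proc.kill() from another thread
        joiner.join(waitMillis);
        return !joiner.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("no-op cancel releases joiner:   " + joinerReleased(false, 300));
        System.out.println("working cancel releases joiner: " + joinerReleased(true, 300));
    }
}
```

With the no-op cancel the waiting thread stays parked in wait(), which matches the thread dump. With the second variant it is released promptly. A real fix would presumably also need to tell the remote side to stop the process, which this sketch does not attempt.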

            Assignee: Unassigned
            Reporter: Jesse Glick (jglick)
            Votes: 0
            Watchers: 1
