Jenkins / JENKINS-68656

SSH Slaves Plugin Deadlock while spinning up a new agent


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • None
    • Environment: Jenkins 2.332.3, OpenJDK 11.0.15, running on Ubuntu 20.04
      SSH Slaves Plugin 1.814.vc82988f54b_10 (tested with 1.33.0 as well)
      Anka Build Plugin 2.7.0
    • Released As: 1.821.vd834f8a_c390e

      The observed error is that agents simply hang while starting. This happens for about 5% of the VMs started in this manner.

      The Anka Build plugin is used, and the VM it spins up is 100% functional.

      Investigating the thread dump shows a deadlock between the launch and teardownConnection methods in SSHLauncher.

      I have attached the stack traces of both threads as files.
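
      For readers who want to capture a comparable dump from inside a JVM, the following is a minimal sketch using the standard java.lang.management API. The class name ThreadDumpSketch and its output layout are illustrative assumptions; this is not how the attached traces were produced, and a tool such as jstack yields similar information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Jenkins or the SSH plugin.
public class ThreadDumpSketch {

    // Builds a plain-text dump of all live threads, including held monitors.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details the running JVM can actually report.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                // The object this thread is blocked or waiting on, if any.
                sb.append("  waiting on ").append(info.getLockName()).append('\n');
            }
            StackTraceElement[] stack = info.getStackTrace();
            for (int depth = 0; depth < stack.length; depth++) {
                sb.append("    at ").append(stack[depth]).append('\n');
                // Print monitors that were locked in this stack frame.
                for (MonitorInfo monitor : info.getLockedMonitors()) {
                    if (monitor.getLockedStackDepth() == depth) {
                        sb.append("    - locked ").append(monitor).append('\n');
                    }
                }
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```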

       

      The launch method seems to be hanging while executing this:
      java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.15/Native Method)
        - waiting on <no object reference available>
        at hudson.remoting.Request.call(Request.java:177)
        - waiting to re-lock in wait() <0x00000005f9721350> (a hudson.remoting.UserRequest)
        at hudson.remoting.Channel.call(Channel.java:999)
        at hudson.FilePath.act(FilePath.java:1194)
        at hudson.FilePath.act(FilePath.java:1183)
        at hudson.FilePath.exists(FilePath.java:1748)
        at jenkins.branch.WorkspaceLocatorImpl.load(WorkspaceLocatorImpl.java:254)
        at jenkins.branch.WorkspaceLocatorImpl.access$500(WorkspaceLocatorImpl.java:86)
        at jenkins.branch.WorkspaceLocatorImpl$Collector.onOnline(WorkspaceLocatorImpl.java:601)
        - locked <0x00000005f97214e0> (a java.lang.String)
        at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:727)
        at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:437)
        at hudson.plugins.sshslaves.SSHLauncher.startAgent(SSHLauncher.java:645)
        at hudson.plugins.sshslaves.SSHLauncher.lambda$launch$0(SSHLauncher.java:458)
        at hudson.plugins.sshslaves.SSHLauncher$$Lambda$393/0x0000000840c2c040.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(java.base@11.0.15/FutureTask.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.15/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.15/ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(java.base@11.0.15/Thread.java:829)
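
      The general shape of such a hang can be sketched in isolation. The example below is an illustration only, not code from the SSH plugin: one thread takes a lock and then waits on a response that never arrives, while a second thread cannot acquire the same lock. All names (LaunchTeardownHangSketch, sharedLock, remoteResponse) are hypothetical, and since only the launch thread's trace is quoted here, which lock the teardown side actually needs is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the hang; not taken from SSHLauncher.
public class LaunchTeardownHangSketch {
    // Stands in for whatever resource both code paths need (assumption).
    private final ReentrantLock sharedLock = new ReentrantLock();
    // Stands in for a remote response that never arrives.
    private final CountDownLatch remoteResponse = new CountDownLatch(1);
    // Signals that the launch side has taken the lock.
    private final CountDownLatch lockHeld = new CountDownLatch(1);

    // "Launch" side: holds the lock while blocked on the remote call.
    void launch(long waitMillis) throws InterruptedException {
        sharedLock.lock();
        try {
            lockHeld.countDown();
            remoteResponse.await(waitMillis, TimeUnit.MILLISECONDS);
        } finally {
            sharedLock.unlock();
        }
    }

    // "Teardown" side: reports whether the lock was obtained in time.
    boolean teardown(long timeoutMillis) throws InterruptedException {
        if (!sharedLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        sharedLock.unlock();
        return true;
    }

    // Runs both sides; returns true only if teardown got the lock.
    static boolean teardownSucceedsWhileLaunchWaits() {
        LaunchTeardownHangSketch sketch = new LaunchTeardownHangSketch();
        Thread launcher = new Thread(() -> {
            try {
                sketch.launch(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "launch");
        launcher.start();
        try {
            sketch.lockHeld.await();
            boolean acquired = sketch.teardown(200);
            launcher.interrupt(); // end the simulated remote wait early
            launcher.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "teardown acquired lock: false"
        System.out.println("teardown acquired lock: "
                + teardownSucceedsWhileLaunchWaits());
    }
}
```

      The sketch uses a timed tryLock so that it terminates; with an untimed lock acquisition the second thread would stay blocked for as long as the first one waits.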

            Assignee: Ivan Fernandez Calvo (ifernandezcalvo)
            Reporter: niv keidan (niv_keidan_veertu)