Jenkins / JENKINS-19776

Deadlock of AsyncFutureImpl.get() during massive submission of distributed jobs


      1) I trigger jobs through the Parameterized Trigger Plugin. The master job submits about 64 parallel jobs, each printing "Hello, world!", and waits for them to complete.
      2) At some point the monitoring of the jobs hangs. When all slave jobs have finished, the master job is still waiting.
      3) According to the logs, hudson.remoting.AsyncFutureImpl.get() hangs because "completed" was initially false and the wait() loop never returns. It seems that AsyncFutureImpl.set() has not been called for one of the jobs.
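
The suspected mechanism in step 3 can be illustrated with a minimal sketch. SketchFuture below is a hypothetical, simplified stand-in for a wait/notify based future; it is not the hudson.remoting.AsyncFutureImpl source, and the class and method names are assumptions made for illustration. It shows that an unbounded get() can only return after set() runs, and that a bounded wait turns a lost set() into a visible timeout.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical, simplified wait/notify future; NOT the actual hudson.remoting code. */
class SketchFuture<V> {
    private boolean completed; // starts as false, as described in the report
    private V value;

    /** Blocks until set() is called. If set() never runs, this waits forever. */
    public synchronized V get() throws InterruptedException {
        while (!completed) {
            wait(); // releases the monitor while parked in Object.wait()
        }
        return value;
    }

    /** Bounded variant: gives up with a TimeoutException instead of hanging. */
    public synchronized V get(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!completed) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                throw new TimeoutException("set() was not called in time");
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return value;
    }

    /** Marks the future complete and wakes every waiting thread. */
    public synchronized void set(V v) {
        completed = true;
        value = v;
        notifyAll();
    }
}

class FutureHangSketch {
    /** Returns true if the future completed within the timeout, false otherwise. */
    static boolean completesWithin(SketchFuture<String> f, long millis) {
        try {
            f.get(millis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        SketchFuture<String> finished = new SketchFuture<>();
        new Thread(() -> finished.set("SUCCESS")).start();
        System.out.println("finished job completes: " + completesWithin(finished, 2000));

        SketchFuture<String> lost = new SketchFuture<>(); // set() is never called
        System.out.println("lost job completes: " + completesWithin(lost, 200));
    }
}
```

If the real future behaves like this sketch, a single downstream build whose completion is never reported would be enough to keep the master job in get() indefinitely.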

      Additional analysis:

      • Submission works well on the local host without an additional remote node
      • The log contains only log rotation errors (see below)
      • All executor threads have finished their jobs

      Call stack of the job (0x00000007866656f0 is not used by other threads):

      "Executor #7 for master : executing Test_MassiveSubmission #8" prio=6 tid=0x000000001148e000 nid=0x16bfc in Object.wait() [0x000000000d12e000]
      java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007866656f0> (a hudson.model.queue.FutureImpl)
        at java.lang.Object.wait(Object.java:503)
        at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:73)
        - locked <0x00000007866656f0> (a hudson.model.queue.FutureImpl)
        at hudson.plugins.parameterizedtrigger.TriggerBuilder.perform(TriggerBuilder.java:135)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
        at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
        at hudson.model.Build$BuildExecution.build(Build.java:199)
        at hudson.model.Build$BuildExecution.doRun(Build.java:160)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
        at hudson.model.Run.execute(Run.java:1592)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:237)
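
The trace shows the executor blocked in an unbounded get() on one future. As a diagnostic idea only, the sketch below waits on a list of futures with a shared deadline and reports which ones never completed. It uses plain java.util.concurrent types rather than Jenkins classes; PendingFutureReport and findPending are hypothetical names, and the 64-future demo merely mirrors the job count from the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical diagnostic helper; not part of Jenkins or the plugin. */
class PendingFutureReport {

    /** Waits up to timeoutMillis in total and returns the indexes of futures still not done. */
    static List<Integer> findPending(List<? extends Future<?>> futures, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < futures.size(); i++) {
            long remaining = Math.max(0, deadline - System.nanoTime());
            try {
                futures.get(i).get(remaining, TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                pending.add(i); // this future never reported completion in time
            } catch (ExecutionException | CancellationException e) {
                // The task failed or was cancelled, so it did finish; not pending.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                // Stop waiting and report everything that is still unfinished.
                for (int j = i; j < futures.size(); j++) {
                    if (!futures.get(j).isDone()) {
                        pending.add(j);
                    }
                }
                return pending;
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // 64 simulated downstream jobs; index 41 is arbitrarily left incomplete.
        List<CompletableFuture<String>> jobs = new ArrayList<>();
        for (int i = 0; i < 64; i++) {
            jobs.add(new CompletableFuture<>());
        }
        for (int i = 0; i < 64; i++) {
            if (i != 41) {
                jobs.get(i).complete("Hello, world!");
            }
        }
        System.out.println("Still pending: " + findPending(jobs, 200));
    }
}
```

A report like this would not fix the underlying problem, but it could narrow down which downstream build's completion was never signalled.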

      The error log contains only the following error:

      SEVERE: Failed to rotate log
      java.io.IOException: C:\Users\nenashev\Documents\Work\Jenkins\contrib\parameterized-trigger-plugin\.\work\jobs\Test_MassiveSubmissionSlave\builds\2013-09-26_15-36-16 is in use
      at hudson.model.Run.delete(Run.java:1380)
      at hudson.tasks.LogRotator.perform(LogRotator.java:133)
      at hudson.model.Job.logRotate(Job.java:404)
      at hudson.model.Run.execute(Run.java:1655)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:237)

            oleg_nenashev Oleg Nenashev