
Deadlock of AsyncFutureImpl.get() during massive submission of distributed jobs

      1) I trigger jobs with the Parameterized Trigger Plugin. The master job submits about 64 parallel jobs, each producing "Hello, world!" output, and waits for their completion.
      2) At some point the monitoring of the jobs hangs. When all slave jobs have finished, the master job is still waiting.
      3) According to the logs, hudson.remoting.AsyncFutureImpl.get() hangs because "completed" was initially false and the wait() loop never returns. It seems that AsyncFutureImpl.set() was never called for one of the jobs.
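      To illustrate the suspected mechanism in step 3, here is a minimal sketch of a wait/notify-based future. It is a simplified stand-in written for this report, not the actual hudson.remoting.AsyncFutureImpl source; the names SimpleFuture and FutureHangDemo are hypothetical. It shows that get() can only return after set() flips the "completed" flag, so a single missed set() leaves the caller blocked forever unless a timeout is used.

```java
import java.util.concurrent.TimeoutException;

/** Simplified stand-in for a wait/notify future (assumption: NOT the real AsyncFutureImpl). */
class SimpleFuture<V> {
    private V value;
    private boolean completed;

    /** Publishes the result and wakes up every thread blocked in get(). */
    public synchronized void set(V v) {
        value = v;
        completed = true;
        notifyAll();
    }

    /** Blocks until set() is called. If set() is never called, this never returns. */
    public synchronized V get() throws InterruptedException {
        while (!completed) {
            wait();
        }
        return value;
    }

    /** Same as get(), but gives up after the timeout instead of hanging. */
    public synchronized V get(long timeoutMillis) throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!completed) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                throw new TimeoutException("set() was not called within " + timeoutMillis + " ms");
            }
            // wait(0) would block indefinitely, so always wait at least 1 ms.
            wait(Math.max(1L, remainingNanos / 1_000_000L));
        }
        return value;
    }
}

class FutureHangDemo {
    public static void main(String[] args) throws Exception {
        // Normal case: another thread calls set(), so get() returns.
        SimpleFuture<String> finished = new SimpleFuture<>();
        new Thread(() -> finished.set("SUCCESS")).start();
        System.out.println(finished.get()); // prints "SUCCESS"

        // Suspected failure case: nobody ever calls set().
        SimpleFuture<String> lost = new SimpleFuture<>();
        try {
            lost.get(200); // the untimed get() would block here forever
        } catch (TimeoutException e) {
            System.out.println("gave up waiting: " + e.getMessage());
        }
    }
}
```

      In this model the untimed get() matches the observed symptom: the waiting thread stays in Object.wait() while holding no information about why the completion signal is missing.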

      Additional analysis:

      • Submission works well on the local host without an additional remote node
      • The log contains only log rotation errors (see below)
      • All executor threads have finished their jobs

      Call stack of the job (0x00000007866656f0 is not used by other threads):

      "Executor #7 for master : executing Test_MassiveSubmission #8" prio=6 tid=0x000000001148e000 nid=0x16bfc in Object.wait() [0x000000000d12e000]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              - waiting on <0x00000007866656f0> (a hudson.model.queue.FutureImpl)
              at java.lang.Object.wait(Object.java:503)
              at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:73)
              - locked <0x00000007866656f0> (a hudson.model.queue.FutureImpl)
              at hudson.plugins.parameterizedtrigger.TriggerBuilder.perform(TriggerBuilder.java:135)
              at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
              at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
              at hudson.model.Build$BuildExecution.build(Build.java:199)
              at hudson.model.Build$BuildExecution.doRun(Build.java:160)
              at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
              at hudson.model.Run.execute(Run.java:1592)
              at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
              at hudson.model.ResourceController.execute(ResourceController.java:88)
              at hudson.model.Executor.run(Executor.java:237)
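      A dump like the one above can also be filtered programmatically. The following is a rough sketch using the standard java.lang.management API to list threads that are in WAITING state on a monitor of a given class; the class name WaitingThreadFinder and the method waitingOn are hypothetical helpers, and running this inside a Jenkins master (for example from a script console) is an assumption, not something tested for this report.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: finds threads parked in Object.wait() on a lock of a given class. */
class WaitingThreadFinder {

    /**
     * Returns the names of live threads whose state is WAITING and whose
     * awaited lock object has the given fully qualified class name,
     * e.g. "hudson.model.queue.FutureImpl".
     */
    static List<String> waitingOn(String lockClassName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        // Monitor and synchronizer ownership details are not needed here.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread died while the dump was taken
            }
            LockInfo lock = info.getLockInfo();
            if (info.getThreadState() == Thread.State.WAITING
                    && lock != null
                    && lock.getClassName().equals(lockClassName)) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : waitingOn("hudson.model.queue.FutureImpl")) {
            System.out.println("waiting on FutureImpl: " + name);
        }
    }
}
```

      Comparing the result against the set of builds that are still running may help confirm whether an executor is waiting on a future whose downstream build has already finished.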

      The error log contains only the following errors:

      SEVERE: Failed to rotate log
      java.io.IOException: C:\Users\nenashev\Documents\Work\Jenkins\contrib\parameterized-trigger-plugin\.\work\jobs\Test_MassiveSubmissionSlave\builds\2013-09-26_15-36-16 is in use
      at hudson.model.Run.delete(Run.java:1380)
      at hudson.tasks.LogRotator.perform(LogRotator.java:133)
      at hudson.model.Job.logRotate(Job.java:404)
      at hudson.model.Run.execute(Run.java:1655)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:237)

          [JENKINS-19776] Deadlock of AsyncFutureImpl.get() during massive submission of distributed jobs

          Oleg Nenashev created issue
          Oleg Nenashev made changes -
          Assignee Original: huybrechts [ huybrechts ]
          Oleg Nenashev made changes -
          Link New: This issue is related to JENKINS-16679 [ JENKINS-16679 ]
          Jesse Glick made changes -
          Labels Original: remote New: remoting threads
          Jesse Glick made changes -
          Issue Type Original: New Feature [ 2 ] New: Bug [ 1 ]
          Oleg Nenashev made changes -
          Assignee New: Oleg Nenashev [ oleg_nenashev ]
          Oleg Nenashev made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Oleg Nenashev made changes -
          Resolution New: Cannot Reproduce [ 5 ]
          Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 151295 ] New: JNJira + In-Review [ 193861 ]
