Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-48685

Deadlock when running a Multijob with multiple slaves

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: Critical Critical
    • multijob-plugin
    • None

      After upgrading from 2.73.3 to 2.89.2 our Jenkins has started to experience deadlock.

      We use the Multijob plugin to run any number of other jobs that extend a common template. When the Multijob kicks off, it will spin up as many AWS slaves as it needs to run all of the child jobs in parallel (Test-Suites in the stack trace). Every time we run one of these Multijob jobs, Jenkins locks up.

      Attached is the deadlock stack traces from a thread dump.

      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 is in deadlock with Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307
      
      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 - priority:5 - threadId:0x00007f8fe4118800 - nativeId:0x3455 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008cdab698> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      
      Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307 - priority:5 - threadId:0x00007f8ff868e000 - nativeId:0x32e9 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008d744a90> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)

      We tried downgrading Jenkins again, but we had already updated all of the other plugins and after downgrading the majority of the plugins were not compatible.

          [JENKINS-48685] Deadlock when running a Multijob with multiple slaves

            Unassigned Unassigned
            ketchumm Mark Ketchum
            Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: