Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-48685

Deadlock when running a Multijob with multiple slaves

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: Critical Critical
    • multijob-plugin
    • None

      After upgrading from 2.73.3 to 2.89.2 our Jenkins has started to experience deadlock.

      We use the Multijob plugin to run any number of other jobs that extend a common template. When the Multijob kicks off, it will spin up as many AWS slaves as it needs to run all of the child jobs in parallel (Test-Suites in the stack trace). Every time we run one of these Multijob jobs, Jenkins locks up.

      Attached is the deadlock stack traces from a thread dump.

      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 is in deadlock with Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307
      
      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 - priority:5 - threadId:0x00007f8fe4118800 - nativeId:0x3455 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008cdab698> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      
      Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307 - priority:5 - threadId:0x00007f8ff868e000 - nativeId:0x32e9 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008d744a90> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)

      We tried downgrading Jenkins again, but we had already updated all of the other plugins and after downgrading the majority of the plugins were not compatible.

            Unassigned Unassigned
            ketchumm Mark Ketchum
            Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: