Jenkins / JENKINS-65501

High Queue lock contention when provisioning a large number of k8s nodes


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component/s: core, kubernetes-plugin
    • Labels: None
    • Environment: Jenkins 2.277.3
      Kubernetes Plugin 1.29.2
    • Released As: 2.294

      When many nodes are requested at once, provisioning holds the Queue lock and prevents the regular calls to maintain from running. Jobs are stuck for minutes.

      In our case, we have jobs that request 100+ nodes of different types from k8s. Each REST call to the k8s API takes ~2 s. All of these calls are executed within a single withLock, which blocks everything else from happening on Jenkins for that time. To make it worse, it seems that it then recurses and does the same again.
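      With the numbers above, 100 calls at ~2 s each inside one lock amount to roughly 200 s of uninterrupted lock hold time, which matches jobs being stuck for minutes. The following is a minimal, self-contained sketch of that effect, not Jenkins code: `QUEUE_LOCK`, `slowApiCall` and `longestHoldMillis` are hypothetical stand-ins for the Queue lock and the k8s REST calls, and the "outside the lock" variant is only one possible way to reduce contention, not necessarily what the fix in core does. It compares the longest stretch the lock is held when the slow calls run inside the lock with the case where only the bookkeeping is locked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class QueueLockSketch {
    // Hypothetical stand-in for the Jenkins Queue lock (not the real Queue class).
    static final ReentrantLock QUEUE_LOCK = new ReentrantLock();

    // Hypothetical stand-in for one slow REST call to the k8s API.
    static String slowApiCall(int node, long callMillis) {
        try {
            Thread.sleep(callMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "pod-" + node;
    }

    // Returns the longest single stretch (in ms) for which the lock was held.
    // Any other thread needing the lock could be blocked for up to that long.
    static long longestHoldMillis(boolean callInsideLock, int nodes, long callMillis) {
        List<String> provisioned = new ArrayList<>();
        long longest = 0;
        if (callInsideLock) {
            // All slow calls happen while the lock is held.
            QUEUE_LOCK.lock();
            long start = System.nanoTime();
            try {
                for (int i = 0; i < nodes; i++) {
                    provisioned.add(slowApiCall(i, callMillis));
                }
            } finally {
                longest = (System.nanoTime() - start) / 1_000_000;
                QUEUE_LOCK.unlock();
            }
        } else {
            for (int i = 0; i < nodes; i++) {
                // Slow call runs without the lock; only the list update is locked.
                String pod = slowApiCall(i, callMillis);
                QUEUE_LOCK.lock();
                long start = System.nanoTime();
                try {
                    provisioned.add(pod);
                } finally {
                    longest = Math.max(longest, (System.nanoTime() - start) / 1_000_000);
                    QUEUE_LOCK.unlock();
                }
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        // Scaled down from the report (100+ nodes, ~2 s per call) to keep the run short.
        System.out.println("calls inside lock:  " + longestHoldMillis(true, 5, 20) + " ms");
        System.out.println("calls outside lock: " + longestHoldMillis(false, 5, 20) + " ms");
    }
}
```

      In the first variant the hold time grows linearly with the number of requested nodes; in the second it stays roughly constant, at the cost of having to tolerate state changes between the API call and the locked update.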

      It gets even worse when the cluster is under high load and pods can no longer be scheduled; waiting for the pod startup timeout then seems to add to the time as well.

      As soon as the nodes are available or load decreases, calls to maintain get back to normal levels.

            Assignee: Raihaan Shouhell (raihaan)
            Reporter: Dietmar Scheidl (scddev)
            Votes: 0
            Watchers: 3
