JENKINS-45876

Jenkins becomes extremely slow while running a lot of tests in parallel


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • pipeline
    • None
    • Jenkins Version: 2.46.1
      Java-Version: JRE1.8.0_131
      OS: Windows

      Detailed information: See the attachment

      We have a large Jenkins setup (1 master, 5 slaves, each with 16 executor slots) and run a lot of independent tests (~800 tests, aka "nodes"). After upgrading to Jenkins 2 (LTS 2.46.1), we developed a Jenkinsfile for our continuous integration process.

      Each test is structured similarly:

      1. Copy files and the previously compiled application from the master to the slaves
      2. Run the application (.exe), with its output redirected to a file
      3. Check the content of the output
      4. Write a JUnit result XML file
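
To make steps 2 to 4 concrete, here is a minimal sketch in Python. It is not the reporter's actual implementation, which the report does not show. The function name `run_test`, the `expected_text` check, and the file names are hypothetical, and step 1 (copying files to the slave) is left out.

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def run_test(name, command, expected_text, workdir):
    """Run one test and write a JUnit-style result file (hypothetical helper)."""
    workdir = Path(workdir)
    output_file = workdir / f"{name}.out"

    # Step 2: run the application with its output redirected to a file.
    with output_file.open("w") as out:
        subprocess.run(command, stdout=out, stderr=subprocess.STDOUT, check=False)

    # Step 3: check the content of the output (assumed here: a substring match).
    passed = expected_text in output_file.read_text()

    # Step 4: write a minimal JUnit result XML file.
    suite = ET.Element("testsuite", name=name, tests="1",
                       failures="0" if passed else "1")
    case = ET.SubElement(suite, "testcase", classname=name, name=name)
    if not passed:
        failure = ET.SubElement(case, "failure", message="unexpected output")
        failure.text = f"expected text not found: {expected_text!r}"
    result_file = workdir / f"{name}.xml"
    ET.ElementTree(suite).write(str(result_file), encoding="utf-8",
                                xml_declaration=True)
    return passed, result_file
```

A call such as `run_test("test_001", ["app.exe"], "OK", ".")` would leave `test_001.out` and `test_001.xml` in the current directory; both the executable name and the expected text are placeholders.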

      If we split the 800 tests into four stages (200 tests each), the build generally works, but the four stages introduce other obvious environmental performance problems (e.g., each stage must finish before the next one can start). With this setup the build takes approx. 4 h.
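
The cost of waiting for a stage to finish can be illustrated with a small scheduling model. This is only a sketch under stated assumptions: the test durations are invented (the report gives no per-test timings), every test is assumed to start on whichever executor frees up first, and overhead on the master is ignored entirely. Only the executor count (5 slaves with 16 slots each) comes from the report.

```python
import heapq
import random

EXECUTORS = 5 * 16  # 5 slaves with 16 executor slots each, as in the report


def makespan(durations, executors=EXECUTORS):
    """Greedy list scheduling: each test goes to the executor that frees up first."""
    free_at = [0.0] * executors
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)


def staged_makespan(durations, stages=4, executors=EXECUTORS):
    """Split the tests into stages; a stage starts only when the previous one ends."""
    size = max(1, -(-len(durations) // stages))  # ceiling division
    return sum(makespan(durations[i:i + size], executors)
               for i in range(0, len(durations), size))


random.seed(0)
# Hypothetical test durations in minutes; not taken from the report.
durations = [random.uniform(1, 60) for _ in range(800)]
print(f"four stages: {staged_makespan(durations):.0f} min")
print(f"one pool:    {makespan(durations):.0f} min")
```

In this model the staged run can never be faster than the single pool, because executors sit idle at the end of each stage until its slowest test completes. It says nothing about the slowdown on the master, which is the actual subject of this report.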

      To avoid these performance issues, we tried to run all tests (aka "nodes") in parallel instead of splitting them into stages. This results in a large build queue of about 700 tests. With this long queue the build starts much more slowly and keeps slowing down exponentially over its duration, until a single pipeline step (e.g., "open a new declarative node") takes 1 minute. Something within the "master process" that controls the pipeline seems to be the crux of the matter, since the slaves and their executors appear to wait until the "master process" gives them a new command, and this takes extremely long. With this setup the build starts more slowly than the "four stage setup" and does not finish even after 40 h.

      In our opinion, the extremely slow Jenkins pipeline is caused by an unknown overload of the master process. There seems to be a direct connection between "nodes per stage" and "becoming slower", or between "nodes_in_queue" and "becoming slower".

      Previous experiments:

      • More heap for the JRE
      • Other GC settings (G1)
      • Reduced stash operations to a minimum
      • Reduced output/logging to a minimum
      • Sorted the build queue (long tests at the beginning, short tests at the end)

      So far, none of these has helped. Jenkins still nearly hangs after a while.

        1. environmentvariables.txt (3 kB)
        2. Plugins.pdf (44 kB)
        3. Systeminformation.txt (5 kB)
        4. thread_dump.txt (131 kB)

            Assignee: Unassigned
            Reporter: Johannes Benkert (johannes_b)
            Votes: 1
            Watchers: 2
