JENKINS-73476

parallel stages on different nodes scale sh commands poorly


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None

      We have a primary build pipeline that uses the workflow-cps parallel step to run about 50 pods at the same time, and it depends heavily on sh steps. (The behavior also occurs with static Linux agents, not just pods/containers.)
      As more agents are added to the parallel step, sh commands wrapped in a Groovy script block take longer to complete.

      I've simplified the pipeline to make it easier to reproduce:

      pipeline {
          agent none
          environment {
              // Pod definition shared by every parallel stage
              POD_YAML = """
      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        containers:
          - name: test-build
            image: ubuntu
            resources:
              requests:
                memory: 4000Mi
                cpu: 4
              limits:
                memory: 4000Mi
                cpu: 4
            command: ['sleep']
            args: ['6h']
            tty: true
      """
          }
          stages {
              stage('Build and Test') {
                  parallel {
                      stage('Stage1') {
                          agent {
                              kubernetes {
                                  yaml env.POD_YAML
                              }
                          }
                          steps {
                              container('test-build') {
                                  script {
                                      for (int i = 0; i < 30; i++) {
                                          sh 'cat /etc/hosts'
                                      }
                                  }
                              }
                          }
                      }
                      stage('Stage2') {
                          agent {
                              kubernetes {
                                  yaml env.POD_YAML
                              }
                          }
                          steps {
                              container('test-build') {
                                  script {
                                      for (int i = 0; i < 30; i++) {
                                          sh 'cat /etc/hosts'
                                      }
                                  }
                              }
                          }
                      }
                      stage('Stage3') {
                          agent {
                              kubernetes {
                                  yaml env.POD_YAML
                              }
                          }
                          steps {
                              container('test-build') {
                                  script {
                                      for (int i = 0; i < 30; i++) {
                                          sh 'cat /etc/hosts'
                                      }
                                  }
                              }
                          }
                      }
                      // ... you can keep adding stages that follow the same pattern
                  }
              }
          }
      }
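Copying stages by hand gets tedious when testing different stage counts. The following is a minimal sketch of a scripted-pipeline variant that generates the parallel branches in a loop. It assumes the Kubernetes plugin's `podTemplate` step and its `POD_LABEL` variable are available; `branchCount` and `podYaml` are hypothetical names introduced here for illustration. Because it uses scripted rather than declarative syntax, it may not exercise exactly the same code path as the pipeline above, so treat any timings from it as a separate data point.

```groovy
// Hypothetical knob: how many parallel branches (pods) to start.
int branchCount = 7

// Same pod definition as in the report, held in a local variable.
def podYaml = '''
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-build
      image: ubuntu
      resources:
        requests:
          memory: 4000Mi
          cpu: 4
        limits:
          memory: 4000Mi
          cpu: 4
      command: ['sleep']
      args: ['6h']
      tty: true
'''

// Build a map of branch name -> closure; the parallel step runs them concurrently.
def branches = [:]
for (int n = 1; n <= branchCount; n++) {
    def branchName = "Stage${n}" // per-iteration variable so each closure keeps its own name
    branches[branchName] = {
        podTemplate(yaml: podYaml) {
            node(POD_LABEL) {
                container('test-build') {
                    // Same workload as the report: 30 short sh steps per agent
                    for (int i = 0; i < 30; i++) {
                        sh 'cat /etc/hosts'
                    }
                }
            }
        }
    }
}

stage('Build and Test') {
    parallel branches
}
```

Changing `branchCount` to 7, 17, or 35 should give a setup comparable to the measurements in this report.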



      My results are as follows:
      7 stages: ~50 seconds for each agent to finish
      17 stages: ~120 seconds for each agent to finish
      35 stages: ~270 seconds for each agent to finish
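To put these numbers in per-step terms, here is a small plain-Groovy calculation (runnable with the `groovy` command, outside Jenkins). It assumes each agent's total time is dominated by its 30 sh calls and ignores pod startup and other overhead, so the figures are rough. The variable names are illustrative only.

```groovy
// Each stage in the reproduction runs 30 sh steps.
int shCallsPerStage = 30

// Observed results from the report: number of parallel stages -> seconds per agent.
def observed = [7: 50, 17: 120, 35: 270]

// Approximate seconds per sh call for each stage count.
def perCall = observed.collectEntries { stages, seconds ->
    [(stages): seconds / shCallsPerStage]
}

perCall.each { stages, secondsPerCall ->
    println "${stages} stages: ~${secondsPerCall.round(1)} s per sh call"
}
```

Under that assumption, the cost of a single sh call grows from roughly 1.7 seconds at 7 stages to about 4 seconds at 17 stages and about 9 seconds at 35 stages, which is close to proportional to the number of parallel stages.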

      If I leave only one stage in the parallel block and run that same build 50 times concurrently, each build finishes very quickly (around 15 seconds).
      So I don't think it's a general load issue; rather, the parallel step may be throttling sh step responses in some way, making them hang (possibly in the CpsFlowExecution thread?).

      P.S.: "Default Speed / Durability Level" is already set to performance-optimized.
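For anyone reproducing this who cannot change the global setting, the durability level can also be pinned per pipeline. A minimal sketch, assuming the declarative `durabilityHint` option; the single placeholder stage stands in for the parallel stages from the reproduction and is not part of the original report:

```groovy
pipeline {
    agent none
    options {
        // Per-pipeline override of the speed/durability level
        durabilityHint('PERFORMANCE_OPTIMIZED')
    }
    stages {
        stage('Build and Test') {
            steps {
                // Placeholder: the parallel stages from the reproduction go here
                echo 'placeholder for the parallel stages'
            }
        }
    }
}
```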

            Assignee: Unassigned
            Reporter: Niko (simpleniko)
