  1. Jenkins
  2. JENKINS-56939

pipeline gets stuck: root cause unclear

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Component/s: kubernetes-plugin
    • Labels: None
    • Environment: Jenkins ver. 2.164.1, running in a Kubernetes pod with a 6-CPU request and a lot of memory.
      Kubernetes plugin 1.14.10-SNAPSHOT 8/4/19 (also tested with 1.14.9)

    Description

      I'm running the following pipeline in 10 builds at the same time (so 10 * 30 = 300 "stages"):

      def label = "v5-$USER"
      def batchList = "batchList"
      def batchListArray = []
      
      timestamps {
      podTemplate(
          label: label,
          serviceAccount: "jenkins",
          idleMinutes: 0,
          podRetention: never(),
          containers: [
              containerTemplate(
                  name: 'kubectl',
                  image: 'lachlanevenson/k8s-kubectl:v1.10.7',
                  ttyEnabled: true,
                  command: 'cat'
              ),
              containerTemplate(
                  name: 'helm',
                  image: 'lachlanevenson/k8s-helm:v2.9.1',
                  ttyEnabled: true,
                  command: 'cat'
              )
          ]
      ){node(label) {
          stage('Setup') {
              stage('Get list batches') {
                  container("kubectl") {
                      sh """
                          echo 1 > $batchList
                          echo 2 >> $batchList
                          echo 3 >> $batchList
                          echo 4 >> $batchList
                          echo 5 >> $batchList
                          echo 6 >> $batchList
                          echo 7 >> $batchList
                          echo 8 >> $batchList
                          echo 9 >> $batchList
                          echo 10 >> $batchList
                          echo 11 >> $batchList
                          echo 12 >> $batchList
                          echo 13 >> $batchList
                          echo 14 >> $batchList
                          echo 15 >> $batchList
                          echo 16 >> $batchList
                          echo 17 >> $batchList
                          echo 18 >> $batchList
                          echo 19 >> $batchList
                          echo 20 >> $batchList
                          echo 21 >> $batchList
                          echo 22 >> $batchList
                          echo 23 >> $batchList
                          echo 24 >> $batchList
                          echo 25 >> $batchList
                          echo 26 >> $batchList
                          echo 27 >> $batchList
                          echo 28 >> $batchList
                          echo 29 >> $batchList
                          echo 30 >> $batchList
                      """
                  }
                  batchListFile = readFile "$batchList"
                  lines = batchListFile.readLines()
                  for (batch in lines) {
                      batchListArray.add(batch)
                  }
              }
          }
          stage('Parallel') {
              def parallelData = [:]
              batchListArray.each { batch ->
                  parallelData[batch] = {
                      stage("Deploy - $batch") {
                          container("helm") {
                              sh """
                                  sleep 5
                              """
                          }
                      }
                      stage("Queue - $batch") {
                          container("kubectl") {
                              sh """
                                  sleep 600
                              """
                          }
                      }
                  }
              }
              parallel parallelData
          }
          stage('Clean up') {
              container("helm") {
                  sh """
                      sleep 10
                  """
              }
          }
      }
      }
      }
      

      This is a simplified version of the real pipeline we're running.
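
      To make the load concrete, here is a minimal plain-Groovy sketch of the branch map that the 'Parallel' stage builds, with the Jenkins steps left out so it runs outside Jenkins. `buildBranches` is a hypothetical helper and not part of the pipeline above. Judging by the stack trace, each `sh` step start goes through `ContainerExecDecorator.launch`, so the count below is the number of `sh` steps that can be in flight at once across all builds.

```groovy
// Plain-Groovy model of the 'Parallel' stage; no Jenkins steps are called.
// buildBranches is a hypothetical helper, not part of the original pipeline.
def buildBranches(int batches) {
    // Same content as the batchList file: one batch ID per line, "1".."30".
    def batchListArray = (1..batches).collect { it.toString() }
    def parallelData = [:]
    batchListArray.each { batch ->
        // Each branch runs two stages in sequence, each with one sh step.
        parallelData[batch] = ["Deploy - ${batch}".toString(), "Queue - ${batch}".toString()]
    }
    return parallelData
}

def branchesPerBuild = buildBranches(30).size()
def concurrentBuilds = 10
// Within a branch the two sh steps run one after the other, so at most
// one sh step per branch is in flight at any moment.
def concurrentShSteps = branchesPerBuild * concurrentBuilds
println "branches per build: ${branchesPerBuild}, concurrent sh steps: ${concurrentShSteps}"
```

      This matches the 10 * 30 = 300 figure quoted above.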

      For 1 build, this runs fine, but when you run 10 builds at once, it starts degrading.

      The first build quickly launches all of its sleeps, including those in the later stages of the pipeline. As the number of concurrent builds increases, however, the stages no longer seem able to start, which slows the builds down drastically and eventually makes them fail.

      Note that the agent pods and containers are starting fine.

      Also:   java.io.IOException: Timed out waiting for websocket connection. You should increase the value of system property org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout currently set at 30 seconds
      		at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.doLaunch(ContainerExecDecorator.java:369)
      		at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.launch(ContainerExecDecorator.java:252)
      		at hudson.Launcher$ProcStarter.start(Launcher.java:455)
      		at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:206)
      		at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:99)
      		at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:305)
      		at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:268)
      		at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:176)
      		at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      		at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      		at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      		at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
      		at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      		at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      		at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
      		at WorkflowScript.run(WorkflowScript:88)
      java.io.IOException: Timed out waiting for websocket connection. You should increase the value of system property org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout currently set at 30 seconds
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.doLaunch(ContainerExecDecorator.java:369)
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.launch(ContainerExecDecorator.java:252)
      	at hudson.Launcher$ProcStarter.start(Launcher.java:455)
      	at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:206)
      	at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:99)
      	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:305)
      	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:268)
      	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:176)
      	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:810)
      	at groovy.lang.GroovyObjectSupport.invokeMethod(GroovyObjectSupport.java:46)
      	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
      	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      	at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
      	at WorkflowScript.run(WorkflowScript:80)
      	at ___cps.transform___(Native Method)
      	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
      	at sun.reflect.GeneratedMethodAccessor274.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:182)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:332)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:83)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:244)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:232)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Finished: FAILURE
      

      With the latest SNAPSHOT, you do get the following output:

      Timed out waiting for websocket connection. You should increase the value of system property org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout currently set at 30 seconds

      The strange thing is that the node/pod and its containers are already up, so I don't see a good reason why increasing the timeout would help.
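
      The message names a JVM system property. As a minimal check, a Groovy snippet like the one below (for example, run from the Jenkins script console) reports the value the JVM currently holds for it. The fallback of 30 is taken from the message itself, and it is an assumption that the plugin reads the property as a whole number of seconds.

```groovy
// Property name copied verbatim from the error message.
def prop = 'org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout'

// Long.getLong returns the system property parsed as a long,
// or the supplied default when the property is unset or not a number.
def timeoutSeconds = Long.getLong(prop, 30L)
def explicitlySet = System.getProperty(prop) != null

println "${prop} = ${timeoutSeconds} (explicitly set: ${explicitlySet})"
```

      System properties are normally passed on the JVM command line as -D<name>=<value>. Whether a value changed at runtime is picked up depends on when the plugin reads it, which this check does not show.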

       

      I've tried many different settings and env tweaks, but none seem to make any difference.

      I'm out of ideas on how to do a proper root cause analysis of this. Any suggestions?

      br,

      Jan

      PS: When I run a similar pipeline without containers, entirely on the master, I see no performance degradation:

      def batchList = "batchList"
      def batchListArray = []
      
      node('master') {
          stage('Setup') {
              stage('Get list batches') {
                      sh """
                          echo 1 > $batchList
                          echo 2 >> $batchList
                          echo 3 >> $batchList
                          echo 4 >> $batchList
                          echo 5 >> $batchList
                          echo 6 >> $batchList
                          echo 7 >> $batchList
                          echo 8 >> $batchList
                          echo 9 >> $batchList
                          echo 10 >> $batchList
                          echo 11 >> $batchList
                          echo 12 >> $batchList
                          echo 13 >> $batchList
                          echo 14 >> $batchList
                          echo 15 >> $batchList
                          echo 16 >> $batchList
                          echo 17 >> $batchList
                          echo 18 >> $batchList
                          echo 19 >> $batchList
                          echo 20 >> $batchList
                          echo 21 >> $batchList
                          echo 22 >> $batchList
                          echo 23 >> $batchList
                          echo 24 >> $batchList
                          echo 25 >> $batchList
                          echo 26 >> $batchList
                          echo 27 >> $batchList
                          echo 28 >> $batchList
                          echo 29 >> $batchList
                          echo 30 >> $batchList
                      """
                  batchListFile = readFile "$batchList"
                  lines = batchListFile.readLines()
                  for (batch in lines) {
                      batchListArray.add(batch)
                  }
              }
          }
          stage('Parallel') {
              def parallelData = [:]
              batchListArray.each { batch ->
                  parallelData[batch] = {
                      stage("Deploy - $batch") {
                              sh """
                                  sleep 5
                              """
                      }
                      stage("Queue - $batch") {
                              sh """
                                  sleep 600
                              """
                      }
                  }
              }
              parallel parallelData
          }
          stage('Clean up') {
                  sh """
                      sleep 10
                  """
          }
      }
      

       

       

            People

              Assignee: Unassigned
              Reporter: Jan Heylen (heyleke)