
Agent time out with P4Groovy sync after 5 minutes?

    • Type: Improvement
    • Resolution: Postponed
    • Priority: Major
    • Component: p4-plugin
    • Environment: 1.10.0 P4Plugin, 2.176.1 (Cloudbees)

      When using 'p4.run("sync","//PATH/...")', the Windows 10 slave drops the connection after 5 minutes every time. This was proven on a Webex using Wireshark.

      If 'p4sync()' is used, the command runs to completion (20 minutes). The problem is potentially triggered by P4Groovy commands not displaying any output until the command finishes.

      This ticket has been created to record the occurrence, to allow me to try to reproduce the problem, and to be a placeholder where other reports of the same problem can be recorded.

      Note: Cloudbees support were unable to find a problem on their side.

       

      Example jenkinsfile (in editor):

      pipeline{
          agent{
              label 'Win10'
          }
          stages{
              stage('P4 Sync'){
                  steps{
                      script{
                          def p4 = p4 credential: 'MasterCredential', workspace: manualSpec(charset: 'utf8', name: 'CLIENT', pinHost: false, spec: clientSpec(allwrite: true, backup: false, clobber: false, compress: false, line: 'WIN', locked: false, modtime: false, rmdir: false, streamName: '', type: 'WRITABLE', view: '//depot/... //CLIENT/...'))
                          p4.run("sync","-f", "//depot/PATH/...#0")
                          p4.run("sync","-f", "//depot/PATH/...")
                      }
                  }
              }
          }
      }
      

          [JENKINS-58161] Agent time out with P4Groovy sync after 5 minutes?

          Alexander Boczar added a comment -

          I'm running into this variation when running P4Groovy:

          It times out after 30 seconds if there's no output because of this hard-coded beauty:

          https://github.com/jenkinsci/remoting/blob/master/src/main/java/hudson/remoting/Request.java:171

          // wait until the response arrives
          t.setName(name+" / waiting for "+channel.getName()+" id="+id);
          while(response==null && !channel.isInClosed())
              // I don't know exactly when this can happen, as pendingCalls are cleaned up by Channel,
              // but in production I've observed that in rare occasion it can block forever, even after a channel
              // is gone. So be defensive against that.
              wait(30*1000);

          if (response==null)
              // channel is closed and we still don't have a response
              throw new RequestAbortedException(null);
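The wait loop quoted above can be tried in isolation. The following is a minimal sketch, not the Jenkins remoting code: the class names `PendingRequest` and `WaitLoopDemo` are hypothetical, `pollMillis` stands in for the hard-coded `30*1000`, and an `IllegalStateException` stands in for `RequestAbortedException`. In this sketch a `wait(timeout)` that expires simply re-checks the loop condition; whether that matches what happens on a real agent connection is not established in this ticket.

```java
// Illustrative stand-in for a pending remote request; not the Jenkins remoting class.
class PendingRequest {
    private Object response;   // set by the responder thread
    private boolean closed;    // stands in for channel.isInClosed()
    private int wakeUps;       // how many times wait() returned

    // Blocks until a response arrives or the request is closed.
    // pollMillis plays the role of the hard-coded 30*1000 in the quoted code.
    synchronized Object await(long pollMillis) throws InterruptedException {
        while (response == null && !closed) {
            wait(pollMillis);  // returns on notify, on timeout, or spuriously
            wakeUps++;
        }
        if (response == null) {
            // closed and still no response (the quoted code throws RequestAbortedException here)
            throw new IllegalStateException("closed without a response");
        }
        return response;
    }

    synchronized void complete(Object value) {
        response = value;
        notifyAll();
    }

    synchronized void close() {
        closed = true;
        notifyAll();
    }

    synchronized int wakeUps() {
        return wakeUps;
    }
}

class WaitLoopDemo {
    // Runs one request whose response arrives after delayMillis.
    static String demo(long pollMillis, long delayMillis) {
        PendingRequest request = new PendingRequest();
        Thread responder = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            request.complete("done");
        });
        responder.start();
        try {
            Object result = request.await(pollMillis);
            responder.join();
            return result + " after " + request.wakeUps() + " wake-up(s)";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // The response arrives after several poll intervals have elapsed.
        System.out.println(demo(50, 200));
    }
}
```

With a poll interval shorter than the response delay, the reported number of wake-ups is greater than one, and the call still returns the response. An interrupt delivered to the waiting thread would surface as an `InterruptedException` from `wait`.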

          Jonathan Hurtado added a comment - - edited

          I wanted to add that this issue also occurs when running a p4groovy.run integrate command.

          > def integrateResults = perforce.run("integ", "-c${changeListNum}", "//${SOURCE_BRANCH}/...", "//${TARGET_BRANCH}/...")

          The merge lasts longer than 5 minutes because there are so many files to integrate, and, as a result, the build process is aborted.

          If we can change that hard-coded timeout to a setting that we can modify, that would be ideal.  Until then, my current workaround is to call a bat script that calls p4 integ.


          Emanuel May added a comment -

          Hello,

          We also run into this timeout problem when using p4.run with longer-running commands such as merge, integrate, reconcile or submit. Right now we are forced to use the command line for those cases. We would prefer to use p4.run, though, as the API returns objects that are easier to parse.

          Thanks!

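For anyone using the command-line workaround in the meantime, one way to get structured results back is p4's global `-ztag` option, which prints tagged output. The following is a minimal sketch of parsing such output, assuming each field is a line of the form `... key value` and a blank line ends a record. The class name `TaggedOutputParser` is hypothetical, and the sample text (including its field names) is invented for illustration rather than captured from a server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TaggedOutputParser {
    // Parses 'p4 -ztag' style text into one map per record.
    // Assumed format: each field is a line "... key value"; a blank line ends a record.
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                // blank line: close the current record, if it has any fields
                if (!current.isEmpty()) {
                    records.add(current);
                    current = new LinkedHashMap<>();
                }
                continue;
            }
            if (!line.startsWith("... ")) {
                continue; // skip anything that is not a tagged field
            }
            String field = line.substring(4);
            int space = field.indexOf(' ');
            String key = space < 0 ? field : field.substring(0, space);
            String value = space < 0 ? "" : field.substring(space + 1);
            current.put(key, value);
        }
        if (!current.isEmpty()) {
            records.add(current); // last record without a trailing blank line
        }
        return records;
    }

    public static void main(String[] args) {
        // Invented sample; not real server output.
        String sample = "... depotFile //depot/f1\n... rev 3\n... fileSize 1024\n\n";
        System.out.println(parse(sample)); // prints [{depotFile=//depot/f1, rev=3, fileSize=1024}]
    }
}
```

The result is a list of maps, similar in shape to what the comments above describe p4.run as returning, so downstream code would not need to scrape free-form text.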

          Paul Allen added a comment -

          An advanced setting in the Credential sets a 'tick' interval that prints a 'waiting' message to the log during long-running commands. The value needs to be less than 30,000 ms to avoid the automatic Jenkins disconnect. The default value is 0, which disables the tick message.

          https://github.com/jenkinsci/p4-plugin/commit/5ad18cf7746fb47e58e842ddb3d7b023d5463735

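The idea behind the 'tick' setting can be sketched as a periodic message emitted while a long-running command blocks. This is a minimal illustration only, not the plugin's implementation from the commit linked above: the class `TickDemo` and the methods `runWithTick` and `sleepQuietly` are hypothetical names, and the sleeping task stands in for a slow Perforce command.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TickDemo {
    // Runs the task while printing a "waiting" message every tickMillis.
    // tickMillis == 0 disables the message, mirroring the default described above.
    // Returns the number of tick messages emitted.
    static int runWithTick(long tickMillis, Runnable task) {
        AtomicInteger ticks = new AtomicInteger();
        ScheduledExecutorService timer = null;
        if (tickMillis > 0) {
            timer = Executors.newSingleThreadScheduledExecutor();
            timer.scheduleAtFixedRate(() -> {
                ticks.incrementAndGet();
                System.out.println("... waiting");
            }, tickMillis, tickMillis, TimeUnit.MILLISECONDS);
        }
        try {
            task.run(); // the long-running command
        } finally {
            if (timer != null) {
                timer.shutdownNow(); // stop ticking once the command has finished
            }
        }
        return ticks.get();
    }

    // Stand-in for a slow command.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int ticks = runWithTick(100, () -> sleepQuietly(350));
        System.out.println("ticks emitted: " + ticks);
    }
}
```

Keeping the tick on a separate timer thread means the command itself does not need to produce output for the log to show activity.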

          Keith Yates added a comment -

          The 'tick' doesn't occur on these run commands.


          Karl Wirth added a comment -

          Hi kyates - Thanks. I can confirm that I have also been able to reproduce this, so I will raise the priority to highlight it to the developers.

           

          (1) Create a Linux P4D server that has the following trigger to delay 'p4 sizes' for 360 seconds:

                  Slow_Sizes command pre-user-sizes "sleep 360"
          

          (2) Under the Perforce credential set the 'Tick interval' to 10000 (ms).

          (3) Use the following Jenkinsfile code:

          node('LinuxDesktop') {
              timestamps {
                  stage('Test') {
                      // Define workspace
                      def clientSpecName = 'jenkins-${JOB_NAME}'
                      def ws = [$class: 'ManualWorkspaceImpl', name: 'jenkins-${NODE_NAME}-${JOB_NAME}-src', spec: [view: '//depot/... //jenkins-${NODE_NAME}-${JOB_NAME}-src/...']]

                      // Create object
                      def p4 = p4(credential: 'JenkinsMaster', workspace: ws)

                      def response = p4.run('info')
                      echo(response.toString())

                      echo "Running sizes using a precommand trigger that delays for 6 minutes"
                      def resultOfSizes = p4.run('sizes', '//depot/f1')
                      echo "Sizes of file in workspace '${clientSpecName}' ... done (${resultOfSizes})."
                  }
              }
          }
          

          (4) Run job.

          (5) Wait for 6 minutes.

          (6) Job fails and the console log shows:

          14:27:54  Running sizes using a precommand trigger that delays for 6 minutes
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // timestamps
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] End of Pipeline
          java.lang.InterruptedException
          	at java.lang.Object.wait(Native Method)
          	at hudson.remoting.Request.call(Request.java:177)
          	at hudson.remoting.Channel.call(Channel.java:997)
          	at hudson.FilePath.act(FilePath.java:1069)
          	at hudson.FilePath.act(FilePath.java:1058)
          	at org.jenkinsci.plugins.p4.groovy.P4Groovy.run(P4Groovy.java:62)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
          	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
          	at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:47)
          	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
          	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:163)
          	at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
          	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:157)
          	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:161)
          	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:165)
          	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
          	at WorkflowScript.run(WorkflowScript:24)
          	at ___cps.transform___(Native Method)
          	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
          	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
          	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
          	at sun.reflect.GeneratedMethodAccessor315.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
          	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
          	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
          	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
          	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
          	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
          	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
          	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
          	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
          	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
          	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:400)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:312)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:276)
          	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          	at java.lang.Thread.run(Thread.java:748)
          Finished: FAILURE
          

          (7) Modify the trigger to only sleep for 20 seconds:

               Slow_Sizes command pre-user-sizes "sleep 20"
          

          (8) Job now works OK because the delay is short.

           

           

           

           


          Ian Boudreaux added a comment -

          I am hitting this issue as well using P4Groovy's p4.run with manual syncs on specific directories that can sometimes take longer than 5 minutes.

          p4karl Has there been any movement on this bug? It has been a few months since your last post.


          Karl Wirth added a comment -

          Hi ianboudreaux - not yet

          FYI p4paul - Can we include this one in the next sprint planning session, please?


          Paul Allen added a comment -

          I modified P4Groovy so that TaskListener is serialised (not transient). The 'tick' is now streamed from a remote slave; however, this does not prevent the InterruptedException.

          There seems to be a fixed 5-minute timeout in the CPS workflow:

          https://github.com/jenkinsci/workflow-cps-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java#L181

          In the past, proposals were made to extend this, but nothing was committed: https://github.com/jenkinsci/workflow-cps-plugin/pull/313

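To illustrate why a tick in the log would not help against a fixed time limit, here is a minimal sketch of a watchdog that interrupts a blocked thread once the limit expires. It assumes the limit is enforced by interrupting the waiting thread, which would be consistent with the InterruptedException at Object.wait in the console log above, but this ticket does not confirm the mechanism. The class `FixedTimeoutDemo` and the method `blockWithLimit` are hypothetical names, and the millisecond values stand in for the 5-minute limit.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class FixedTimeoutDemo {
    // Blocks the calling thread for workMillis; a watchdog interrupts it after limitMillis.
    // Returns "completed" if the wait ended by itself, "interrupted" if the watchdog fired first.
    static String blockWithLimit(long workMillis, long limitMillis) {
        Thread worker = Thread.currentThread();
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        watchdog.schedule(worker::interrupt, limitMillis, TimeUnit.MILLISECONDS);
        Object lock = new Object();
        try {
            synchronized (lock) {
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(workMillis);
                long remaining;
                while ((remaining = deadline - System.nanoTime()) > 0) {
                    // Blocking call; throws InterruptedException when the watchdog interrupts.
                    lock.wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remaining)));
                }
            }
            return "completed";
        } catch (InterruptedException e) {
            return "interrupted";
        } finally {
            watchdog.shutdownNow();
            Thread.interrupted(); // clear any interrupt flag still set on this thread
        }
    }

    public static void main(String[] args) {
        System.out.println(blockWithLimit(50, 2000));  // work finishes well inside the limit
        System.out.println(blockWithLimit(2000, 50));  // limit expires while still blocked
    }
}
```

In this sketch the outcome depends only on whether the blocking call finishes before the limit, regardless of any output produced in the meantime.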

          Paul Allen added a comment - - edited

          Blocked by fixed 5min timeout on CPS workflow.

          https://issues.jenkins.io/browse/JENKINS-58817


            p4paul Paul Allen
            p4karl Karl Wirth