JENKINS-52842

xUnit plugin blocks PingThread responses

    Details

    • Released As: 2.3.8

      Description

      If your xUnit parsing takes a long time (don't ask), the entire build agent can be kicked offline because pings exceed the timeout threshold (default 4 minutes).

       

       

      Agent:

      12:39:02 [xUnit] [INFO] - [NUnit-2 (default)] - 110 test report file(s) were found with the pattern '*/TestResult.xml' relative to '/home/builder/jenkins/workspace/foo' for the testing framework 'NUnit-2 (default)'.
      12:44:28 [xUnit] [ERROR] - The plugin hasn't been performed correctly: Remote call on JNLP4-connect connection from 1.1.1.1/1.1.1.1:1880 failed

      Server:

      INFO: Ping failed. Terminating the channel JNLP4-connect connection from 1.1.1.1/1.1.1.1:7468.
      java.util.concurrent.TimeoutException: Ping started at 1532979041327 hasn't completed by 1532979281327
      at hudson.remoting.PingThread.ping(PingThread.java:134)
      at hudson.remoting.PingThread.run(PingThread.java:90)
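
The two numbers in the TimeoutException above are epoch timestamps in milliseconds, and their difference is the ping timeout that was exceeded. A minimal sketch of that arithmetic follows; the class and method names are hypothetical helpers for illustration and are not part of Jenkins or the remoting library.

```java
import java.time.Duration;

// Hypothetical helper (not a Jenkins API): computes the time between the two
// epoch-millisecond values in "Ping started at X hasn't completed by Y".
class PingTimeoutCheck {
    static Duration elapsed(long startedAtMillis, long notCompletedByMillis) {
        return Duration.ofMillis(notCompletedByMillis - startedAtMillis);
    }

    public static void main(String[] args) {
        // Values copied from the server log in the description.
        Duration d = elapsed(1532979041327L, 1532979281327L);
        System.out.println(d.toMillis() + " ms = " + d.toMinutes() + " min");
        // prints "240000 ms = 4 min"
    }
}
```

The result matches the 4 minute default mentioned in the description, which suggests the agent did not answer a single ping for the whole timeout window.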

        Attachments

          Activity

          nfalco Nikolas Falco added a comment -

          Please keep in mind that transformation is a CPU-intensive operation. With complex XSLT, large XML reports, and heavy volume, 100% CPU usage may be expected; this may cause the agent's JVM threads to freeze and fail to serve the PingThread.

          Let me try a few things: a thread pool, a sleep after every 10 transformations, updating the Saxon libraries...

          nfalco Nikolas Falco added a comment -

          Reopen if it happens again.

          johnlengeling John Lengeling added a comment -

          Nikolas,

          Looks like I have run into this issue. We generate a lot of XML files, sometimes 800+. Looks like the ping thread killed it.

          We are running Jenkins 2.204.1 and xunit 2.3.7. Is there a workaround or a snapshot to test?

          Thanks!

           

          Error message:

          Processing xunit results failed, archiving test result files foo/wr8-64/build/_TestArtifacts*/*.xml for troubleshooting
          [Pipeline] archiveArtifacts
          10:07:27 EC2 (foo-es-aws) - team-foo.fooTool-large (i-06488c01860a95d82) was marked offline: Connection was broken: java.util.concurrent.TimeoutException: Ping started at 1580227324607 hasn't completed by 1580227564613
          10:07:27 at hudson.remoting.PingThread.ping(PingThread.java:133)
          10:07:27 at hudson.remoting.PingThread.run(PingThread.java:89)
          10:07:27
          [Pipeline] }
          [Pipeline] // script
          Error when executing always post condition:
          java.lang.IllegalArgumentException: Failed to prepare archiveArtifacts step
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeDescribable(DSL.java:419)
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:182)
              at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
              at com.fooCorp.pipeline.fooTool.publishUnitTests(fooTool.groovy:597)
              at com.fooCorp.pipeline.fooTool.publishfooToolReports(fooTool.groovy:666)
              at com.fooCorp.pipeline.fooTool.publish(fooTool.groovy:610)
              at WorkflowScript.run(WorkflowScript:61)
              at __cps.transform__(Native Method)
              at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
              at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
              at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
              at sun.reflect.GeneratedMethodAccessor960.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
              at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.dispatch(CollectionLiteralBlock.java:55)
              at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.item(CollectionLiteralBlock.java:45)
              at sun.reflect.GeneratedMethodAccessor988.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
              at com.cloudbees.groovy.cps.impl.LocalVariableBlock$LocalVariable.get(LocalVariableBlock.java:39)
              at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
              at com.cloudbees.groovy.cps.impl.LocalVariableBlock.evalLValue(LocalVariableBlock.java:28)
              at com.cloudbees.groovy.cps.LValueBlock$BlockImpl.eval(LValueBlock.java:55)
              at com.cloudbees.groovy.cps.LValueBlock.eval(LValueBlock.java:16)
              at com.cloudbees.groovy.cps.Next.step(Next.java:83)
              at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
              at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
              at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
              at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
              at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
              at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
              at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
              at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:405)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:317)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:281)
              at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
              at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
              at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
          Caused by: org.codehaus.groovy.runtime.InvokerInvocationException: java.io.IOException: Unable to create live FilePath for EC2 (foo-es-aws) - team-foo.fooTool-large (i-06488c01860a95d82)
              at org.jenkinsci.plugins.workflow.cps.CpsStepContext.replay(CpsStepContext.java:496)
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:317)
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeDescribable(DSL.java:417)
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:182)
              at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
              ... 41 more
          Caused by: java.io.IOException: Unable to create live FilePath for EC2 (foo-es-aws) - team-foo.fooTool-large (i-06488c01860a95d82)
              at org.jenkinsci.plugins.workflow.support.steps.FilePathDynamicContext.get(FilePathDynamicContext.java:64)
              at org.jenkinsci.plugins.workflow.support.steps.FilePathDynamicContext.get(FilePathDynamicContext.java:47)
              at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:94)
              at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:138)
              at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
              at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
              at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:67)
              at org.jenkinsci.plugins.workflow.steps.StepDescriptor.checkContextAvailability(StepDescriptor.java:264)
              at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:263)
              ... 48 more
              Suppressed: hudson.model.Computer$TerminationRequest: Termination requested at Tue Jan 28 10:06:04 CST 2020 by Thread[Ping thread for channel hudson.remoting.Channel@1f0be3d7:EC2 (foo-es-aws) - team-foo.fooTool-large (i-06488c01860a95d82),5,main] [id=79994]
                  at hudson.model.Computer.recordTermination(Computer.java:226)
                  at hudson.model.Computer.disconnect(Computer.java:490)
                  at hudson.slaves.SlaveComputer.disconnect(SlaveComputer.java:727)
                  at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:198)
                  at hudson.remoting.PingThread.ping(PingThread.java:133)
                  at hudson.remoting.PingThread.run(PingThread.java:89)

           

          johnlengeling John Lengeling added a comment -

          We are seeing the ping thread killing xunit processing as described. I have provided the stack trace and error messages.

          nfalco Nikolas Falco added a comment - - edited

          Which step is at fooTool.groovy:597?

          I have no idea what to do, because I know neither the exact point where it blocks nor the reason. The only thing I can assume is that the JVM is keeping the CPU busy with the XSLT. The only test I can do is to add a sort of thread sleep after every 50 transformations.

          Obviously this change may have no effect if the ping thread only monitors before and after the execution of a callable (that is, the whole step).
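
The throttling idea in this comment (pause briefly after a fixed number of transformations so other threads get CPU time) could be sketched roughly as below. This is an illustration only, not the plugin's actual code: the class name, method name, and parameters are assumptions, and the real transformation step is replaced by a placeholder callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of "sleep after every N items"; not taken from the xUnit plugin.
class ThrottledBatch {
    /**
     * Applies work to each item and sleeps for sleepMillis after every
     * batchSize items (but not after the last item). Returns the number of
     * pauses taken. Stops early if the thread is interrupted while sleeping.
     */
    static <T> int processWithPauses(List<T> items, Consumer<? super T> work,
                                     int batchSize, long sleepMillis) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pauses = 0;
        int processed = 0;
        for (T item : items) {
            work.accept(item);
            processed++;
            boolean batchDone = processed % batchSize == 0 && processed < items.size();
            if (sleepMillis > 0 && batchDone) {
                try {
                    Thread.sleep(sleepMillis); // yield the CPU to other threads
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return pauses;
                }
                pauses++;
            }
        }
        return pauses;
    }

    public static void main(String[] args) {
        List<String> reports = new ArrayList<>();
        for (int i = 0; i < 120; i++) {
            reports.add("report-" + i + ".xml");
        }
        // Placeholder work; a real caller would run the XSLT transformation here.
        int pauses = processWithPauses(reports, r -> { }, 50, 10L);
        System.out.println("pauses: " + pauses); // prints "pauses: 2"
    }
}
```

A sleep of 0 ms disables the pauses entirely, so the same loop serves callers that do not want throttling.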

          nfalco Nikolas Falco added a comment -

          Please try with this: https://ci.jenkins.io/job/Plugins/job/xunit-plugin/job/feature%252FJENKINS-52842/2/artifact/org/jenkins-ci/plugins/xunit/2.3.8-rc831.cbb77af6dfed/xunit-2.3.8-rc831.cbb77af6dfed.hpi

          johnlengeling John Lengeling added a comment -

          Sorry, I sent you the stack trace from the job console, which is from after the job was aborted. fooTool.groovy is our Jenkins pipeline library; it has a publishUnitTests method which calls XUnitBuilder.

          try {
              steps.step([
                  $class        : 'XUnitBuilder',
                  testTimeMargin: '3000',
                  thresholdMode : 2,
                  thresholds    : [
                      [
                          $class              : 'FailedThreshold',
                          failureNewThreshold : '100',
                          failureThreshold    : '100',
                          unstableNewThreshold: '100',
                          unstableThreshold   : '100'
                      ],
                      [
                          $class              : 'SkippedThreshold',
                          failureNewThreshold : '100',
                          failureThreshold    : '100',
                          unstableNewThreshold: '100',
                          unstableThreshold   : '100'
                      ]
                  ],
                  tools         : [
                      [
                          $class               : 'GoogleTestType',
                          deleteOutputFiles    : true,
                          failIfNotNew         : false,
                          pattern              : filePattern,
                          skipNoTestFiles      : true,
                          stopProcessingIfError: true
                      ]
                  ]
              ])
              // (the remainder of the try block is not included in the original excerpt)

          Here is the thread dump of the thread that is hanging in xunit:

          "org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution 2624 / waiting for EC2 (foo-aws) - team-bar-large (i-086328a69a4a0b130) id=8001805" daemon prio=5 TIMED_WAITING
              java.lang.Object.wait(Native Method)
              hudson.remoting.Request.call(Request.java:177)
              hudson.remoting.Channel.call(Channel.java:954)
              hudson.FilePath.act(FilePath.java:1069)
              hudson.FilePath.act(FilePath.java:1058)
              org.jenkinsci.plugins.xunit.XUnitProcessor.processTestsReport(XUnitProcessor.java:195)
              org.jenkinsci.plugins.xunit.XUnitProcessor.process(XUnitProcessor.java:159)
              org.jenkinsci.plugins.xunit.XUnitBuilder.perform(XUnitBuilder.java:126)
              org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:80)
              org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:67)

          johnlengeling John Lengeling added a comment -

          Will test the plugin version that you provided.

          johnlengeling John Lengeling added a comment -

          Nikolas,

          I had 7 successful builds before the pingThread killed the node during xunit processing when running xunit version 2.3.8-rc831.cbb77af6dfed.

           

          Console Output:

          [2020-02-10T13:35:50.172Z] WARNING: XUnitBuilder step is deprecated since 2.x, it has been replaced by XUnitPublisher. This builer will be remove in version 3.x
          [2020-02-10T13:35:50.175Z] INFO: Starting to record.
          [2020-02-10T13:35:50.175Z] INFO: Processing GoogleTest-1.8
          [2020-02-10T13:35:50.250Z] INFO: [GoogleTest-1.8] - 959 test report file(s) were found with the pattern 'j/wr/build/_TestArtifacts*/*.xml' relative to '/home/jenkins/workspace/kb/os' for the testing framework 'GoogleTest-1.8'.

          Threads related to this AWS node i-0b11d0f08d8d59c51:

          dev-large (i-0b11d0f08d8d59c51)

          "org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution [#93] / waiting for EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) id=166000" daemon prio=5 TIMED_WAITING
              java.lang.Object.wait(Native Method)
              hudson.remoting.Request.call(Request.java:177)
              hudson.remoting.Channel.call(Channel.java:954)
              hudson.FilePath.act(FilePath.java:1069)
              hudson.FilePath.act(FilePath.java:1058)
              org.jenkinsci.plugins.xunit.XUnitProcessor.processTestsReport(XUnitProcessor.java:195)
              org.jenkinsci.plugins.xunit.XUnitProcessor.process(XUnitProcessor.java:159)
              org.jenkinsci.plugins.xunit.XUnitBuilder.perform(XUnitBuilder.java:126)
              org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:80)
              org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:67)
              org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
              org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$$Lambda$339/847132060.run(Unknown Source)
              java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              java.util.concurrent.FutureTask.run(FutureTask.java:266)
              java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              java.lang.Thread.run(Thread.java:748)

          "Channel reader thread: EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51)" daemon prio=5 WAITING
              java.lang.Object.wait(Native Method)
              java.lang.Object.wait(Object.java:502)
              com.trilead.ssh2.channel.FifoBuffer.read(FifoBuffer.java:212)
              com.trilead.ssh2.channel.Channel$Output.read(Channel.java:127)
              com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:933)
              com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58)
              com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:79)
              hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:91)
              hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
              hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
              hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
              hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
              hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)

          "Monitoring EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) for Remoting Version / waiting for EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) id=166965" daemon prio=5 TIMED_WAITING
              java.lang.Object.wait(Native Method)
              hudson.remoting.Request.call(Request.java:177)
              hudson.remoting.Channel.call(Channel.java:954)
              hudson.plugin.versioncolumn.VersionMonitor$1.monitor(VersionMonitor.java:58)
              hudson.plugin.versioncolumn.VersionMonitor$1.monitor(VersionMonitor.java:55)
              hudson.node_monitors.AbstractNodeMonitorDescriptor.monitor(AbstractNodeMonitorDescriptor.java:154)
              hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:306)
          

           

          Show
          johnlengeling John Lengeling added a comment - Nickolas, I had 7 successful builds before the pingThread killed the node during xunit processing when running xunit version  2.3.8-rc831.cbb77af6dfed.   Console Output:Console Output: [2020-02-10T13:35:50.172Z] WARNING: XUnitBuilder step is deprecated since 2.x, it has been replaced by XUnitPublisher. This builer will be remove in version 3.x [2020-02-10T13:35:50.175Z] INFO: Starting to record. [2020-02-10T13:35:50.175Z] INFO: Processing GoogleTest-1.8[2020-02-10T13:35:50.250Z] INFO: [GoogleTest-1.8] - 959 test report file(s) were found with the pattern 'j/wr/build/_TestArtifacts*/*.xml' relative to '/home/jenkins/workspace/kb/os' for the testing framework 'GoogleTest-1.8'. Theads related to this aws node i-0b11d0f08d8d59c51: dev-large (i-0b11d0f08d8d59c51)  "org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution [#93] / waiting for EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) id=166000" daemon prio=5 TIMED_WAITING java.lang.Object.wait(Native Method) hudson.remoting.Request.call(Request.java:177) hudson.remoting.Channel.call(Channel.java:954) hudson.FilePath.act(FilePath.java:1069) hudson.FilePath.act(FilePath.java:1058) org.jenkinsci.plugins.xunit.XUnitProcessor.processTestsReport(XUnitProcessor.java:195) org.jenkinsci.plugins.xunit.XUnitProcessor.process(XUnitProcessor.java:159) org.jenkinsci.plugins.xunit.XUnitBuilder.perform(XUnitBuilder.java:126) org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:80) org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:67) org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47) org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$$Lambda$339/847132060.run(Unknown Source) java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) java.util.concurrent.FutureTask.run(FutureTask.java:266) 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) java.lang.Thread.run(Thread.java:748)   "Channel reader thread: EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51)" daemon prio=5 WAITING java.lang.Object.wait(Native Method) java.lang.Object.wait(Object.java:502) com.trilead.ssh2.channel.FifoBuffer.read(FifoBuffer.java:212) com.trilead.ssh2.channel.Channel$Output.read(Channel.java:127) com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:933) com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58) com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:79) hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:91) hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72) hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103) hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39) hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34) hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)   "Monitoring EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) for Remoting Version / waiting for EC2 (foo-aws) - dev-large (i-0b11d0f08d8d59c51) id=166965" daemon prio=5 TIMED_WAITING java.lang.Object.wait(Native Method) hudson.remoting.Request.call(Request.java:177) hudson.remoting.Channel.call(Channel.java:954) hudson.plugin.versioncolumn.VersionMonitor$1.monitor(VersionMonitor.java:58) hudson.plugin.versioncolumn.VersionMonitor$1.monitor(VersionMonitor.java:55) hudson.node_monitors.AbstractNodeMonitorDescriptor.monitor(AbstractNodeMonitorDescriptor.java:154) hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:306)  
          nfalco Nikolas Falco added a comment -

          I can try increasing the sleep time applied after every 20 processed input files, and maybe also try capping the maximum number of threads used by Saxon. By default, Saxon uses as many threads as it can, which puts the CPU at full load.
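The thread-capping idea above can be illustrated with a minimal sketch. It assumes the transformations are submitted as independent tasks to a fixed-size pool; it does not use any Saxon API, and `BoundedTransformPool`, `workerCount`, and `runAll` are hypothetical names chosen for illustration. Leaving one core unused does not guarantee that the ping thread gets scheduled, it only reduces CPU contention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the xUnit plugin's code and not a Saxon API.
public class BoundedTransformPool {

    // Leave one core free (but always keep at least one worker).
    static int workerCount(int availableCores) {
        return Math.max(1, availableCores - 1);
    }

    // Runs every task on a fixed-size pool and returns how many completed.
    static int runAll(List<Runnable> tasks, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable task : tasks) {
                futures.add(pool.submit(task));
            }
            int done = 0;
            for (Future<?> future : futures) {
                future.get(); // rethrows any failure raised inside the task
                done++;
            }
            return done;
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

    public static void main(String[] args) throws Exception {
        int workers = workerCount(Runtime.getRuntime().availableProcessors());
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            // Placeholder standing in for one transformation of one report file.
            tasks.add(() -> { Math.sqrt(id); });
        }
        System.out.println("workers=" + workers + " completed=" + runAll(tasks, workers));
    }
}
```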

          johnlengeling John Lengeling added a comment -

          Other than the 1 hung job soon after I loaded xunit-2.3.8-rc831.cbb77af6dfed, the issue looks to be improved.  

          The test job has been running hourly for the past 2 days with 0 hangs versus several hangs per day with 2.3.7.  So you might be on the right track.

          I did also increase the ping thread interval and timeout values to 1500/1200, but I have now restored the ping thread interval/timeout values back to the defaults (300/240).  I will let that bake for a few days.

           

          nfalco Nikolas Falco added a comment -

          I have completed the work. The sleep time is now configurable, so further changes to the code should not be needed.

          For backward compatibility the sleep time is 0 ms for old configurations (this does not apply to pipelines). For new definitions the default is 10 ms.

          John Lengeling, your output log warns that you are using a deprecated class, and you instantiate it by reflection from the pipeline (steps.step([$class : 'XUnitBuilder', testTimeMargin: '3000', ....), which is subject to any class changes. I strongly suggest replacing it with the xunit step (which is not XUnitBuilder), because it handles the configuration of the sleep parameter (XUnitBuilder does not).

          If I remember correctly, you tested 20 ms after every 20 processed reports. The default is now 10 ms after every 10 processed reports.

          xunit thresholdMode: 2, thresholds: [failed(failureThreshold: '100', unstableThreshold: '100'), skipped(failureThreshold: '100', unstableThreshold: '100')], tools: [GoogleTest(deleteOutputFiles: true, failIfNotNew: false, pattern: $filePattern, skipNoTestFiles: true, stopProcessingIfError: true)], sleepTime: 15
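The throttling described in this comment (pause briefly after every batch of processed reports, with 0 ms meaning no pause) can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: `ThrottledProcessor`, `batchSize`, and `sleepMillis` are hypothetical names, and the report count of 959 is taken from the console output quoted earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: pause after every batch so other threads get CPU time.
public class ThrottledProcessor {
    private final int batchSize;
    private final long sleepMillis;

    ThrottledProcessor(int batchSize, long sleepMillis) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sleepMillis = sleepMillis;
    }

    // Processes every item in order and returns the number of pauses taken.
    <T> int processAll(List<T> items, Consumer<T> work) throws InterruptedException {
        int processed = 0;
        int pauses = 0;
        for (T item : items) {
            work.accept(item);
            processed++;
            // A sleep time of 0 disables throttling (the old-configuration case).
            if (sleepMillis > 0 && processed % batchSize == 0) {
                Thread.sleep(sleepMillis);
                pauses++;
            }
        }
        return pauses;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> reports = new ArrayList<>();
        for (int i = 0; i < 959; i++) {
            reports.add("report-" + i + ".xml"); // made-up file names
        }
        // 10 ms after every 10 reports, matching the defaults stated above.
        ThrottledProcessor processor = new ThrottledProcessor(10, 10);
        int pauses = processor.processAll(reports, name -> { /* transform one report here */ });
        System.out.println("pauses=" + pauses);
    }
}
```

With 959 reports and a batch of 10, the loop pauses 95 times, so the added delay is just under one second in total.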
          

            People

            Assignee:
            nfalco Nikolas Falco
            Reporter:
            directhex Jo Shields
            Votes:
            1
            Watchers:
            4
