• Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: durable-task-plugin
    • Jenkins ver. 1.580.1.1-beta-6 (Jenkins Enterprise by CloudBees 14.11)

      org.jenkinsci.plugins.workflow.cps.steps.ParallelStepException: Parallel step long running test task failed
      	at org.jenkinsci.plugins.workflow.cps.steps.ParallelStep$ResultHandler$Callback.checkAllDone(ParallelStep.java:126)
      	at org.jenkinsci.plugins.workflow.cps.steps.ParallelStep$ResultHandler$Callback.onFailure(ParallelStep.java:105)
      	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$FailureAdapter.receive(CpsBodyExecution.java:295)
      	at com.cloudbees.groovy.cps.impl.ThrowBlock$1.receive(ThrowBlock.java:68)
      	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      	at com.cloudbees.groovy.cps.Next.step(Next.java:58)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:145)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:164)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:262)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$000(CpsThreadGroup.java:70)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:174)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:172)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:47)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      Caused by: hudson.AbortException: script returned exit code -1
      	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:205)
      	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:159)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
      	... 3 more
      Finished: FAILURE
      

          [JENKINS-25727] Occasional exit status -1 with long latencies

          Kohsuke Kawaguchi added a comment -

          The fake exit code "-1" signifies that the process has disappeared without leaving the exit code file behind.

          Looking into why this is the case.
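
          For context on that exit-code file: the durable-task approach launches the user script through a wrapper that records its exit code in a result file, and the plugin polls for that file. The sketch below illustrates the kind of polling logic that can produce the fake -1; the class and member names are illustrative, not the plugin's actual API.

              import java.io.IOException;
              import java.nio.charset.StandardCharsets;
              import java.nio.file.Files;
              import java.nio.file.Path;

              class ExitStatusSketch {
                  private final Path resultFile; // e.g. jenkins-result.txt, written by the wrapper on exit

                  ExitStatusSketch(Path resultFile) {
                      this.resultFile = resultFile;
                  }

                  /** Reads the exit code the wrapper wrote, or null if the file does not exist yet. */
                  Integer readResultFile() throws IOException {
                      if (!Files.exists(resultFile)) {
                          return null;
                      }
                      String text = new String(Files.readAllBytes(resultFile), StandardCharsets.UTF_8);
                      return Integer.parseInt(text.trim());
                  }

                  /** Returns the exit code, null while the script is still running, or the fake -1. */
                  Integer exitStatus() throws IOException {
                      Integer status = readResultFile();
                      if (status == null && !isWrapperAlive()) {
                          return -1; // process vanished without leaving the exit code file behind
                      }
                      return status;
                  }

                  boolean isWrapperAlive() {
                      // Placeholder: the real controller checks the PID recorded by the wrapper script.
                      return false;
                  }
              }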

          Kohsuke Kawaguchi added a comment -

          varmenise confirmed that the master and a slave have a large latency between them, and that the issue happens about once in 10 runs. So the race-condition hypothesis feels more plausible.

          Jesse Glick added a comment -

          88aed02 was apparently not enough. We have a PID for the wrapper script, and jenkins-result.txt has not been created, yet the wrapper script does not seem to be running any more. Unclear what leads to this situation.

          Jesse Glick added a comment -

          kohsuke suggests that ShellController.exitStatus hits a race condition: the checker calls exitStatus while the process is still running, which returns null; then the process finishes; then isAlive is called and reports that the process is no longer running.

          The probable fix is to recheck exitStatus before returning -1.

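          The suspected interleaving, in terms of the sketch above: readResultFile() returns null while the script is still running, the script then exits and writes the result file, and isWrapperAlive() subsequently reports the process gone, so -1 is returned even though a valid exit code now exists on disk. A hedged sketch of the recheck fix, as a method that could sit in the ExitStatusSketch class above:

              /** Proposed fix (sketch): recheck the result file before returning the fake -1. */
              Integer exitStatusRechecked() throws IOException {
                  Integer status = readResultFile();          // null while the script runs
                  if (status == null && !isWrapperAlive()) {  // process looks gone
                      // The script may have finished and written the file between the
                      // two calls above; recheck before concluding that it vanished.
                      status = readResultFile();
                      if (status == null) {
                          status = -1; // genuinely gone without an exit code file
                      }
                  }
                  return status;
              }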

          Jesse Glick added a comment -

          In fact he already attempted that fix in https://github.com/jenkinsci/durable-task-plugin/commit/10a3ebdc1e4825fd334cfe58ecf294c9384d5f06 though this is not complete.

          Jesse Glick added a comment -

          Released attempted fix in Durable Task plugin 1.0.

          valentina armenise added a comment -

          Tested with plugin version 1.0. It worked.

          A C added a comment -

          Reopening. A similar hang is occasionally occurring again with Jenkins 1.6.13 - 1.6.15 and Workflow 1.6 during bat steps on Windows. The process that Workflow is waiting for has ended successfully according to the log output, and there are no rogue generated batch files or directories left behind either, so it still seems like a race condition exists.

          Why is this concatenating to a jenkins-results.txt file anyway? That doesn't seem very robust. Can't we just re-pipe standard output and standard error directly?

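          On the question of piping directly: one plausible reason for the file-based design (an assumption about the rationale, not something confirmed in this thread) is durability, since the wrapper keeps appending to a log file on the agent even if the master-agent channel drops or Jenkins restarts, and each poll forwards only the bytes appended since the previous one. A minimal sketch of that incremental copy, with illustrative names:

              import java.io.IOException;
              import java.io.OutputStream;
              import java.io.RandomAccessFile;

              class LogCopySketch {
                  private long offset; // how many log-file bytes have been forwarded so far

                  /** Forwards newly appended bytes from the on-agent log file to the build log. */
                  void copyNewOutput(String logFile, OutputStream buildLog) throws IOException {
                      try (RandomAccessFile raf = new RandomAccessFile(logFile, "r")) {
                          raf.seek(offset); // resume where the previous poll stopped
                          byte[] buf = new byte[8192];
                          int n;
                          while ((n = raf.read(buf)) > 0) {
                              buildLog.write(buf, 0, n);
                              offset += n;
                          }
                      }
                  }
              }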

          Jesse Glick added a comment -

          sumdumgai I am not sure what bug you are seeing, but it sounds different from this one. Better to file it separately. Note that 1.7 included a fix in a related area.

          A C added a comment -

          OK. WF 1.7 seems to have made this problem worse; tracking a possibly related symptom in JENKINS-28604.

            Assignee: Jesse Glick (jglick)
            Reporter: valentina armenise (varmenise)