• Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component/s: durable-task-plugin
    • Environment: jenkins 2.19.4
      durable-task 1.12
      Linux server
      Windows slave
      java 1.8.0_51

      Pipeline jobs occasionally hang on bat steps. This seems similar to JENKINS-34150 but we are using durable-task 1.12 which has the fix for that.

      Thread dump from the job (after 18 hours):

      Thread #26
      	at DSL.bat(awaiting process completion in C:\j\w\<folder>\<job>@tmp\durable-56b1eae1 on <slave>)
      	at WorkflowScript.run(WorkflowScript:349)
      	at DSL.withEnv(Native Method)
      	at WorkflowScript.run(WorkflowScript:242)
      	at DSL.stage(Native Method)
      	at WorkflowScript.run(WorkflowScript:156)
      	at DSL.node(running on <slave>)
      	at WorkflowScript.run(WorkflowScript:34)
      

      The bat step is running a batch file:

      cmd /c call test.bat ....
      

      The batch file in turn runs a Python script which, in this case, throws an exception (I can see this from the log files on the slave). On the slave, the "durable-56b1eae1" folder is present and contains jenkins-log.txt, jenkins-main.bat and jenkins-wrap.bat. There is no sign of the batch process on the slave, so I presume that it has completed. The build continues to occupy an executor slot. There are also several flyweight tasks from the matrix plugin on the same slave.

      Please let me know if there is anything else I can do to help diagnose this.
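
To make the state of the control folder described above easier to report, here is a minimal Python sketch that could be run on the slave against the "durable-..." folder. It lists which files are present and checks for a result file. The name `jenkins-result.txt` is an assumption (it is not mentioned in this report; it is the file durable-task is understood to poll for the exit code), and `inspect_control_dir` is a hypothetical helper, not part of any Jenkins tooling.

```python
from pathlib import Path
import sys

# File names the reporter found in the control folder.
SEEN_IN_REPORT = ("jenkins-log.txt", "jenkins-main.bat", "jenkins-wrap.bat")
# ASSUMPTION: name of the exit-code file the step waits for; not named in the report.
RESULT_FILE = "jenkins-result.txt"


def inspect_control_dir(control_dir):
    """Summarise which control files exist in a durable-task folder."""
    root = Path(control_dir)
    present = sorted(p.name for p in root.iterdir() if p.is_file())
    result_path = root / RESULT_FILE
    exit_code = None
    if result_path.is_file():
        # The file may contain trailing spaces or CRLF, so strip it.
        exit_code = result_path.read_text(errors="replace").strip() or None
    return {
        "present": present,
        "missing_expected": [n for n in SEEN_IN_REPORT if n not in present],
        "has_result": result_path.is_file(),
        "exit_code": exit_code,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_control_dir.py <path to the durable-... folder>
    for key, value in inspect_control_dir(sys.argv[1]).items():
        print(f"{key}: {value}")
```

If `has_result` is false while the batch process is gone, that would suggest the wrapper never recorded an exit code; if it is true, the exit code was written but the step did not pick it up. Either observation would be worth attaching to the issue.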

          [JENKINS-41482] Pipeline bat step hangs (after restart)

          Russell Gallop created issue -
          Russell Gallop made changes -
          Description: added "so I presume that it has completed" to the sentence "There is no sign of the batch process on the slave." The rest of the description is unchanged.
          Russell Gallop made changes -
          Description: in the thread dump, "at DSL.node(running on jagent-win14)" was changed to "at DSL.node(running on <slave>)". The rest of the description is unchanged.
          Russell Gallop made changes -
          Environment Original: jenkins 2.19.4
          durable-task 1.12
          Linux server
          Windows10 slave
          java 1.8.0_51
          New: jenkins 2.19.4
          durable-task 1.12
          Linux server
          Windows slave
          java 1.8.0_51
          Russell Gallop made changes -
          Summary Original: Pipeline bat step hangs New: Pipeline bat step hangs (after restart)
          Andrew Bayer made changes -
          Component/s New: workflow-durable-task-step-plugin [ 21715 ]
          Component/s Original: pipeline [ 21692 ]
          Jesse Glick made changes -
          Component/s Original: workflow-durable-task-step-plugin [ 21715 ]
          Labels New: windows
          Vivek Pandey made changes -
          Labels Original: windows New: triaged-2018-11 windows

            Assignee: Unassigned
            Reporter: Russell Gallop
            Votes: 3
            Watchers: 5
