
Jenkins pipeline not aborted when the machine running docker container goes offline

    • Type: New Feature
    • Resolution: Duplicate
    • Priority: Major
    • None
    • Jenkins ver. 2.53
      Pipeline job /
      Pipeline: Nodes and Processes plugins : ver. 2.10

       Preconditions

      Jenkins pipeline job is configured to run parallel actions in different docker swarm nodes.

      Procedure

      1. Run job
      2. Force disconnect of a node running a part of this job

      Actual outcome

      The job never terminates. The affected pipeline part remains stuck at:

      Cannot contact swarm-xxxxxxxx: hudson.remoting.RequestAbortedException: java.nio.channels.ClosedChannelException

      The exception is caught by workflow-durable-task-step-plugin and used to display the log message above.

      Expected outcome

      The pipeline part execution should generate an exception that can be caught.

      This would allow implementing a retry strategy in the Pipeline job.
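
To illustrate the kind of retry strategy meant here, the following is a minimal scripted Pipeline sketch. It assumes the requested behavior, namely that losing the agent makes the enclosed steps throw a catchable exception; with the behavior described under "Actual outcome" the catch block would never be reached. The helper name `runWithRetry`, the `swarm` label, and the script paths are hypothetical.

```groovy
// Hypothetical sketch: only works if a lost agent causes the steps inside
// node { ... } to throw, which is the behavior requested in this issue.
def runWithRetry(String label, int maxAttempts, Closure body) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // Allocate a fresh executor on each attempt, so a retry can
            // land on a different node than the one that went offline.
            node(label) {
                body()
            }
            return
        } catch (err) {
            echo "Attempt ${attempt} of ${maxAttempts} on '${label}' failed: ${err}"
            if (attempt == maxAttempts) {
                // Out of attempts: propagate the failure to fail the branch.
                throw err
            }
        }
    }
}

// 'swarm' and the script names below are placeholders.
parallel(
    partA: { runWithRetry('swarm', 3) { sh './run-part-a.sh' } },
    partB: { runWithRetry('swarm', 3) { sh './run-part-b.sh' } }
)
```

Note that this sketch catches every exception, including the one raised when a user aborts the build, so a real job would probably want to rethrow aborts instead of retrying them.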

          [JENKINS-43607] Jenkins pipeline not aborted when the machine running docker container goes offline

          Aymen Bouaziz created issue -
          Jesse Glick made changes -
          Issue Type Original: Bug [ 1 ] New: New Feature [ 2 ]
          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-36013 [ JENKINS-36013 ]

          Jesse Glick added a comment -

          As with JENKINS-36013, currently the model is that a node may go offline and later be reconnected, in which case the step will quietly resume printing output and exit normally. For Swarm or other cloud-like node schemes, a disconnection may be followed by an actual permanent removal of the node definition, in which case it would be desirable for the step to abort.


          Michael McCallum added a comment -

          jglick should this get more attention? There are a number of tickets and questions turning up online as ephemeral nodes are becoming way more common. GKE in particular makes it very cheap and easy.
          Jesse Glick made changes -
          Link New: This issue duplicates JENKINS-49707 [ JENKINS-49707 ]
          Jesse Glick made changes -
          Resolution New: Duplicate [ 3 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

            Assignee: Unassigned
            Reporter: Aymen Bouaziz (aymen_parrot)
            Votes: 1
            Watchers: 5
