
A connection interruption causes the pipeline to fail when USE_WATCHING=true

      Run Jenkins with -Dorg.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep.USE_WATCHING=true. Add an agent launched via SSH (the launch method may not be important; this is just what I've observed the issue with).
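One way to pass the flag, sketched under the assumption that Jenkins is started directly from its WAR file (other installations set JVM options elsewhere). The helper name `jenkins_launch_command` and the `JENKINS_WAR` variable are hypothetical; only the system property itself comes from this report. The function prints the command rather than running it, so it can be reviewed first.

```shell
# System property quoted from the report. It has to appear before -jar,
# otherwise the JVM passes it to the application as a plain argument.
USE_WATCHING_FLAG='-Dorg.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep.USE_WATCHING=true'

# Print the launch command line.
# JENKINS_WAR is a hypothetical variable; it defaults to ./jenkins.war (an assumption).
jenkins_launch_command() {
    printf 'java %s -jar %s\n' "$USE_WATCHING_FLAG" "${JENKINS_WAR:-jenkins.war}"
}
```

Calling `jenkins_launch_command` prints a single line that can be pasted into a terminal on the controller host.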

      Add a pipeline job with this script:

      node('mynode') {
          sh '''#!/bin/sh -e
              for n in $(seq 100); do
                  echo "$n"
                  sleep 1
              done
          '''
          sh 'echo OK'
      }
      

      Run the pipeline. When it starts printing numbers to the log, disconnect the master from the network. After 30 seconds, reconnect it.

      After the reconnection, nothing new appears in the log for a while (I haven't measured it, but it feels like a couple of minutes). Then the job completes instantly, but:

      • Some of the output is missing from the log.
      • The "echo OK" step doesn't run.
      • The pipeline fails with an EOFException.

      I'm attaching a full example log.

      By contrast, with USE_WATCHING=false the log resumes a few seconds after the reconnection, no output is skipped and the job succeeds.
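To quantify the missing output, the saved console log can be checked against the 1..100 sequence that the script prints. The following is a minimal sketch, not part of the original report: `missing_numbers` is a hypothetical helper, and it assumes the numbers appear on lines of their own in the log (no timestamp or other prefix).

```shell
# Print every number from 1 to MAX (default 100) that does not appear
# as a whole line in the console log read from standard input.
missing_numbers() {
    max="${1:-100}"
    # Keep only the lines made up entirely of digits; other console
    # output (pipeline annotations, stack traces) is dropped.
    seen="$(grep -E '^[0-9]+$' | sort -n -u)"
    n=1
    while [ "$n" -le "$max" ]; do
        # -x matches the whole line, so 1 does not match 10 or 100.
        printf '%s\n' "$seen" | grep -q -x "$n" || printf '%s\n' "$n"
        n=$((n + 1))
    done
}
```

For example, `missing_numbers < console.log` (the file name is an assumption) lists the numbers that never reached the log. With USE_WATCHING=false the output would be expected to be empty, since no output is skipped in that mode.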

          [JENKINS-54643] A connection interruption causes the pipeline to fail when USE_WATCHING=true

          Roman Donchenko created issue -
          Vivek Pandey made changes -
          Labels Original: regression New: regression triaged-2018-11
          Jesse Glick made changes -
          Assignee New: Jesse Glick [ jglick ]
          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-52165 [ JENKINS-52165 ]
          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-41854 [ JENKINS-41854 ]
          Roman Donchenko made changes -
          Description: typo corrected ("When is starts" changed to "When it starts"); the text is otherwise unchanged.
          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-56851 [ JENKINS-56851 ]
          Jesse Glick made changes -
          Resolution New: Duplicate [ 3 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

            Assignee: Jesse Glick (jglick)
            Reporter: Roman Donchenko (rdonchen_intel)