  1. Jenkins
  2. JENKINS-34376

Cannot easily/safely catch exceptions without disrupting job abortion

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Component/s: pipeline
    • Labels: None
    • Environment: Jenkins ver. 1.656, pipeline 2.0

    Description

      There doesn't seem to be a safe method of catching errors that might occur while running a shell command without disrupting the normal job stopping process. I would like to be able to catch the error (as it might just be a failing test), log it, and continue running the pipeline script. My main concern is being able to still allow stopping or aborting the pipeline. This would suggest to me that I need to allow exceptions to bubble up if it is due to an abort.

      Manually aborting (by pressing the red stop button in the GUI) raises either an AbortException or a FlowInterruptedException, depending on whether the pipeline is running a shell command at the time. This makes it difficult to tell whether the shell command failed or a job abort was requested. If a shell command is aborted, the AbortException will at least (I think) carry a message about exit code 143.
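
As a rough illustration of why the exit code is a usable hint: shells conventionally report 128 plus the signal number when a process is killed by a signal, so 143 corresponds to signal 15 (SIGTERM). The sketch below is plain Groovy that runs outside Jenkins. The helper names `exitCodeFrom` and `signalFrom` are hypothetical, and the message format is the one matched in this issue, which may differ between plugin versions.

```groovy
// Hypothetical helper: pulls the exit code out of a message of the form
// "script returned exit code N". Returns null when the message does not match.
Integer exitCodeFrom(String message) {
    def m = (message ?: '') =~ /(?i)script returned exit code (\d+)/
    return m.find() ? m.group(1).toInteger() : null
}

// Hypothetical helper: applies the common shell convention that an exit code
// above 128 means "killed by signal (code - 128)". Returns null otherwise.
Integer signalFrom(Integer exitCode) {
    return (exitCode != null && exitCode > 128) ? exitCode - 128 : null
}

def code = exitCodeFrom('script returned exit code 143')
println "exit code ${code}, signal ${signalFrom(code)}"
```

An ordinary test failure (for example exit code 1) yields no signal number, which is what lets a handler separate "the script failed" from "the script was killed".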

      To further confuse things, I am wrapping some shell scripts with a timeout step. Timeout also raises FlowInterruptedException but at least the exception will have ExceededTimeout as the cause.

      I thought maybe I could look at currentBuild.result, but it always seems to be null. I looked at 'catchError', but that doesn't set the result to 'ABORTED' when stopping during a shell command. (It also seems to ignore a user's attempt to abort the job.)

      Basically, to be able to recover from errors but also allow stopping pipelines, I think I have to use this monstrosity:

      onErrorMarkUnstable("sleep step") {
          // stopping here raises FlowInterruptedException
          sleep 10
      }
      
      onErrorMarkUnstable("sleep shell") {
          node {
              // error here raises AbortException, stopping here raises AbortException
      
              // a script that fails.. we want to be able to log that it failed and continue testing.
              sh "sleep 10 && false"
          }
      }
      
      onErrorMarkUnstable("sleep shell with timeout") {
          // we want to log that it timed out and continue.
          timeout (time: 5, unit: 'SECONDS')  {
              node {
                  // error here raises AbortException, stopping here raises AbortException, timeout raises FlowInterruptedException
                  sh "sleep 10 && false"
              }
          }
      
      }
      
      echo "last line, done!"
      
      def onErrorMarkUnstable(desc, Closure body) {
          echo "${desc}"
          try {
              body.call()
          } catch (org.jenkinsci.plugins.workflow.steps.FlowInterruptedException e) {
              // let the interruption through unless it was caused by a timeout
              def allowInterruption = true
              for (int i = 0; i < e.causes.size(); i++) {
                  // whitelist ExceededTimeout
                  if (e.causes[i] instanceof org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.ExceededTimeout) {
                      echo "${desc}: An error occurred (${e.causes[i]}), marking build as unstable."
                      currentBuild.result = "UNSTABLE"
                      allowInterruption = false
                  }
              }
              if (allowInterruption) { throw e }
          } catch (hudson.AbortException e) {
              def m = e.message =~ /(?i)script returned exit code (\d+)/
              if (m) {
                  def exitcode = m.group(1).toInteger()
                  if (exitcode >= 128) {
                      throw e    // killed by a signal (e.g. 143 on abort), letting it through
                  }
              }
              // an ordinary script failure: log it and keep going
              echo "${desc}: An error occurred (${e}), marking build as unstable."
              currentBuild.result = "UNSTABLE"
          }
      }
      
      

          Activity

            clayg Clay Gerrard added a comment -

            Can we get the JIRA component tags in line with the project name?

            https://wiki.jenkins-ci.org/display/JENKINS/workflow+plugin

            stephansta stephan stachurski added a comment - edited

            I'm facing the same issue and this short summary helped me understand the rules of what's going on.

            org.jenkinsci.plugins.workflow.steps.FlowInterruptedException can be raised by:
               * org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.ExceededTimeout
               * When pipeline is not in a shell step, and a user presses abort
            
            hudson.AbortException can be raised by:
               * A shell step returns with a nonzero exit code
               * When pipeline is in a shell step, and a user presses abort
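
The four cases above can be sketched as a single classification function. This is a minimal sketch, not the plugin's API: `FlowInterrupted`, `Abort` and `ExceededTimeout` are stand-in classes so the code runs outside Jenkins, and `classify` and its labels are hypothetical names. The rules simply mirror the list above plus the exit code 143 observation from the description.

```groovy
// Stand-in types so the sketch runs outside Jenkins. In a real pipeline these
// would be FlowInterruptedException, AbortException and
// TimeoutStepExecution.ExceededTimeout.
class ExceededTimeout {}
class FlowInterrupted extends Exception { List causes = [] }
class Abort extends Exception { Abort(String message) { super(message) } }

// Maps an exception to one of: TIMEOUT, USER_ABORT, STEP_FAILED, OTHER.
String classify(Throwable t) {
    if (t instanceof FlowInterrupted) {
        // a timeout carries an ExceededTimeout cause; anything else is treated as an abort
        return t.causes.any { it instanceof ExceededTimeout } ? 'TIMEOUT' : 'USER_ABORT'
    }
    if (t instanceof Abort) {
        // heuristic from this issue: exit code 143 suggests the shell step was killed
        return (t.message ?: '').contains('script returned exit code 143') ? 'USER_ABORT' : 'STEP_FAILED'
    }
    return 'OTHER'
}

def timedOut = new FlowInterrupted()
timedOut.causes = [new ExceededTimeout()]

println classify(timedOut)
println classify(new FlowInterrupted())
println classify(new Abort('script returned exit code 1'))
println classify(new Abort('script returned exit code 143'))
```

Both user-abort paths end up with the same label, which is the property a pipeline script needs in order to rethrow aborts while handling everything else.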
            

            The problem described in the issue is general: anyone who wants to handle aborts and timeouts correctly will run into it at some point. The code samples in the description, however, are specific to the reporter's use case, for example conditionally marking builds as unstable when a timeout occurs.

            I think the problem in general can be described something like this:

            It's impossible to tell what is really meant by FlowInterruptedException and AbortException without looking inside the object, and that's really unwieldy in a pipeline script. There should be a way to handle it with minimal logic and inspection.

            stephansta stephan stachurski added a comment - edited

            The following snippet attempts to show a way to remove the ambiguity in the exceptions described above, by checking the conditions and rethrowing either a UserInterruptedException or the original exception.

            https://gist.github.com/stephansnyt/3ad161eaa6185849872c3c9fce43ca81

            def doTheThing(Closure doMe) {
                try {
                    return doMe()
                } catch (org.jenkinsci.plugins.workflow.steps.FlowInterruptedException fie) {
                    // this ambiguous condition means a user probably aborted
                    if (fie.causes.size() == 0) {
                        throw new UserInterruptedException(fie)
                    } else {
                        throw fie
                    }
                } catch (hudson.AbortException ae) {
                    // this ambiguous condition means during a shell step, user probably aborted
                    if (ae.getMessage().contains('script returned exit code 143')) {
                        throw new UserInterruptedException(ae)
                    } else {
                        throw ae
                    }
                }
            }
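
The snippet above throws `UserInterruptedException` without showing its definition. One possible shape, assuming it is a small custom class that wraps the original exception so the cause stays available to callers (the linked gist may define it differently, and the message text here is made up for illustration):

```groovy
// Hypothetical definition: a checked exception that keeps the original
// FlowInterruptedException or AbortException as its cause.
class UserInterruptedException extends Exception {
    UserInterruptedException(Throwable cause) {
        super('Build interrupted by user', cause)
    }
}

// Usage outside Jenkins, with a plain RuntimeException standing in for the
// exception a shell step would raise on abort.
def original = new RuntimeException('script returned exit code 143')
def wrapped = new UserInterruptedException(original)
println "${wrapped.message} (caused by: ${wrapped.cause.message})"
```

Keeping the cause means a caller that catches `UserInterruptedException` can still inspect or rethrow the underlying exception if it needs the original abort behaviour.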
            

            People

              Assignee: jglick Jesse Glick
              Reporter: sonneveldsmartward Nick Sonneveld
              Votes: 4
              Watchers: 18
