[JENKINS-55075] Pipeline: If job fails it will run again on next poll.

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: p4-plugin
    • Environment: P4-Plugin 1.9.3, Jenkins 2.138.1

      When a job triggered by polling fails, it runs again the next time polling runs, even when there are no new changes since the last poll. For example:

      (1) Build works.
      (2) No new changes = Next poll no execution.
      (3) New change made = Next poll execution.
      (4) Build fails = Next poll execution.
      (5) Build fails = Next poll execution.
      (6) Build works = Next poll no execution.
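The sequence above implies that the expected behaviour is for polling to compare changes only, not build results. A minimal Groovy sketch of that expectation follows. `PollModel`, `poll` and `recordBuild` are hypothetical names used for illustration; this is not the p4-plugin's code, and the changelist numbers are made up.

```groovy
// Hypothetical model of the expected behaviour; not the p4-plugin's actual code.
class PollModel {
    int lastBuiltChange = 0

    // Trigger only when the server has a change newer than the last one built.
    boolean poll(int latestChange) {
        latestChange > lastBuiltChange
    }

    // Record the change a build ran against; the result is deliberately ignored.
    void recordBuild(int change, String result) {
        lastBuiltChange = change
    }
}

def model = new PollModel()
model.recordBuild(100, 'SUCCESS')   // (1) build works
assert !model.poll(100)             // (2) no new changes: no execution
assert model.poll(101)              // (3) new change: execution
model.recordBuild(101, 'FAILURE')   // (4) build fails
assert !model.poll(101)             // expected: no re-run until a new change arrives
```

In the behaviour reported here, steps (4) and (5) instead trigger a new execution even though no newer change exists.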

       

      Reproduction steps:

      (1) Create a pipeline job with the following Jenkinsfile and set it up to poll every minute (or use the PollNow plugin):

      pipeline {
        agent { label 'master' }

        stages {
          stage("Run script") {
            steps {
              script {
                // Ask whether this run should succeed; unticking the box forces a failure.
                def failJob = input(message: 'Do you wish this job to work?',
                                    ok: 'OK',
                                    parameters: [booleanParam(defaultValue: true,
                                                 description: "If you want this to fail untick 'Yes?' and click OK",
                                                 name: 'Yes?')])
                echo "Input Result:" + failJob
                if (failJob == false) {
                  // Deliberately invalid command so that the build fails.
                  sh 'Arghhhhh'
                }
                echo "P4_CLIENT is:"
                echo env.P4_CLIENT
              }
            }
          }
        }
      }

      (2) Submit a changelist to the polled view.

      (3) Wait for the job to run, open the console for the job execution and click 'Input requested'. Untick 'Yes?', then click 'OK'. The job will be marked as 'FAILURE'.

      (4) Wait for next poll or click 'Poll Now'. Job will trigger again.

      (5) On 'Input requested', just click 'OK'. Job will be marked as 'SUCCESS'.

      (6) Wait for next poll or click 'Poll Now'. Job will NOT trigger again.
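For step (1), polling every minute can also be declared in the Jenkinsfile itself rather than in the job configuration. The fragment below is a sketch that assumes a declarative pipeline whose SCM is already configured on the job; the stage body is a placeholder.

```groovy
pipeline {
  agent { label 'master' }

  // Poll the configured SCM once a minute (cron syntax: every minute of every hour).
  triggers {
    pollSCM('* * * * *')
  }

  stages {
    stage("Run script") {
      steps {
        echo 'Triggered by polling'
      }
    }
  }
}
```

Jenkins registers the trigger only after the job has run once with this Jenkinsfile, so a first manual build is needed before polling starts.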

       

       

          Mykola Ulianytskyi added a comment - - edited

          After internal discussion we know of at least a few customers that would want the build to retry on failure.

          An ideal solution is therefore if we could have a tickbox against a job to switch between rebuild on OS/Jenkins fail and ignore fail.

          An SCM plugin should trigger a build only once when new changes are found,
          and never retry it on failure, because otherwise build loops occur.

          None of the existing SCM plugins (Git, CVS, SVN, etc.) retry builds on failure.
           
          Users can use built-in Jenkins features for:
           
          1) Steps Retry:

          stage('Deploy') {
              steps {
                  retry(3) {
                     sh './deploy.sh'
                  }
              }
          }
          

          https://jenkins.io/doc/pipeline/tour/running-multiple-steps/#timeouts-retries-and-more
           

          2) Entire Pipeline Retry:

          pipeline {
              options { 
                  retry(3) 
              }
          ...

          https://jenkins.io/doc/book/pipeline/syntax/#options

          Brad Wehmeier added a comment - - edited

          Customers who want to retry failed builds should use a Jenkins plugin designed for that, e.g. https://wiki.jenkins.io/display/JENKINS/Naginator+Plugin

          If you still decide a checkbox is necessary, please set its default to NOT trigger another build on failure, since that is the behavior of other SCM plugins for Jenkins.

          Karl Wirth added a comment -

          Hi bradleywehmeier and lystor. Thank you very much. That is great feedback.
          FYI - cbopardikar

          Alisdair Robertson added a comment - - edited

          I've got an issue with the polling behaviour that appears to have been changed in this ticket.

          It seems that now in cases where polling shows no changes since the previous build we report no changes (good) but we also report no changes when there was a polling error, and don't take any action to correct or notify about the polling error (bad). 

          This has caused a few branches of mine that are set to poll nightly to not be built for a few days before we noticed, with polling logs that look like this for each workspace used in the build (we use multiple p4sync steps in parallel stages):

           
          P4: Polling on: master with:<workspace name>
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.

          From looking at the code, it looks to me as though this is caused by no changes being attached to the previous build for some unknown reason, so https://github.com/jenkinsci/p4-plugin/blob/master/src/main/java/org/jenkinsci/plugins/p4/tagging/TagAction.java#L300 would return an empty ArrayList. However, I haven't actually done any debugging.

          While I can understand that builds being constantly triggered just because of a polling error is not desirable, I find it even more undesirable to almost silently skip triggering a build indefinitely without an administrator being made aware that there is a polling issue rather than just no changes being made in the workspace.

          Could there be limited (re)triggering on polling failures, or a notification system, so that an administrator can swoop in to diagnose issues or manually trigger the job anew as necessary in the event of polling failures?

          Note that I explicitly do not want to retry failed builds, because as far as I can tell from logs, although the build prior to the polling error was a failure in compilation, the perforce syncs in the job all completed successfully.
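Until the plugin reports such errors itself, one way to notice them is to scan the polling log text for the error line. The Groovy sketch below assumes the log format shown in the sample above; `workspacesWithPollingErrors` is a hypothetical helper, not part of Jenkins or the p4-plugin, and how the log text is obtained is left out.

```groovy
// Hypothetical helper; relies only on the log line formats quoted in this ticket.
List<String> workspacesWithPollingErrors(String pollingLog) {
    def failed = []
    String current = null
    pollingLog.eachLine { line ->
        // "P4: Polling on: <node> with:<workspace>" starts a new workspace section.
        def m = (line =~ /P4: Polling on: \S+ with:(.+)/)
        if (m.find()) {
            current = m.group(1).trim()
        } else if (line.contains('P4: Polling error') && current != null) {
            failed << current
        }
    }
    failed.unique()
}

def sample = '''P4: Polling on: master with:workspace-1-name
P4: Polling: No changes in previous build.
P4: Polling error; no previous change.'''

assert workspacesWithPollingErrors(sample) == ['workspace-1-name']
```

A non-empty result could then drive whatever notification an administrator prefers, which keeps "no changes" distinct from "polling failed".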

          Karl Wirth added a comment -

          Hi alisdair_robertson, can you please provide an example of a polling error you are seeing (for example, the bad polling log) so we can try it out here?

          Thanks in advance,

          Karl

          Alisdair Robertson added a comment -

          Hey p4karl, the only content in the branch job's polling log for the last poll is as follows (workspace names changed, but they include the node name, job name and stage name):

          Started on 01/05/2019 8:16:00 PM
          P4: Polling on: master with:workspace-1-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          P4: Polling on: master with:workspace-2-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          P4: Polling on: master with:workspace-3-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          P4: Polling on: master with:workspace-4-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          P4: Polling on: master with:workspace-5-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          P4: Polling on: master with:workspace-6-name
          P4: Polling: No changes in previous build.
          P4: Polling error; no previous change.
          Done. Took 1.8 sec
          No changes

          I see normal polling output for the other branches of the multibranch pipeline where the most recent build was not a failure.

          This multibranch pipeline never automatically runs the 'scan multibranch pipeline' job; we only run that manually when we have new branches that need to be added. The Jenkinsfile configures explicit SCM polling once each evening.

          Dave Miller added a comment -

          I just upgraded to 1.10.0 to address a different issue, and I'm trying to understand why commit 283834eabea30f31cde59b1ab3b743a01b2f47cb, which claims to be associated with this ticket, is now throwing un-actionable "Severe warnings" when it finds duplicate SyncIDs. Why do duplicate syncIDs justify a severe log entry, what do they have to do with this ticket, and how should they be addressed? These "Severe warnings" are dramatically clogging our server logs. p4paul?

          Karl Wirth added a comment -

          Hi feelingmimsy - We have had a lot of polling problems recently. Some of them came down to problems in P4-Jenkins but many came down to customers using the same workspace with different views in the same job. For each sync we store a sync ID keyed on the workspace name and if there are two we could easily be syncing and building the wrong changelists or missing changelists. Therefore we decided to highlight this with the message you see.
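To see why a store keyed on workspace name is ambiguous when one job reuses a name with different views, consider the small Groovy model below. It is a hypothetical illustration only: `duplicateWorkspaces`, the workspace names and the depot paths are invented, and the plugin's real bookkeeping is not shown in this ticket.

```groovy
// Hypothetical model; not the p4-plugin's implementation.
List<String> duplicateWorkspaces(List<Map> syncs) {
    // Group the sync steps by workspace name, as a store keyed on that name would.
    def byName = syncs.groupBy { it.workspace }
    // Any name used by more than one sync step is ambiguous.
    def clashes = byName.findAll { name, entries -> entries.size() > 1 }
    clashes.keySet().toList()
}

def syncs = [
    [workspace: 'ws-shared', view: '//depot/projA/...'],
    [workspace: 'ws-shared', view: '//depot/projB/...'],
    [workspace: 'ws-docs',   view: '//depot/docs/...'],
]

assert duplicateWorkspaces(syncs) == ['ws-shared']
```

In this model the usual remedy would be to give each sync step in the job its own workspace name, so that every key maps to exactly one view.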

          I'd like to investigate this further but will be asking for some potentially confidential information. Would you be willing to send an email to 'support@perforce.com' for my attention so that I can get this information from you?

          Karl Wirth added a comment -

          I have created the following issue to document what 'duplicate syncID found' means and to suggest a messaging improvement:

          JENKINS-58067

          Charusheela Bopardikar added a comment -

          Released in 1.9.7

            cbopardikar Charusheela Bopardikar
            p4karl Karl Wirth
            Votes: 4
            Watchers: 12