  1. Jenkins
  2. JENKINS-53140

Intermittent Random Stage Failures - No Changes Detected - Parallel Stages

      Description

      On a new Jenkins server, we're setting up new Jenkinsfiles from scratch.  We're using parallel stages, which seems like it might be a factor based on searches for related issues. 

      We have about 15 of these jobs with 4 parallel stages.  For about 1 in 10 jobs, a stage will randomly fail in 1 second or less.  If you hover over the stage it shows: 

      Failed to parse /var/lib/jenkins/jobs/MyOrg/jobs/Projects/jobs/myProject/builds/252/changelog3.xml

      If you click the Stage Log it says: 

      No changes for <my_scm_url> since the previous build

      This can happen when the build is triggered by either of the following means:

      • From an upstream job
      • Manually by clicking "build now"

       

      Of note, we use SVN as our SCM with the following configuration:

      We use "Jenkinsfile from SCM", using URL like

      http://myserver/myrepo/_automation

      In the GUI, we do not configure SCM polling 

      In the Jenkinsfile, we configure SCM polling, every 2 minutes

      In the Jenkinsfile, we configure SCM checkout of 

      http://myserver/myrepo

       

      I include our SVN/SCM backstory because we're also seeing another SCM-related anomaly which might be relevant.

      For each SCM polling period, the Subversion Polling Log shows multiple entries, all with the same timestamp. These entries are a seemingly random mix of polls of the two relevant URLs:

      http://myserver/myrepo

      and 

      http://myserver/myrepo/_automation

      Interestingly, we've figured something out about these duplicate "Subversion Polling Entries". There is 1 entry checking

      http://myserver/myrepo/_automation

      for changes for each stage in the build. So, if we have 1 "Checkout Stage", 1 "Parallel Stages" block, and 3 "Parallel Stages", we see 5 total "Subversion Polling Entries" for

      http://myserver/myrepo/_automation

      Not sure if this is by design or not.

       

       

       

       

       

      We have the latest versions of all plugins. 

            Activity

            solvingj jerry wiltse added a comment - - edited

            Very relevant update. For one of the jobs, I took the stages out of the `parallel` block, and then scheduled it to run every 24 hours. We have seen no failures on that job in about 20 runs. Again, previously, it was failing about 1 out of every 10 runs.

            abayer Andrew Bayer added a comment -

            Any chance you can attach the Jenkinsfile that was producing the problem?

            solvingj jerry wiltse added a comment -

            Sure thing, here it is:

            pipeline {
                agent none
                environment {
                    SVN_CREDENTIALS = credentials('credentialsid')
                }
                triggers {
                    pollSCM('H/2 * * * *')
                    upstream(
                        upstreamProjects: 'several/upstream/projects',
                        threshold: hudson.model.Result.SUCCESS
                    ) 
                }
                stages {
                    stage('Checkout') {
                        agent {
                            label 'docker && windows'
                        }
                        steps {
                            checkout([
                                $class: 'SubversionSCM',
                                locations: [[
                                    credentialsId: '{credentialsid}',
                                    local: '.',
                                    remote: '{my_project_url}'
                                ]],
                                workspaceUpdater: [$class: 'UpdateUpdater']
                            ])
                        }
                    }
                    stage('Build with Conan') {
                        parallel {
                            stage('build_mips'){
                                agent {
                                    label 'docker && windows'
                                }
                                steps {
                                    bat 'python build.py'
                                }
                            }
                            stage('build_arm'){
                                agent {
                                    label 'docker && windows'
                                }
                                steps {
                                    bat 'python build.py'
                                }
                            }
                            stage('build_msvc_15'){
                                agent {
                                    label 'docker && windows'
                                }
                                steps {
                                    bat 'python build.py'
                                }
                            }
                            stage('build_gcc_49'){
                                agent {
                                    label 'docker && windows'
                                }
                                steps {
                                    bat 'python build.py'
                                }
                            }
                        }
                    }
                }
            }
            
            solvingj jerry wiltse added a comment -

            Here's the log from one poll cycle. It shows that Jenkins polls one of the SCM URLs once for each stage that exists, which is curious. It polls the other SCM URL only once, which seems correct.

            This "bug" would make more sense to me if it were doing multiple polls on the URL defined in the Jenkinsfile, but it's not.

            Multiple polls:  http://myrepo/mypath/_automation  (defined in the Jenkins UI, in the "Jenkinsfile from SCM" field)

            Single poll:    http://myrepo/mypath/   (defined in the Jenkinsfile, in the "checkout/remote" field)

            Started on Aug 24, 2018 5:27:00 PM
            Received SCM poll call on master for myproject on Aug 24, 2018 5:27:00 PM
            Using sole credentials jenkins/****** in realm ‘<http://myrepo:80> svn’
            http://myrepo/mypath/_automation is at revision 52,301
            Received SCM poll call on master for myproject on Aug 24, 2018 5:27:00 PM
            Using sole credentials jenkins/****** in realm ‘<http://myrepo:80> svn’
            http://myrepo/mypath/_automation is at revision 52,301
            Received SCM poll call on master for myproject on Aug 24, 2018 5:27:00 PM
            Using sole credentials jenkins/****** in realm ‘<http://myrepo:80> svn’
            http://myrepo/mypath is at revision 52,313
            Received SCM poll call on master for myproject on Aug 24, 2018 5:27:00 PM
            Using sole credentials jenkins/****** in realm ‘<http://myrepo:80> svn’
            http://myrepo/mypath/_automation is at revision 52,301
            Received SCM poll call on master for myproject on Aug 24, 2018 5:27:00 PM
            Using sole credentials jenkins/****** in realm ‘<http://myrepo:80> svn’
            http://myrepo/mypath/_automation is at revision 52,301
            Done. Took 98 ms
            No changes
            
            abayer Andrew Bayer added a comment -

            On a call right now so can't dig too deep, but this smells like various issues with multiple checkouts causing multiple Git repo actions etc - obviously not quite the same since it's SVN, but similar. There's an automatic checkout whenever you go to a new agent if you don't have skipDefaultCheckout(true) in your options, and each of those checkouts is triggering a polling of its own. I can't remember the specific issues this would relate to, but will check later.
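
            For reference, a minimal sketch of where that option would go in a Declarative Pipeline. The stage name, agent label, and placeholder checkout parameters are copied from the Jenkinsfile posted earlier in this thread; the rest of that pipeline is omitted, so treat this as an illustration rather than a drop-in replacement.

```groovy
// Sketch only: structure borrowed from the Jenkinsfile earlier in this thread.
pipeline {
    agent none
    options {
        // Suppress the implicit "checkout scm" that Declarative Pipeline
        // otherwise performs each time a stage allocates an agent.
        skipDefaultCheckout(true)
    }
    stages {
        stage('Checkout') {
            agent { label 'docker && windows' }
            steps {
                // With the default checkout disabled, this explicit checkout
                // is the only one the build performs.
                checkout([
                    $class: 'SubversionSCM',
                    locations: [[
                        credentialsId: '{credentialsid}',
                        local: '.',
                        remote: '{my_project_url}'
                    ]],
                    workspaceUpdater: [$class: 'UpdateUpdater']
                ])
            }
        }
    }
}
```

            The same option can also be placed in an `options` block inside a single stage if only that stage should skip the default checkout.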

            solvingj jerry wiltse added a comment -

            Per the suggestion, at the top level of the Jenkinsfile we added the following:

            skipDefaultCheckout(true)

            Unfortunately, it caused a surprising issue that we don't understand.  You can see the following lines in our steps:

            python build.py

            build.py runs a docker container, passing the SVN credentials via -e.  The docker container runs another python script which does an SVN checkout, all of which works as desired.

            After adding skipDefaultCheckout(true), the SVN checkout inside the docker containers failed.  The most likely candidate seems to be that it somehow broke the passing of the SVN_CREDENTIALS environment variables to the container. 

             

            Rather than troubleshoot that yesterday, I tried putting skipDefaultCheckout(true) in the checkout step. Unfortunately, that did not fix the issue.

             

            If you have ideas about why skipDefaultCheckout(true) might have broken the environment variable, maybe we can work around it. 

            abayer Andrew Bayer added a comment -

            Hrm, can't see why skipDefaultCheckout(true) would mess with the environment at all, other than that any environment variables contributed by the SCM plugin's checkout process (in this case, SVN_REVISION and SVN_URL) wouldn't be automatically added to the environment as they would be with the normal default checkout happening.

            solvingj jerry wiltse added a comment -

            I managed to get it set, restored the parallel builds, and they failed again immediately.

            solvingj jerry wiltse added a comment -

            Is there any additional logging I can turn on to try to identify the root cause of this issue?  Also, if we have 3 or 4 commits in quick succession, 1 job for each commit kicks off at basically the same time. They all share a single source directory, so I wonder if there's some contention/conflict between those build threads.  
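
            If contention between overlapping builds is a suspect, one way to test that theory would be to stop builds of the same job from running concurrently. A minimal sketch, assuming the agent label and build command from the Jenkinsfile above; the single stage shown is only a placeholder for the real stages.

```groovy
// Sketch only: the 'Build' stage stands in for the job's real stages.
pipeline {
    agent none
    options {
        // Queue a new build of this job until the running one finishes,
        // so two builds never touch the shared source directory at once.
        disableConcurrentBuilds()
    }
    stages {
        stage('Build') {
            agent { label 'docker && windows' }
            steps {
                bat 'python build.py'
            }
        }
    }
}
```

            If the failures persist with this option set, contention between separate builds of the same job is a less likely cause.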

            Also, I want to make sure you noticed my earlier point about duplicate change messages, so I attached a screenshot.  For 1 commit, 5 identical changes are listed (the number of times it appears is equal to the number of stages).  Also, we currently have parallel builds disabled and it's still showing these duplicates. 

            solvingj jerry wiltse added a comment -

            Andrew Bayer : regarding "skip default checkout" and the related "checkout to subdirectory".  

            How do these options relate to "Pipeline script from SCM"?

            I have currently removed the checkout stage/step in all our Jenkinsfiles, so the only SVN "checkout" and polling are the ones relating to "Pipeline script from SCM". 

            Do these two options have any effect on the "Pipeline script from SCM" behavior?

            solvingj jerry wiltse added a comment -

            Is there any other report of a similar issue? Are we the only ones having this problem with parallel jobs? Maybe few people still use SVN Plugin? 

            solvingj jerry wiltse added a comment -

            Also of note, I've discovered that we'll never really be able to use

            skipDefaultCheckout()

            as any part of a workaround, because then we don't get the

            SVN_REVISION_1
            and
            SVN_REPOSITORY_URL

            environment variables, which we require.
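
            One possible way around this, sketched under assumptions: the `checkout` step can return a map of the variables the SCM plugin contributes, and those can be copied into `env` by hand. Whether the map is populated, and under which key names, depends on the installed Subversion plugin version; the keys SVN_REVISION and SVN_URL below are the names mentioned earlier in this thread and may need adjusting.

```groovy
// Sketch only: key names in scmVars are assumptions, check them on your instance.
pipeline {
    agent { label 'docker && windows' }
    options {
        skipDefaultCheckout(true)
    }
    stages {
        stage('Checkout') {
            steps {
                script {
                    // checkout returns the variables the SCM plugin would
                    // otherwise have injected during the default checkout.
                    def scmVars = checkout([
                        $class: 'SubversionSCM',
                        locations: [[
                            credentialsId: '{credentialsid}',
                            local: '.',
                            remote: '{my_project_url}'
                        ]],
                        workspaceUpdater: [$class: 'UpdateUpdater']
                    ])
                    // Re-export only the values that are actually present.
                    if (scmVars?.SVN_REVISION) {
                        env.SVN_REVISION = scmVars.SVN_REVISION
                    }
                    if (scmVars?.SVN_URL) {
                        env.SVN_URL = scmVars.SVN_URL
                    }
                }
            }
        }
        stage('Build') {
            steps {
                // Values assigned to env above are visible to later steps.
                bat 'python build.py'
            }
        }
    }
}
```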

            solvingj jerry wiltse added a comment -

            I have some new information that might be relevant to isolating the problem. I moved the pipeline out of the Jenkinsfile and into the GUI. The duplicates in the changelog have gone away. 

            Thus, the recipe for at least one problem is simple: 

            • Pipeline from SCM
            • SVN being the SCM used for this
            • Multiple Stages?

            Unfortunately, we can't currently find a workaround here without more help.  It seems that when not using "pipeline script from SCM", the stages all run in different subdirectories rather than the directory the "checkout" stage ran in.  For some reason, the first stage after "checkout" runs in the same directory, but the others choose different directories.  Our existing scripts all count on the files being there from a checkout. 
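
            One common way to make checked-out files available to stages that run in other workspaces is `stash`/`unstash`. A minimal sketch, assuming the stage names and agent label from the Jenkinsfile above; the stash name 'sources' is made up for illustration, and only two of the four parallel stages are shown.

```groovy
// Sketch only: 'sources' is a hypothetical stash name.
pipeline {
    agent none
    stages {
        stage('Checkout') {
            agent { label 'docker && windows' }
            steps {
                checkout([
                    $class: 'SubversionSCM',
                    locations: [[
                        credentialsId: '{credentialsid}',
                        local: '.',
                        remote: '{my_project_url}'
                    ]],
                    workspaceUpdater: [$class: 'UpdateUpdater']
                ])
                // Save the checked-out tree so later stages can restore it.
                stash name: 'sources', includes: '**'
            }
        }
        stage('Build with Conan') {
            parallel {
                stage('build_mips') {
                    agent { label 'docker && windows' }
                    steps {
                        // Restore the files into this stage's own workspace.
                        unstash 'sources'
                        bat 'python build.py'
                    }
                }
                stage('build_arm') {
                    agent { label 'docker && windows' }
                    steps {
                        unstash 'sources'
                        bat 'python build.py'
                    }
                }
            }
        }
    }
}
```

            Note that `stash` applies Ant's default excludes unless `useDefaultExcludes: false` is passed, so `.svn` metadata directories are left out by default; scripts that need a working copy rather than plain files would have to account for that.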

             

            Also, obviously, we want to use "pipeline script from SCM", and it's really crazy if we can't. 

            Thanks again for any feedback. 

             

            abayer Andrew Bayer added a comment -

            Not sure if this is a complete duplicate of JENKINS-34313, but it's definitely similar.

            solvingj jerry wiltse added a comment -

            Yes, we can close this because this issue was resolved in that ticket.


              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              solvingj jerry wiltse
              Votes:
              1
              Watchers:
              3
