
Change in treatment of Success - Stable vs. Unstable

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Component/s: core

      We've recently noticed on our Jenkins instance (at build.kde.org) that builds which are unstable are no longer considered "Successful" by Jenkins.

      This means that all of our views are now broken, because we've used "Successful" as meaning it successfully built (even if tests failed). Our expectations appear to align with the Jenkins terminology guide ( https://wiki.jenkins.io/display/JENKINS/Terminology )

      This behaviour appeared sometime after Jenkins 2.184, and can be viewed at https://build.kde.org/job/Applications/view/Everything%20-%20stable-kf5-qt5/job/kopete/job/stable-kf5-qt5%20SUSEQt5.12/

      (Note that only Build #1 is considered Successful, even though every build of that job finished with the result Unstable. The correct behaviour here would be for the latest Successful build of that job to be Build #4, since it completed successfully even though it is unstable.)
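To make the expected semantics concrete, here is a minimal sketch in plain Groovy. It is a toy model, not Jenkins' own code: it assumes the reading of the terminology guide used in this report, where a Successful build is one that is either Stable or Unstable, and it assumes the job has exactly four builds, all Unstable. The names `BuildResult`, `isSuccessful` and `lastMatching` are made up for illustration.

```groovy
// Toy model only: BuildResult, isSuccessful and lastMatching are hypothetical
// names chosen for this sketch, not Jenkins classes or methods.
enum BuildResult { SUCCESS, UNSTABLE, FAILURE, NOT_BUILT, ABORTED }

// "Successful" as used in this report: the build completed, stable or not.
def isSuccessful = { BuildResult r -> r == BuildResult.SUCCESS || r == BuildResult.UNSTABLE }

// Build number -> result. Assumed history: builds #1 to #4, all Unstable.
def builds = [1: BuildResult.UNSTABLE, 2: BuildResult.UNSTABLE,
              3: BuildResult.UNSTABLE, 4: BuildResult.UNSTABLE]

// Highest build number whose result satisfies the predicate, or null if none does.
def lastMatching = { Map<Integer, BuildResult> history, Closure<Boolean> predicate ->
    history.findAll { number, result -> predicate(result) }.keySet().max()
}

def lastSuccessful = lastMatching(builds, isSuccessful)
def lastStable = lastMatching(builds, { it == BuildResult.SUCCESS })

println "last successful: #${lastSuccessful}, last stable: ${lastStable}"
```

Under these assumptions the last Successful build should be #4 and there should be no Stable build at all, which is the behaviour the views relied on.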

          [JENKINS-58692] Change in treatment of Success - Stable vs. Unstable

          Ben Cooksley created issue -

          Ben Cooksley added a comment -

          Over the past week we've started receiving additional complaints that a number of projects were not getting builds triggered. Examination of the Polling logs would show something like the following:

          Started on Aug 7, 2019 8:33:46 AM
          Using strategy: Default
          [poll] Last Built Revision: Revision 597ffa6a5e89b7e05180ccb3517973b3867d72fa (refs/remotes/origin/Applications/19.08)
          No credentials specified
           > git --version # timeout=10
           > git ls-remote -h git://anongit.kde.org/konsole # timeout=10
          Found 54 remote heads on git://anongit.kde.org/konsole
          [poll] Latest remote head revision on refs/heads/Applications/19.04 is: 550cd447bc4bb79cc8920a147e84f7afb35406d6 - already built by 2
          no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
          no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
          no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
          no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
          no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
          Done. Took 0.1 sec
          No changes
          

          Examining the Jenkins Core changelog showed that maintenance of symlinks for jobs/projects had been removed from Jenkins Core and transferred to a plugin. After installing that plugin and running the jobs again, we've found that correct functionality has been restored (both in terms of the branches being polled by Jenkins and the views being updated).

          As such this now appears to be a regression, and given it prevents Git polling from working properly in certain cases (when it's managed as part of a Declarative Pipeline) it actually breaks core functionality of Jenkins.

           

          Ben Cooksley made changes -
          Priority Original: Major [ 3 ] New: Critical [ 2 ]
          Ben Cooksley made changes -
          Component/s New: git-plugin [ 15543 ]

          Mark Waite added a comment -

          bcooksley I don't understand how the move away from symlinks has affected the git polling. Could you provide more details so that I can better understand? We may need help from jglick on differences related to the removal of symlinks.


          Ben Cooksley added a comment -

          markewaite I'm not sure how it managed to make an impact either. However, the behaviour we were seeing before the plugin restored maintenance of the symlinks was the behaviour noted above: changes to the branch name weren't being picked up.

          Interestingly, it picked up that the last successful build was 'Applications/19.08', yet for reasons unknown continued to poll an older branch - 'Applications/19.04'.
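The mismatch can be spotted mechanically in a polling log. The following is a rough sketch in plain Groovy that extracts both branch names from the two relevant lines of the log quoted earlier and compares them. The regular expressions are assumptions derived only from that excerpt and may not match every log variant.

```groovy
// Sketch: pull the two branch names out of a Git polling log and compare them.
// The regular expressions are assumptions based on the log excerpt in this issue.
def pollingLog = '''\
[poll] Last Built Revision: Revision 597ffa6a5e89b7e05180ccb3517973b3867d72fa (refs/remotes/origin/Applications/19.08)
[poll] Latest remote head revision on refs/heads/Applications/19.04 is: 550cd447bc4bb79cc8920a147e84f7afb35406d6 - already built by 2
'''

// Branch recorded for the last built revision (remote-tracking ref).
def builtMatch = pollingLog =~ /Last Built Revision: Revision \p{XDigit}+ \(refs\/remotes\/origin\/(.+)\)/
// Branch whose remote head was actually polled.
def polledMatch = pollingLog =~ /Latest remote head revision on refs\/heads\/(\S+) is:/

def builtBranch = builtMatch.find() ? builtMatch.group(1) : null
def polledBranch = polledMatch.find() ? polledMatch.group(1) : null

if (builtBranch != polledBranch) {
    println "Mismatch: last built '${builtBranch}' but polled '${polledBranch}'"
} else {
    println "Polling the same branch that was last built: ${builtBranch}"
}
```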

          You can find copies of the Pipeline templates, along with the Job DSL scripts we use to provision all the jobs on our Jenkins instance at https://invent.kde.org/sysadmin/ci-tooling/tree/master/ (running of helpers/gather-jobs.py is required prior to trying to evaluate the dsl/*.groovy scripts)

          To provide a bit of background, we reuse the same jobs when the stable branches for our software change, and just update each job to refer to the new branches as needed. This had worked reliably until the release of Jenkins 2.185/2.186 (we jumped straight from 2.184 to 2.186 due to the Trilead SSH issues in 2.185).

          The solution for us was to install the Symlink plugin, after which normal functionality was restored with 2.186+

          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-37862 [ JENKINS-37862 ]

          Jesse Glick added a comment - - edited

          So this is hypothesized to be a regression from JENKINS-37862? I cannot think offhand of any reason why that would be so; workflow-job (the source of the no polling baseline in … message noted above) does not rely on the existence of symlinks to resolve logical permalinks. The change in question did change how permalinks are cached, so as not to read symlinks for this purpose (now a plain text file is used instead), but the build-symlink plugin does not override this new mechanism, so it should not be able to fix any regression from that aspect. The existence of the RunListener in that plugin could perhaps be forcing a cache update that would not otherwise occur, but I do not see how that could be so either, since PeepholePermalink already updates the cache for every standard permalink at the end of every build.
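As a rough illustration of what resolving a permalink from a plain text cache could look like, here is a sketch in plain Groovy. It is not Jenkins source code. The line format (one `<permalink id> <build number>` entry per line), the helper name `parsePermalinks` and the sample content are all assumptions made for this example.

```groovy
// Illustrative sketch only, not Jenkins source code.
// Assumption: each line of the cache file is "<permalink id> <build number>".
def parsePermalinks = { String text ->
    def entries = [:]
    text.readLines().each { line ->
        def idx = line.indexOf(' ')
        if (idx > 0) {
            // Text before the first space is the id, the rest is the build number.
            entries[line.substring(0, idx)] = line.substring(idx + 1).trim().toInteger()
        }
    }
    return entries
}

// Invented sample content for a job whose builds #1 to #4 were all Unstable.
def sample = '''\
lastSuccessfulBuild 4
lastUnstableBuild 4
'''

def permalinks = parsePermalinks(sample)
println "lastSuccessfulBuild resolves to #${permalinks['lastSuccessfulBuild']}"
```

If such a cache went stale (for example, still pointing at build #1), views keyed on the last successful build would show the symptom described in this issue, although nothing here confirms that this is the cause.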

          Is there any known way to reproduce this bug, from scratch, using minimal instructions?


          Ben Cooksley added a comment -

          I'm afraid I've not attempted to reproduce this bug, and experimenting with returning our production systems to a potentially broken state isn't really an option.

          The only thing I could recommend in this case would be using https://invent.kde.org/sysadmin/ci-tooling/blob/master/pipeline-templates/SUSEQt5.12.template as a starting point.

          The only Stage that matters in that job from the perspective of this bug is the Checkout Sources stage, so you can probably delete the rest without too much impact (although it may be worth forcing the build to always be UNSTABLE)

          It is worth noting that we were also experiencing issues with job runs not being considered Successful by Jenkins unless they were also Stable, which impacted views as noted above. As only some jobs were experiencing the issue of not having the correct branches polled, it is possible that these two issues are somehow related - especially given that they both disappeared when the plugin was installed.

          Is it possible that the plugin is causing side effects within Jenkins - so it isn't the symlinks themselves that matter - but rather something else that it causes within Jenkins when performing the symlink update - which resolves our problem here?

          The additional Groovy declarations you'll need to include to use the above template are as follows:

          ```
          def repositoryUrl = "git://anongit.kde.org/konsole"
          def browserUrl = "https://cgit.kde.org/konsole.git"
          def branchToBuild = "master"
          def productName = "Applications"
          def projectName = "konsole"
          def branchGroup = "kf5-qt5"
          def currentPlatform = "SUSEQt5.12"
          def ciEnvironment = "production"
          def buildFailureEmails = "konsole-devel@kde.org"
          def unstableBuildEmails = ""
          ```
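Building on the declarations above, a reduced reproduction job might look roughly like the following Declarative Pipeline. This is a hypothetical sketch, not the KDE template itself: it keeps only a Checkout Sources stage, adds a stage that forces the result to UNSTABLE as suggested, and assumes `agent any` and a 15-minute `pollSCM` schedule, neither of which comes from the original setup.

```groovy
// Hypothetical reduced Jenkinsfile for reproduction attempts; NOT the KDE template.
// Only the two declarations needed by the checkout are repeated here.
def repositoryUrl = "git://anongit.kde.org/konsole"
def branchToBuild = "master"

pipeline {
    // Assumption: any available agent will do for a reproduction attempt.
    agent any

    triggers {
        // Assumption: poll roughly every 15 minutes; the real schedule is not given.
        pollSCM('H/15 * * * *')
    }

    stages {
        stage('Checkout Sources') {
            steps {
                // Git checkout of the configured branch via the generic checkout step.
                checkout([
                    $class: 'GitSCM',
                    branches: [[name: branchToBuild]],
                    userRemoteConfigs: [[url: repositoryUrl]]
                ])
            }
        }
        stage('Force UNSTABLE') {
            steps {
                script {
                    // Mark every run as Unstable so no build is ever Stable.
                    currentBuild.result = 'UNSTABLE'
                }
            }
        }
    }
}
```

Changing `branchToBuild` between runs would then mimic how the jobs are repointed at new stable branches.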


          Jesse Glick added a comment -

          "Is it possible that the plugin is causing side effects within Jenkins - so it isn't the symlinks themselves that matter - but rather something else that it causes within Jenkins when performing the symlink update - which resolves our problem here?"

          That would be my guess, which is why I suspect that the exact sequence of operations matters.


            Assignee: Unassigned
            Reporter: Ben Cooksley (bcooksley)