-
Bug
-
Resolution: Fixed
-
Major
-
None
-
git-plugin 2.3.5
git-client-plugin 1.16.1
jenkins 1.580.3
-
We were running jenkins 1.580.2, git-plugin 2.3.3, git-client-plugin 1.15.0, before upgrading to jenkins 1.580.3, git-plugin 2.3.5, git-client-plugin 1.16.1.
After the upgrade, some of our SCM polling jobs started to behave erratically, as shown by the following polling log:
Started on Feb 23, 2015 1:33:00 PM
Using strategy: Default
[poll] Last Built Revision: Revision a74f1d1204a5c892466b52ac68ee6443c1e459d7 (refs/remotes/origin/linux-3.14.y)
> /usr/bin/git --version # timeout=10
> /usr/bin/git -c core.askpass=true ls-remote -h git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git # timeout=10
[poll] Latest remote head revision on origin/linux-3.14.y is: a74f1d1204a5c892466b52ac68ee6443c1e459d7
Done. Took 0.69 sec
Changes found
So, as the log above shows, "Last Built Revision" and "Latest remote head revision" are the same, and yet the result is "Changes found". This "Changes found" result recurred at every poll (every 5 minutes in our case): a build was triggered, it built that same revision, and then the cycle repeated.
After a closer look by the job owners, it turned out that after the upgrade this job started to use optimized workspace-less polling (using git ls-remote), whereas previously it polled using a workspace.
Looking at https://github.com/jenkinsci/git-plugin/blob/master/src/main/java/hudson/plugins/git/GitSCM.java#L590 , it turns out that while the poll log talks about "Last Built Revision" and "Latest remote head revision", the actual logic for detecting whether a change has happened is different. The "Latest remote head revision" is taken, then some kind of data structure is queried for a build number associated with that revision, and if there is no such build, one is triggered. In our case, apparently it was exactly this lookup that failed, because the revision values themselves were fine.
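The decision path described above can be modelled roughly as follows. This is only a sketch for illustration, not the plugin's actual code: `poll_has_changes` and the `builds_by_sha` mapping are hypothetical names standing in for whatever structure the plugin queries, and the build number 160 is just the build whose polling log is quoted above.

```python
from typing import Dict, Optional


def poll_has_changes(remote_head: str, builds_by_sha: Dict[str, int]) -> bool:
    """Return True when no recorded build is associated with remote_head.

    builds_by_sha is a hypothetical stand-in for the plugin's
    revision-to-build mapping (SHA-1 -> build number).
    """
    build_number: Optional[int] = builds_by_sha.get(remote_head)
    return build_number is None


# SHA-1 taken from the polling log above.
head = "a74f1d1204a5c892466b52ac68ee6443c1e459d7"

# The mapping knows the revision, so polling reports no changes.
print(poll_has_changes(head, {head: 160}))  # False

# The mapping lacks the revision (the suspected failure), so polling
# reports changes even though the revision was already built.
print(poll_has_changes(head, {}))  # True
```

Note that the last built revision never enters the decision in this model, which is how identical revisions in the log can still end in "Changes found".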
Summing up: 2.3.5/1.16.1 appear to have made the workspace-less polling optimization more aggressive, which I don't see noted in the changelog. That's not bad on its own, but apparently that mode has some bugs. In this particular case, the problem can be avoided by adding an extra stop-gap check of whether "Last Built Revision" and "Latest remote head revision" are equal: if they are, then there are certainly no changes, regardless of whether a detailed revision-to-build mapping is present.
I would recommend that the plugin maintainers add such a condition, to make the plugin more robust.
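A minimal sketch of the suggested stop-gap condition, assuming the check runs before any revision-to-build lookup. The function and parameter names are hypothetical and do not correspond to the plugin's API.

```python
from typing import Dict


def has_changes(last_built_sha: str, remote_head_sha: str,
                builds_by_sha: Dict[str, int]) -> bool:
    """Decide whether a poll should trigger a build.

    builds_by_sha is a hypothetical revision-to-build mapping.
    """
    # Stop-gap: if the remote head is exactly the last built revision,
    # there is nothing new, whatever the mapping says.
    if last_built_sha == remote_head_sha:
        return False
    # Otherwise fall back to the lookup: an unknown revision is a change.
    return remote_head_sha not in builds_by_sha


sha = "a74f1d1204a5c892466b52ac68ee6443c1e459d7"
# Even with an empty mapping, equal revisions mean no changes.
print(has_changes(sha, sha, {}))  # False
```

The equality test is cheap and does not depend on any stored build metadata, so it still holds when that metadata is missing or incomplete.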
Thanks!
- 159-build.xml
- 3 kB
- 160-build.xml
- 3 kB
[JENKINS-27093] Spurious gits scm poll change detection
Code changed in jenkins
User: Nicolas De Loof
Path:
src/main/java/hudson/plugins/git/util/BuildData.java
http://jenkins-ci.org/commit/git-plugin/aa31af020633ef2d414e3e30d3c4bc3137d0bc01
Log:
JENKINS-27093 quick check for lastBuilt.revision
Code changed in jenkins
User: Nicolas De Loof
Path:
src/main/java/hudson/plugins/git/util/BuildData.java
http://jenkins-ci.org/commit/git-plugin/cfa9180ad4d9c71fd0b06fb1bae47a9d14489957
Log:
JENKINS-27093 quick check for lastBuilt.revision
Hi, Paul, could you provide two consecutive build.xml files (you can cut any private info) that were triggered and the result of manual run for git command from polling log?
Thanks for the replies and the commits already made. Our situation is that we have a pretty busy production Jenkins system, so only limited experimentation can be done there, and in the limited time I had to look into this issue, I traced it only as far as the source line in the original description above. This is also the first time I have looked into SCM polling, so I don't know how the related data is stored, or other details.
But the job we had issues with is actually public: https://ci.linaro.org/job/trigger-linux-ltsi/ . As you can see (note: the builds in question may expire in a week or two), from build #132 Feb 23, 2015 11:33:10 AM to build #161 Feb 23, 2015 1:58:09 PM, a build was triggered every 5 minutes, which is the SCM polling period for that job. A typical poll log looks like: https://ci.linaro.org/job/trigger-linux-ltsi/160/pollingLog/ (content quoted above in the original description).
Kanstantsin: The repo is public; running the git command from the poll log, I get the expected output (I also thought that maybe there was a stray space or something else mis-parsed, but I confirmed it all looks OK). I will try to look up those build.xml files as a next step.
$ /usr/bin/git -c core.askpass=true ls-remote -h git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
8e63197ffe7750c94c8ea9d159ce3e46a76bfcf2 refs/heads/linux-2.6.11.y
d04a37911968d919fa842ad40fa9e9ff1dd10904 refs/heads/linux-2.6.12.y
816e9c6c226227c4862b2067aace0f450cc92635 refs/heads/linux-2.6.13.y
789f444285aedfb04af7aa3748aa52e99ac4bd8f refs/heads/linux-2.6.14.y
f70602f4f6248735a02c61a1323c9151a33a3775 refs/heads/linux-2.6.15.y
6b0daf99dd2392b024bdca05530e4e761bc3cdae refs/heads/linux-2.6.16.y
78ace17e51d4968ed2355e8f708d233d1cc37f6d refs/heads/linux-2.6.17.y
299a2479bca6211f845158761920ec480f35a229 refs/heads/linux-2.6.18.y
09780ab3b26507776671900e0ed7920f297498ed refs/heads/linux-2.6.19.y
f3815da6b4fd508cc3574399248e2e15cb8a617f refs/heads/linux-2.6.20.y
a31a9035702124423c3aa5aa848937f165753a4f refs/heads/linux-2.6.21.y
37579d1574f6c18f1f648201c6b0850ac94094cd refs/heads/linux-2.6.22.y
6531868a73a6c91bf0e3e60ded7d1440ee24dfa8 refs/heads/linux-2.6.23.y
928bb8c418b5f9e96dbccc8d7eafb6635ae81548 refs/heads/linux-2.6.24.y
00935daeb04cd54a67b66c9e3babc23389251a98 refs/heads/linux-2.6.25.y
63e0e67b17dc233f93f709610971bbfadc97f75e refs/heads/linux-2.6.26.y
bc4e1a77b06519a01e7aed1125695598e27ddeb2 refs/heads/linux-2.6.27.y
5861c853a3f529b9c6a338dd7c4a7afec397ea7a refs/heads/linux-2.6.28.y
12010107aaf417503b7e413d84f2554080aebfe2 refs/heads/linux-2.6.29.y
0d0675cf44c85bd3c0d891845aa02f9249cd7c68 refs/heads/linux-2.6.30.y
a389e98d2c6e1900f035befe215f541436bcb0b2 refs/heads/linux-2.6.31.y
3bd0d1ad14c3566107e391732e4df658b005ad67 refs/heads/linux-2.6.32.y
86a705267a2a502a3d62ef0797e449677b25835f refs/heads/linux-2.6.33.y
5878e067ecac4bd2320e933ec485c01190a5b881 refs/heads/linux-2.6.34.y
675f7660ffb0e1880011f6b3c4f9ac241491e3cd refs/heads/linux-2.6.35.y
69ad303ab8321656d6144d13b2444a5595bb6581 refs/heads/linux-2.6.36.y
e396c9d8699c95d52b2abcc2d4d5f9616e839734 refs/heads/linux-2.6.37.y
4b7a6d2528bfb625cc359d89ac16439b0ec744ea refs/heads/linux-2.6.38.y
ea0dc0dc1c1dca25e50384e300a528db57ee7de5 refs/heads/linux-2.6.39.y
5dba9ddd98cbc7ad319d687887981a0ea0062c75 refs/heads/linux-3.0.y
9bb1282f6a7754955c18be912fbc2b55d133f1b9 refs/heads/linux-3.1.y
5cfc71ce138e79ceb6250f78137dd05ba52e9d34 refs/heads/linux-3.10.y
5ee54f38171b9b3541c5e9cf9c3a9e53455fd8b4 refs/heads/linux-3.11.y
22ccf8f1a5450ac5a6bc2bb519699838017ce1ea refs/heads/linux-3.12.y
2d20120bba8475c963a8d28dd0ffa13637fa3ad7 refs/heads/linux-3.13.y
a74f1d1204a5c892466b52ac68ee6443c1e459d7 refs/heads/linux-3.14.y
f35b5e46feabab668a44df5b33f3558629f94dfc refs/heads/linux-3.15.y
d0335e4feea0d3f7a8af3116c5dc166239da7521 refs/heads/linux-3.16.y
bc15d4627aa8f562a1c5ec9d84076b8db25bab31 refs/heads/linux-3.17.y
a17f9bf1f7cd3412b9920577a7c0ec34cb81b233 refs/heads/linux-3.18.y
bfa76d49576599a4b9f9b7a71f23d73d6dcff735 refs/heads/linux-3.19.y
fd623507bdcee1f7a387ae86adb7a66b431dd056 refs/heads/linux-3.2.y
845720650c557a75262b629b0bc228fffcf64638 refs/heads/linux-3.3.y
28895317f9a7d726cd13fc9b5447cb5dcb5cd22c refs/heads/linux-3.4.y
f2b152564afdf9c9917c17d1c41c1082c82067bd refs/heads/linux-3.5.y
b2824f4e0990716407b0c0e7acee75bb6353febf refs/heads/linux-3.6.y
356d8c6fb2a7cf49e836742738a8b9a47e77cfea refs/heads/linux-3.7.y
dbf932a9b316d5b29b3e220e5a30e7a165ad2992 refs/heads/linux-3.8.y
896f5009ed1fbaec43f360c4ebf022639cd61d5f refs/heads/linux-3.9.y
c517d838eb7d07bbe9507871fab3931deccff539 refs/heads/master
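For anyone who wants to repeat this check mechanically, here is a small sketch that parses `git ls-remote` text output and compares one head with the last built revision. The helper name `parse_ls_remote` is made up for this example, and the sample contains only three of the lines from the output above.

```python
from typing import Dict


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map each ref name to its SHA-1 from `git ls-remote` text output."""
    heads: Dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split()  # tolerates tabs, spaces and stray whitespace
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        sha, ref = parts
        heads[ref] = sha
    return heads


# Three lines copied from the output above.
sample = (
    "2d20120bba8475c963a8d28dd0ffa13637fa3ad7\trefs/heads/linux-3.13.y\n"
    "a74f1d1204a5c892466b52ac68ee6443c1e459d7\trefs/heads/linux-3.14.y\n"
    "c517d838eb7d07bbe9507871fab3931deccff539\trefs/heads/master\n"
)
heads = parse_ls_remote(sample)

# "Last Built Revision" from the polling log in the description.
last_built = "a74f1d1204a5c892466b52ac68ee6443c1e459d7"
print(heads["refs/heads/linux-3.14.y"] == last_built)  # True
```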
This looks like a regression introduced by one of ndeloof's commits for 2.3.5. Will you be able to deploy a patched git plugin with additional logging?
As I mentioned, this is a pretty busy production system, but I checked with its stakeholders, and they're done with release builds for now. So yes, we appreciate your looking into this and can do more testing, such as deploying test builds.
The funniest thing is that right now I have infinite polling with 2.2.12 on a production system. It sees changes constantly, and the only things I have found are a periodically failing slave and Jenkins restarts.
I am having the problem of polling always deciding that there were changes.
I have a master with many branches and I want to just look at one branch.
There were many past problems in this area, so it was confusing looking at the JIRAs from 2014.
Finally I found this JIRA.
The code in question, which decides whether there were changes, had a big change on Feb 3, 2015.
Look here for the diff; see where the buildData.hasBeenBuilt(head) continue;
has been replaced by other code:
https://github.com/jenkinsci/git-plugin/commit/70857d247d1ef3ab57f7ce047b8d4c7fe88aa4f3#diff-f1f2ff967f38c8b53a4901be3169035e
I took the hint from above and reverted my git-plugin from 2.3.5 to 2.3.4 (not all the way back to 2.3.3)
I finally got the git polling log to say "no changes" rather than changes all the time. However I also added a "git checkout $BRANCH"
to my build script. So I have to isolate whether that may have helped. The comment about headless git stuff got me wondering about whether a "git checkout $BRANCH" would help get my workspace into a "good" state. I don't use the workspace polling additional thing though.
Here's evidence that I was able to get it to not detect a change.
Before, I was seeing what others are reporting above: the sha's were the same but it was still detecting changes. I suspect the code change at the above url isn't exactly right.
Started on Mar 16, 2015 3:15:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision f79163c96d12a2277de247c0d8c530db0a9b606c (origin/brandon_jenkins)
> git ls-remote -h https://github.com/h2oai/h2o-dev.git brandon_jenkins # timeout=10
[poll] Latest remote head revision is: f79163c96d12a2277de247c0d8c530db0a9b606c
Done. Took 1.9 sec
No changes
-kevin
I can confirm my case now works, just by rolling back to git-plugin 2.3.4 (from 2.3.5).
That was the only change, and it works. I have 6 similar Jenkins servers, each with 8 similar jobs, so it's a pretty good check. The jobs work separately on 12 branches in the same repo.
If there were a patched plugin with more logging, as Kanstantsin Shautsou mentioned above, I could try it.
-kevin
Hi, sorry, I was busy fixing other bad plugins. Please don't wait for me; I also hope that the author of this regression will resolve it himself.
Kevin, could you reproduce this issue with clean job, that will track 1 branch and provide 2 build.xml files with build/polling logs for them? (Linux kernel is really big)
Hi Kanstantsin,
yes I'll do that.
I have to use another server with 2.3.5 plugin. Those servers have to stay at 2.3.4 now that they seem to work (I actually didn't check now that they see changes correctly, I have both scheduled and polled build on them. but I'll find out in a day or two whether they build on changes)
I'll get the files you mention with another test job.
-kevin
I'm trying on another server, and oddly it seemed to work. I may have to restore the original server to 2.3.5 to get the failing case
I'm using two different branches but the setups are the same.
But look here. there are two different messages in the polling logs from these two servers. On the 'Latest remote head revision.." line..
one says "already built by 2" (build #2)
so something weird made this first case "work". ..but the 2nd case failed when git plugin was at 2.3.5
apparently jenkins is now at 1.604. I've since upgraded both to 1.604 to get jenkins version questions eliminated.
this is one machine that I thought would fail (always detect changes), but apparently didn't.
jenkins was at 1.602
git plugin 2.3.5
git client 1.16.1
Git Polling Log
Started on Mar 16, 2015 5:27:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision 61afd429821a5850a04c9a4870062189a359ca69 (origin/kevin_jenkins)
> git --version # timeout=10
> git -c core.askpass=true ls-remote -h https://github.com/h2oai/h2o-dev.git # timeout=10
[poll] Latest remote head revision on origin/kevin_jenkins is: 61afd429821a5850a04c9a4870062189a359ca69 - already built by 2
Done. Took 0.79 sec
No changes
When I did a commit/push, it correctly detected the change with the poll..
Polling Log
This page captures the polling log that triggered this build.
Started on Mar 16, 2015 5:31:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision 61afd429821a5850a04c9a4870062189a359ca69 (origin/kevin_jenkins)
> git --version # timeout=10
> git -c core.askpass=true ls-remote -h https://github.com/h2oai/h2o-dev.git # timeout=10
[poll] Latest remote head revision on origin/kevin_jenkins is: e1f6277d4489a067b1ccf2b4ab6be57d148e5879
Done. Took 0.74 sec
Changes found
the other machine says no changes also, but it doesn't have that "already built by" message
I am trying with a different branch here, but same repo.
this is the machine that failed with 2.3.5 (too many builds) and now correctly detects "no changes" with 2.3.4
jenkins was at 1.602
git plugin 2.3.4 (downgraded manually from 2.3.5)
git client plugin 1.16.1
Git Polling Log
Started on Mar 16, 2015 5:17:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision f79163c96d12a2277de247c0d8c530db0a9b606c (origin/tomk_jenkins)
> git ls-remote -h https://github.com/h2oai/h2o-dev.git tomk_jenkins # timeout=10
[poll] Latest remote head revision is: f79163c96d12a2277de247c0d8c530db0a9b606c
Done. Took 2.1 sec
No changes
I'm totally confused now. I may not be able to reproduce the problem
I went back and upgraded all my jenkins servers to git plugin 2.3.5
and did a
apt-get install jenkins --reinstall
to get the latest 1.604 jenkins
and now they seem to all be polling correctly and not detecting changes (because the branch didn't change)
They all have that "already built by .." message ..that says what the prior build was, which is good
Git Polling Log
Started on Mar 16, 2015 6:20:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision 1dad640e7ec50cea3965522efa8c797a3ea9b3bc (origin/seb_jenkins)
> git ls-remote -h https://github.com/h2oai/h2o-dev.git # timeout=10
[poll] Latest remote head revision on origin/seb_jenkins is: 1dad640e7ec50cea3965522efa8c797a3ea9b3bc - already built by 44
Done. Took 1.7 sec
No changes
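Polling logs like the ones quoted in this thread can be compared with a short script. This sketch pulls out the two revisions, the optional "already built by" build number and the final verdict, then flags the inconsistent case from the original report. The regular expressions are written against the log lines quoted here only, and the function names are illustrative.

```python
import re
from typing import Dict, Optional

LAST_BUILT = re.compile(
    r"\[poll\] Last Built Revision: Revision ([0-9a-f]{40})")
REMOTE_HEAD = re.compile(
    r"\[poll\] Latest remote head revision(?: on \S+)? is: ([0-9a-f]{40})"
    r"(?: - already built by (\d+))?")


def summarize_poll_log(log: str) -> Dict[str, Optional[str]]:
    """Extract revisions, prior build number and verdict from a polling log."""
    last = LAST_BUILT.search(log)
    head = REMOTE_HEAD.search(log)
    lines = [ln.strip() for ln in log.splitlines() if ln.strip()]
    return {
        "last_built": last.group(1) if last else None,
        "remote_head": head.group(1) if head else None,
        "already_built_by": head.group(2) if head else None,
        "verdict": lines[-1] if lines else None,  # last non-empty line
    }


def is_suspicious(summary: Dict[str, Optional[str]]) -> bool:
    """Same revision on both lines, yet the poll reported changes."""
    return (summary["last_built"] is not None
            and summary["last_built"] == summary["remote_head"]
            and summary["verdict"] == "Changes found")


sample = """Started on Mar 16, 2015 6:20:00 AM
Using strategy: Default
[poll] Last Built Revision: Revision 1dad640e7ec50cea3965522efa8c797a3ea9b3bc (origin/seb_jenkins)
 > git ls-remote -h https://github.com/h2oai/h2o-dev.git # timeout=10
[poll] Latest remote head revision on origin/seb_jenkins is: 1dad640e7ec50cea3965522efa8c797a3ea9b3bc - already built by 44
Done. Took 1.7 sec
No changes
"""
summary = summarize_poll_log(sample)
print(summary["already_built_by"], is_suspicious(summary))  # 44 False
```

The pattern accepts the head line both with and without the "on origin/..." part, since both forms appear in the logs above.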
> I'm totally confused now. I may not be able to reproduce the problem
Well, I personally got the impression that the issues we faced might have been related to the fact that the polling method changed between plugin versions, and the "new" method may have lacked some metadata that the previous method did not generate. Something like that. Anyway, I suggested a short-circuit fix for this issue, and it was applied, so we just need to wait for the 2.3.6 release (we also have busy production systems, so building and testing debug builds is complicated). We're on Jenkins LTS releases, by the way (still on 1.580.3, waiting for a maintenance window to upgrade to 1.596).
Hi thanks.
Well, a couple of days have gone by with multiple Jenkins instances using similar jobs (polling with parameterized branches): 6 Jenkins masters, each with 8 jobs, for a total of 12 branches. It seems to be working as desired, with single branches resolved correctly and no unnecessary building due to changes to other branches.
I've not tried to recreate the initial problem. It was weird, since it affected every Jenkins master I had created, and then it started working right, as I noted above, after messing with the reinstall of Jenkins and the git-plugin. (I think that was what led to it working, although there could have been some racy thing. I was doing combined polling plus scheduled builds. I don't think there was any race effect going on, but if Jenkins was looking at "last build" state somehow, you never know.)
But looking at other open jiras in this area:
I found another open jira in this area that is interesting. I use a parameter ${BRANCH} to specify the branch. While there have been issues in the past with that, I notice in this jira a very odd behavior, where the behavior depends on the branch being different in the last build.
I was wondering if I copied jenkins setups with jobs pointing to master or a different branch, then modified the branch parameter in the job. (well yeah, I know that's how I setup all my different jenkins masters on different machines)
I'm wondering if the issue mentioned in this jira could have affected things..i.e. jenkins deciding that a branch variable wasn't what it was supposed to be.
BRANCH_TO_BUILD is equivalent to my use of BRANCH (${BRANCH}) in the jobs I was having problems with above.
quoting from this link
https://issues.jenkins-ci.org/browse/JENKINS-27349
Git SCM-polling uses wrong parameter values for a parametrized branchspec
to reproduce:
A parametrized build with parameter BRANCH_TO_BUILD, with default value "branch1"
Git SCM with branch specifier "${BRANCH_TO_BUILD}"
SCM polling enabled
Previous manual job launch with BRANCH_TO_BUILD=branch2
Expected:
When polling, BRANCH_TO_BUILD is set to branch1 (default value), so only updates to branch1 trigger a build.
Actual:
When polling, BRANCH_TO_BUILD is set to branch2 (value from last execution).
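The difference between the expected and the reported behaviour in that ticket can be written down as a toy model. This is only an illustration of the description quoted above, with a hypothetical function name and flag; it says nothing about how the plugin resolves parameters internally.

```python
from typing import Dict, Optional


def branch_for_polling(defaults: Dict[str, str],
                       last_build_params: Optional[Dict[str, str]],
                       use_last_build: bool) -> str:
    """Resolve BRANCH_TO_BUILD for a poll.

    use_last_build=False models the expected behaviour (default value),
    use_last_build=True models the reported one (value from last execution).
    """
    if use_last_build and last_build_params:
        return last_build_params.get("BRANCH_TO_BUILD",
                                     defaults["BRANCH_TO_BUILD"])
    return defaults["BRANCH_TO_BUILD"]


defaults = {"BRANCH_TO_BUILD": "branch1"}
manual_run = {"BRANCH_TO_BUILD": "branch2"}  # previous manual job launch

print(branch_for_polling(defaults, manual_run, use_last_build=False))  # branch1
print(branch_for_polling(defaults, manual_run, use_last_build=True))   # branch2
```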
also:
https://issues.jenkins-ci.org/browse/JENKINS-27327
When specifying the branch to poll as part of a parameterized build, git-plugin uses last polled branch instead
another related one, but not what I was seeing:
https://issues.jenkins-ci.org/browse/JENKINS-27332
git-plugin no longer detects changes of branch with /
The basic question of using a parameterized branch seems to have been fixed in Mar...I had been wondering if that was affecting me (since I'm using a branch parameter) ..but this says it's fixed
https://issues.jenkins-ci.org/browse/JENKINS-27166
CLONE - Git SCM-polling doesn't work when using a parametrized branch-name
Kevin Normoyle: Thanks for this survey and cross-referencing other issues. Just for information, our job, for which this issue was originally reported (https://ci.linaro.org/job/trigger-linux-ltsi), doesn't have branch parametrized, it is set statically in the job config.
I have a similar problem which might be related. Jenkins was working fine for me until I upgraded the git plugin on April 1st, 2015. Since that date, Jenkins performs the polling as expected, but it never finds changes although there are some.
When I downgraded the git plugin from 2.3.5 to 2.3, everything started working again.
Hi Mohamed.
As I noted above, the problem I had disappeared as I was messing with things.
what sort of system do you have jenkins running on?
I reinstalled jenkins on my ubuntu system with apt-get install jenkins --reinstall
that may have had a correlation
I can show some things from my current jobs.
Git client plugin version 1.16.1
Git plugin version 2.3.5
Github API plugin 1.63
Github plugin 1.11
Jenkins 1.607
I have a String Parameter on the build called BRANCH
Under Advanced in Source Code Management, the Refspec is
+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}
Branches to Build (Branch Specifier) is
${BRANCH}
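To show what those two settings resolve to for one branch, here is a sketch using Python's `string.Template`, which happens to accept the same `${NAME}` placeholder syntax. Jenkins does its own parameter expansion; this only mimics the result, and the `expand` helper is a made-up name.

```python
from string import Template

# Values as configured in the job described above.
REFSPEC = "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
BRANCH_SPEC = "${BRANCH}"


def expand(spec: str, params: dict) -> str:
    """Substitute ${NAME} placeholders; raises KeyError if one is missing."""
    return Template(spec).substitute(params)


params = {"BRANCH": "cliffc_jenkins"}
print(expand(REFSPEC, params))
# +refs/heads/cliffc_jenkins:refs/remotes/origin/cliffc_jenkins
print(expand(BRANCH_SPEC, params))
# cliffc_jenkins
```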
I'm using Poll SCM
it currently is working, with Git Polling Log giving results like this when no change is needed
Started on Apr 6, 2015 3:55:00 PM
Using strategy: Default
[poll] Last Built Revision: Revision 4a8e9987b4d2ca94a966abe88a590809e3109e4c (origin/cliffc_jenkins)
> git ls-remote -h https://github.com/h2oai/h2o-dev.git # timeout=10
[poll] Latest remote head revision on origin/cliffc_jenkins is: 4a8e9987b4d2ca94a966abe88a590809e3109e4c - already built by 1032
Done. Took 3.5 sec
No changes
Hi Kevin,
I'm using Ubuntu 14.04. And originally installed Jenkins using the apt-get method as explained here
My Jenkins server version is 1.607. And my git repository is hosted on BitBucket.
My issue seems to be different from this one. I'll file a separate JIRA.
The git plugin 2.3.6 pre-release and git client plugin 1.18.0 pre-release are being tested in hopes of releasing new versions before the end of June. If you're willing to assist with the testing, please download and install a pre-release build of the git client plugin and the git plugin. Problems detected in the pre-release should be e-mailed to MarkEWaite and ndeloof.
I wrote some test ideas if you would like suggestions of areas that need testing. The git plugin supports many different use cases and its automated tests only evaluate a very few of those use cases.
Such a check can certainly be introduced, but it would be interesting to understand why the last build based on the commit sha1 isn't identified.