
Multibranch pipeline scan is triggered in case of remote pull request merge

    • 935.0.0

      There is such logic in the bitbucket-branch-source plugin:

       
      // Push event with an empty change list: re-index the whole repository
      if (push.getChanges().isEmpty()) {
          LOGGER.log(Level.INFO, "Received hook from Bitbucket. Processing push event on {0}/{1}",
                  new Object[]{owner, repository});
          scmSourceReIndex(owner, repository);
      }
       
      This condition is also met in case of remote pull request merges. (https://community.atlassian.com/t5/Questions/Pull-requests-Squashed-commits-Remote-Merges/qaq-p/171569)
       
      This leads to reindexing of a multibranch pipeline project which can take a lot of time.

       

      In my case I have a lot of pull requests (around 150), and a build for one of them takes around an hour. These pull requests mostly target the same branch. Because of this, we rescan our Bitbucket multibranch pipeline only twice a week: a rescan rebuilds all of these PRs whenever the target branch has new changes, which takes a lot of time. If a pull request is merged remotely, it triggers this reindexing unexpectedly. This blocks me if it happens in the middle of the day.
       

          [JENKINS-55927] Multibranch pipeline scan is triggered in case of remote pull request merge

          Allan BURDAJEWICZ added a comment - - edited

          I was able to reproduce the problem in Bitbucket Server with the Native webhook as well as with the Bitbucket add-on. I found 2 scenarios where the empty changes may happen:

          Rebase a target branch to a source branch while a PR is open

          git checkout main
          git checkout -B source-1
          echo "test" >> source-1.md
          git add . && git commit -m "source-1: $(date)" && git push --set-upstream origin source-1
          # Open a PR from the source branch `source-1` to the target branch `main`
          git checkout main
          git rebase -i source-1
          git push -f
          

          This will generate 2 events: the first one with empty changes and the next one with the right changes made to the target branch. (The PR is automatically merged because there is no diff, and maybe that is why.)

          Auto-Sync of Fork when a branch cannot be synchronized

          When the Multibranch Pipeline is a fork and Automatic Fork Syncing is enabled, if a branch on the fork cannot be synchronized, Bitbucket's attempt to synchronize it generates a repo:refs_changed with empty changes. The auto sync seems to kick off a couple of seconds after a change is made to the respective remote branch.

          Both the add-on and the native webhook behave the same. Per my understanding, the add-on relies on the native mechanism but adds some extra configuration.

          But so far, I don't see a valid scenario where a repo:refs_changed with empty changes should be considered, or why it would trigger branch indexing.

          I cannot reproduce such an event in Bitbucket Cloud.


          Allan BURDAJEWICZ added a comment -

          I also asked Atlassian about this: https://community.atlassian.com/t5/Bitbucket-questions/Webhook-repo-refs-changed-events-with-empty-changes-array/qaq-p/2868488#M109342

          Nikolas Falco added a comment -

          For the record, these comments cover the same arguments as this issue:
          https://issues.jenkins.io/browse/JENKINS-37491?focusedId=332268&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-332268
          and this is the workaround (as I explained before):
          https://issues.jenkins.io/browse/JENKINS-37491?focusedId=332269&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-332269

          You could certainly handle specific webhook request types better to avoid reindexing, by working out which ones should trigger a re-index and which should not.

          The described behaviour is not a bug.


          Allan BURDAJEWICZ added a comment - - edited

          I do not think this should be closed. The problem described is that Branch Indexing is triggered in specific scenarios where it shouldn't be. Branch Indexing being triggered for no obvious reason is a significant problem. It is also what the issue describes:

          This leads to reindexing of a multibranch pipeline project which can take a lot of time.
          

          The build storm is just one consequence of that problem, and the proposed solution / workaround only helps deal with the build storm.


          rsandell added a comment -

          To me it sounds like the correct strategy: when an event arrives for a repo with no further information, a full branch index should be made. We got an event saying that something happened but not what happened, so we need to check everything; otherwise we would get in trouble on the other end from other users complaining that we don't scan often enough.

           

          In one of the specific scenarios you mention where first there is an event with nothing and after there is one with detailed information then maybe the first one could be ignored, but we don't know that when the first one arrives. The only way I can conceive on how to solve that would be to schedule the first full scan with half a minute's delay or something and if a new event comes in for the same repository within that time frame then cancel the full one. But there is still a risk of missing some update that way, and the fix would be quite complex.

           

          This sounds to me like a potential XY Problem.
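
The delayed-scan idea described in this comment could be sketched roughly as follows. This is only an illustration and not code from the plugin: the class `DelayedReindexScheduler`, its method names, and the repository key format are all hypothetical, and the real plugin would have to hook this into its own webhook processing. A full scan is scheduled when an event with empty changes arrives, and dropped if a detailed event for the same repository shows up before the delay expires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of bitbucket-branch-source-plugin. */
final class DelayedReindexScheduler {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    // One token per repository with a pending full scan; removing the token cancels the scan.
    private final Map<String, Object> pending = new HashMap<>();
    private final long delayMillis;
    private final Consumer<String> fullScan;

    DelayedReindexScheduler(long delayMillis, Consumer<String> fullScan) {
        this.delayMillis = delayMillis;
        this.fullScan = fullScan;
    }

    /** Event with empty changes: schedule a delayed full scan unless one is already pending. */
    synchronized void onEmptyChanges(String repoKey) {
        if (pending.containsKey(repoKey)) {
            return;
        }
        Object token = new Object();
        pending.put(repoKey, token);
        executor.schedule(() -> runScan(repoKey, token), delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Event with detailed changes for the same repository: drop the pending full scan. */
    synchronized void onDetailedEvent(String repoKey) {
        pending.remove(repoKey);
    }

    synchronized boolean hasPending(String repoKey) {
        return pending.containsKey(repoKey);
    }

    void shutdown() {
        executor.shutdownNow();
    }

    private void runScan(String repoKey, Object token) {
        synchronized (this) {
            // Only run if this exact scan request is still pending (it was not cancelled).
            if (!pending.remove(repoKey, token)) {
                return;
            }
        }
        fullScan.accept(repoKey);
    }

    public static void main(String[] args) {
        CompletableFuture<String> scanned = new CompletableFuture<>();
        DelayedReindexScheduler scheduler = new DelayedReindexScheduler(200, scanned::complete);
        scheduler.onEmptyChanges("PROJ/repo-a");
        scheduler.onDetailedEvent("PROJ/repo-a"); // detailed event arrives in time: scan dropped
        scheduler.onEmptyChanges("PROJ/repo-b");  // no follow-up event: scan runs after the delay
        System.out.println("Full scan ran for " + scanned.join());
        scheduler.shutdown();
    }
}
```

As the comment points out, this kind of debouncing still risks missing an update, and it does not help when the detailed event arrives before the empty one.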


          Allan BURDAJEWICZ added a comment - - edited

          I wish we could get an answer to https://community.atlassian.com/t5/Bitbucket-questions/Webhook-repo-refs-changed-events-with-empty-changes-array/qaq-p/2868488#M109342.

          In the only 2 scenarios where I could reproduce this, the event with empty changes really feels like an anomaly.

          For the rebase of the target branch, once you push it you have actually 3 events:

          • repo:refs_changed for the target branch with the diffs in changes
          • repo:refs_changed to repo, with empty changes
          • pr:merged for the PR that the source branch originated from
            That event in the middle can really be discarded: we have all the events we expected and needed regarding this change.

          For the autosync of fork you only get the following:

          • repo:refs_changed to repo, with empty changes
            But actually nothing has changed on the repo...

          I have attached some logs showing the events received in each scenario:

          • rebase-target-events.log
          • autosync-fork-events.log

          NOTE: interestingly, the plugin maintained by Atlassian only processes refs_changed events if they contain changes:

          • https://github.com/jenkinsci/atlassian-bitbucket-server-integration-plugin/blob/7da79af5f2f137ed38b1f31828b45f935a9d86f4/src/main/java/com/atlassian/bitbucket/jenkins/internal/trigger/BitbucketWebhookConsumer.java#L59-L61
          • https://github.com/jenkinsci/atlassian-bitbucket-server-integration-plugin/blob/7da79af5f2f137ed38b1f31828b45f935a9d86f4/src/main/java/com/atlassian/bitbucket/jenkins/internal/trigger/BitbucketWebhookConsumer.java#L101C44-L106

          rsandell added a comment -

          > NOTE: interestingly, the plugin maintained by Atlassian only process refs_changed events if they contain changes:

          That discovery is interesting!


          Nikolas Falco added a comment -

          The reason I closed this issue is that the desired behavior described can be achieved through configuration, as already reported in the comments I linked from another issue (comments by the first author of this plugin).
          This does not mean that I do not want to better evaluate which events should trigger a reindexing (new tag?), which should be discarded, which should instead be interpreted as a deletion operation, and so on.
          Surely the solution is not to drop all events that carry no changes, or to make this behavior configurable through an option that inhibits or completely cancels the effects of other components such as "Build Strategy", so that it then becomes impossible to tell whether an anomaly is due to a bug or to such a configuration.
          This type of intervention is on my to-do list but at low priority. Defects that block authentication or that impact the normal development lifecycle have priority.
          I mean... this behavior has been in place since the beginning; I do not want to think that it has only now become such a big problem. Additionally, there are other older issues, some asking for reindexing in order to schedule a build for a job whose target branch has changed, and others that don't want it because it causes a "build storm" effect.


          Allan BURDAJEWICZ added a comment -

          What can be prevented by configuration is the build storm, in the case where you scan PRs with the PR merge strategy. But there is no configuration that helps prevent the unexpected branch indexing here. An unexpected Branch Indexing can have other consequences: if you have a repository with 100s or 1000s of branches, it can monopolize access to the Bitbucket Server and hit rate limits (I see that we are tackling other issues with rate limits), impacting anything that uses the same credentials to access Bitbucket.
          Again, the build storm of PR merge branches is one consequence. The real problem is branch indexing. Note that this unexpected Branch Indexing would happen even if you only scan branches.

          James Nord added a comment -

          Copying comment from https://github.com/jenkinsci/bitbucket-branch-source-plugin/pull/908#issuecomment-2630725908 

          > Haven't yet identified exactly what kind of push events has empty `changes`

          A mirror synchronized event with too many refs to list will set `refLimitExceeded = true` and have empty `changes`.

          http://confluence.atlassian.com/bitbucketserver/event-payload-938025882.html#Eventpayload-Mirrorsynchronized

          Thus if mirrors are being used, a branch index would appear to be correct for this?

          So it would appear that a rescan is expected for empty changes from a mirror synchronized event, and the PR would remove that. I am not sure why this event is being used.
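
One way to reconcile the two positions in this thread could be sketched as below. This is a hypothetical decision helper, not code from either plugin: the class `EmptyChangesPolicy`, the `Action` enum, and the simplified parameters are assumptions made for illustration, with `refLimitExceeded` standing for the flag mentioned above. Events that list changes are processed normally, events with empty changes and `refLimitExceeded = true` still trigger a full reindex, and the remaining empty events are ignored.

```java
import java.util.List;

/** Hypothetical decision helper for illustration; not code from either plugin. */
final class EmptyChangesPolicy {
    enum Action { PROCESS_CHANGES, FULL_REINDEX, IGNORE }

    /**
     * @param changedRefs      refs listed in the event's changes (may be empty)
     * @param refLimitExceeded true when the event had too many refs to list
     */
    static Action decide(List<String> changedRefs, boolean refLimitExceeded) {
        if (!changedRefs.isEmpty()) {
            return Action.PROCESS_CHANGES; // the event says exactly what changed
        }
        if (refLimitExceeded) {
            return Action.FULL_REINDEX;    // something changed, but the refs were not listed
        }
        return Action.IGNORE;              // assumption: empty changes carry no information here
    }

    public static void main(String[] args) {
        System.out.println(decide(List.of("refs/heads/main"), false)); // PROCESS_CHANGES
        System.out.println(decide(List.of(), true));                   // FULL_REINDEX
        System.out.println(decide(List.of(), false));                  // IGNORE
    }
}
```

Whether ignoring the last case is actually safe is the open question in this issue, so the `IGNORE` branch should be read as the proposal under discussion rather than as established behavior.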


            allan_burdajewicz Allan BURDAJEWICZ
            asavanchuk Alaiksei Savanchuk