
Provide lightweight checkout capability for bitbucket to avoid repository clone for multi-branch pipeline jobs

    • Blue Ocean 1.4 - beta 2

      JENKINS-33273 introduced a lightweight checkout capability that lets SCM implementors avoid checking out the SCM on the master just to read the contents of the Jenkinsfile. The comments on that issue mention that Git doesn't offer a way to do so, but I would think this is possible using Bitbucket APIs to read a particular file's contents on a branch.

      In our use case, the repository is multi-gigabyte and many developers work in it, creating many branches, so cloning the repo on the master for every branch is very costly in terms of time and storage.

          [JENKINS-42518] Provide lightweight checkout capability for bitbucket to avoid repository clone for multi-branch pipeline jobs

          Patrick Wolf added a comment -

          This is already included in workflow-cps-plugin (core Pipeline) version 2.29 released on March 3.

          https://wiki.jenkins-ci.org/display/JENKINS/Pipeline+Groovy+Plugin


          Peter Hayes added a comment -

          I just downloaded the very latest Jenkins, updated all plugins, and created a Multibranch Pipeline job with Bitbucket as the SCM, supplying my Bitbucket Server web credentials. I still get an "@script" clone for each branch identified, as well as a clone for the pipeline checkout step.


          Albert V added a comment -

          I have an up-to-date Jenkins and plugins too, and I'm experiencing the same issue using a Multibranch job with Bitbucket as a source.

          I have a repo of 2,6GB with 2 branches and in the Jenkins Master workspace directory I found:
          2,6G part_of_the_branch_name_1-UBEA5XUZNXNXC3VOM6COTGUSIEU26KDGQPKDMGV36VVYRJ6MBLRQ@script
          2,6G part_of_the_branch_name_1-BMHC5R4EOUCRCXQQ77PBNZ2YTY6J5NEQLJ5DXELS3CWYZLG3BJQA@script

          I'm assuming that the fix from JENKINS-33273 is not applied in the Bitbucket source case.
          Would it be possible to make a lightweight "checkout" of the Jenkinsfile with something like https://bitbucketServer.url/projects/project-name/repos/repo-name/browse/Jenkinsfile?raw ?

          As it works right now, this makes my CI/CD environment unmaintainable.
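To put a number on the footprint described above, the per-branch "@script" directories can be summed up on the master. The following is a minimal sketch, assuming the workspace layout shown in the comment (one top-level directory per branch whose name ends in "@script"); the function name and the demo directory names are hypothetical.

```python
import os
import tempfile


def script_dir_sizes(workspace_root):
    """Map each top-level '<name>@script' directory to its total file size in bytes."""
    sizes = {}
    with os.scandir(workspace_root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            if not entry.name.endswith("@script"):
                continue
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    file_path = os.path.join(dirpath, name)
                    # Skip symlinks so linked files are not counted.
                    if not os.path.islink(file_path):
                        total += os.path.getsize(file_path)
            sizes[entry.name] = total
    return sizes


# Demo on a throwaway directory; point script_dir_sizes at a real workspace instead.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "branch-1-ABC@script", ".git"))
    os.makedirs(os.path.join(root, "branch-1-ABC"))
    with open(os.path.join(root, "branch-1-ABC@script", ".git", "pack"), "wb") as fh:
        fh.write(b"x" * 2600)
    with open(os.path.join(root, "branch-1-ABC", "build.log"), "wb") as fh:
        fh.write(b"y" * 10)
    demo_sizes = script_dir_sizes(root)

print(demo_sizes)
```

Only directories ending in "@script" are reported, so ordinary build workspaces do not inflate the total.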


          Andrew Reslan added a comment -

          I have a similar situation: my Bitbucket Git repo is 3GB, and it takes at least 1 hour to clone with Jenkins and Bitbucket on the same LAN.

          The bitbucket-branch-source-plugin has resolved the initial scan for the branches with Jenkinsfiles.

          I am now looking at how to minimize the time required to check out the individual Jenkinsfiles. Using the Bitbucket REST API to fetch the raw contents of the Jenkinsfile looks like a potential option:

          https://api.bitbucket.org/1.0/repositories/{accountname}/{repo_slug}/raw/{revision}/{path}
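As an illustration of the URL template quoted above, here is a minimal sketch that fills in the placeholders. The function name and the sample account, repository, and revision values are hypothetical. Percent-encoding every placeholder except the slashes inside the file path is an assumption of this sketch, not something the thread confirms, and the sketch does not check whether the endpoint is available on a given Bitbucket installation.

```python
from urllib.parse import quote

# Template copied from the comment above.
RAW_TEMPLATE = (
    "https://api.bitbucket.org/1.0/repositories/"
    "{accountname}/{repo_slug}/raw/{revision}/{path}"
)


def raw_file_url(accountname, repo_slug, revision, path):
    """Build the raw-file URL for one file at one revision."""
    return RAW_TEMPLATE.format(
        accountname=quote(accountname, safe=""),
        repo_slug=quote(repo_slug, safe=""),
        revision=quote(revision, safe=""),
        # Keep '/' in the file path so nested files stay addressable.
        path=quote(path, safe="/"),
    )


print(raw_file_url("acme", "big-repo", "master", "Jenkinsfile"))
```

Fetching that single URL would replace a multi-gigabyte clone with one small HTTP request, which is the saving the comment is after.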


          Utopic Men added a comment -

          With an updated Jenkins instance, I see weird behavior.

          I created 2 projects : 

          • a Bitbucket Team/Project and
          • a Multibranch Pipeline.

          The first one still creates this "script" folder (my repo is 4 GB), so I experience the same issue described above.

          But the Multibranch pipeline works perfectly without creating the script folder.

          So I don't think it's GitHub or Bitbucket related...


          Utopic Men added a comment -

          I still want to use this Bitbucket plugin because of the PR integration (build status pushed to Bitbucket).


          Anders Holmblad added a comment -

          Just adding my observations to this:

          On the master node, for every branch of every repo, I also get different behavior when using the GitSCMSource vs. BitbucketSCMSource plugins.

          • GitSCMSource: No "@script" folder; the Jenkinsfile is fetched directly from Git without requiring a checkout, but shared libraries are put in the "@libs" folder when using:
            <sources class="jenkins.branch.MultiBranchProject$BranchSourceList">
               <data>
                  <jenkins.branch.BranchSource>
                     <source class="jenkins.plugins.git.GitSCMSource">
                        ...
          • ..causing a minimal footprint

          • BitbucketSCMSource: Git clone with full history (sometimes years of history) and checkout (minus LFS objects, also no submodules) in the workspace "@script" folder when using:
            <sources class="jenkins.branch.MultiBranchProject$BranchSourceList">
               <data>
                  <jenkins.branch.BranchSource plugin="branch-api">
                     <source plugin="cloudbees-bitbucket-branch-source" class="com.cloudbees.jenkins.plugins.bitbucket.BitbucketSCMSource">
                        ...
          • ..causing a huge footprint: hundreds of megabytes, and for some projects several GBs, for every branch in every repo
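To check which branch source a job uses without opening its configuration by hand, the `class` attribute of the `<source>` element can be read from the job's config XML. This is a minimal sketch: the sample string reuses the element and class names from the fragment above, but the elided contents ("...") are left out and the elements are closed only so that the sample parses. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the fragment above; elided contents omitted, tags closed for parsing.
SAMPLE = """<sources class="jenkins.branch.MultiBranchProject$BranchSourceList">
  <data>
    <jenkins.branch.BranchSource plugin="branch-api">
      <source plugin="cloudbees-bitbucket-branch-source"
              class="com.cloudbees.jenkins.plugins.bitbucket.BitbucketSCMSource"/>
    </jenkins.branch.BranchSource>
  </data>
</sources>"""


def branch_source_classes(xml_text):
    """Return the 'class' attribute of every <source> element in the XML."""
    root = ET.fromstring(xml_text)
    return [element.get("class") for element in root.iter("source")]


classes = branch_source_classes(SAMPLE)
print(classes)
```

A class name ending in `BitbucketSCMSource` marks the jobs that showed the large "@script" clones in the observations above.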

          Gabriel Ash added a comment -

          andersh I get the same behavior. I was experimenting with moving from GitSCMSource to BitbucketSCMSource for the pull request building functionality and overall tighter integration with Bitbucket, but this is a deal breaker for me.

          It takes more than 10 minutes to clone my repo, and I can't modify the clone behavior (either adding a timeout or using a reference repo) until I get the Jenkinsfile on the master, so every time someone pushes a new branch, the first build (or the second, depending on how long the clone took) fails. I also have tons of copies of my repo cloned on my master that aren't used for anything but getting my Jenkinsfile.


          Gabriel Ash added a comment -

          cloudbees as this is assigned to you guys, do you have any input on it? I think this is appropriately marked as a "Major" priority as any reasonably large repo doesn't work with the Bitbucket Branch Source plugin. I'd love to make use of the pull request builder (I use it on my smaller repositories and it's really powerful) but this bug forces me to just use a generic 'Git' source for my multibranch pipelines.


          Michael Neale added a comment -

          gabrielbash yeah I think this is even more important than that - it seems pretty common for people to have big repos and it is reasonable to think it shouldn't hurt them the way it does. This is a blocker for many people, hopefully someone will have a solution soon. 


          Michael Neale added a comment -

          abayer just adding this one to the list as you indicated interest. If not - we can leave it unassigned.


          Jesse Glick added a comment -

          GitSCMSource: No "@script" folder, Jenkinsfile fetched directly from Git without requiring a checkout

          Well…not a “checkout” technically, but a naked clone. This is a caching layer implemented in GitSCMSource to work around the fact that the generic Git network protocol does not offer a way of retrieving an individual file revision; you need to fetch commits locally. (The Mercurial plugin does much the same thing.) The advantages over “heavyweight checkout” are that

          • repositories consume less disk than working copies with repositories
          • there is only one cache repository for a given URL (the cache key computation is a little tricky here), rather than one per job

          So BitbucketSCMSource could probably be improved merely by delegating this SCMFileSystem behavior to GitSCMSource.

          But of course it could do a lot better still by using Bitbucket APIs to retrieve blobs directly, as GitHubSCMSource does, avoiding the need for any clone on master (well, GitHubSCMSource still has master-based clones for PR merges, pending JENKINS-43194). This is not particularly hard to do in general; it just means implementing and testing some more APIs. See existing implementations.
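The "one cache repository per URL rather than one per job" idea can be sketched as a lookup keyed by the remote URL. This is only an illustration under stated assumptions: the class and method names, the cache root, and the use of a SHA-256 digest as the key are all hypothetical, and the actual key computation in GitSCMSource (described above as "a little tricky") is not reproduced here.

```python
import hashlib


class BareRepoCache:
    """Toy model: one cached bare clone per remote URL, shared by all jobs."""

    def __init__(self, cache_root):
        self.cache_root = cache_root
        self.clones = {}  # cache key -> path of the bare clone

    @staticmethod
    def key_for(remote_url):
        # Hypothetical key: a digest of the URL only, so the job name plays no part.
        return hashlib.sha256(remote_url.encode("utf-8")).hexdigest()

    def repo_for(self, remote_url):
        key = self.key_for(remote_url)
        if key not in self.clones:
            # A real implementation would create and fetch a bare clone here.
            self.clones[key] = "%s/git-%s" % (self.cache_root, key)
        return self.clones[key]


cache = BareRepoCache("/var/lib/jenkins/caches")  # hypothetical location
url = "https://bitbucket.example.com/scm/proj/repo.git"  # hypothetical remote
paths = {cache.repo_for(url) for _job in ("feature-a", "feature-b", "master")}
print(len(paths), len(cache.clones))
```

Three branch jobs pointing at the same remote resolve to a single cache entry, whereas the "@script" approach keeps one full clone per branch job.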


          kgiloo added a comment -

          Any good news about this issue? It seems insane that Jenkins needs to do a full checkout to read one single file.

          It's a complete deal breaker for PRs: it takes 15 minutes in our case, just to check out one single file.

          Another idea would be to at least make it possible to relocate it into a pipeline library...

           


          Michael Neale added a comment -

          vivek any chance we could take a look at this (while the BB connect stuff is paused?)


          Vivek Pandey added a comment -

          michaelneale yes, will start working on it.


          Vivek Pandey added a comment -

          PR https://github.com/jenkinsci/bitbucket-branch-source-plugin/pull/78

          kgiloo added a comment - - edited

          Regression encountered with Bitbucket Branch Source Plugin 2.2.6:

          feature branches can no longer be checked out; builds are disrupted by:

          [BFA] Scanning build for known causes...
          [BFA] No failure causes found
          [BFA] Done. 0s
          [Bitbucket] Notifying commit build result
          [Bitbucket] Build result notified
          java.util.UnknownFormatConversionException: Conversion = 'F'
          at java.util.Formatter$FormatSpecifier.conversion(Formatter.java:2691)
          at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2720)
          at java.util.Formatter.parse(Formatter.java:2560)
          

          probably caused by faulty parsing of

          /var/lib/jenkins/workspace/key/feature*%2F*name@script

          Going back to 2.2.4 solves the problem.
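The exception above is what printf-style formatting produces when a string containing "%2F" is used as the format template itself. Whether that is the actual cause in the plugin is only the commenter's guess. The sketch below is a Python analogue rather than the Java code involved: Python raises a TypeError here, where Java's Formatter raises UnknownFormatConversionException. The workspace path is a hypothetical one modeled on the path in the comment, and the function names are hypothetical too.

```python
def breaks_as_template(text):
    """Return True if using `text` itself as a printf-style template raises."""
    try:
        text % ()
    except (TypeError, ValueError):
        return True
    return False


def escape_percent(text):
    """Double every '%' so the text survives being used as a template."""
    return text.replace("%", "%%")


# Hypothetical path with a percent-encoded '/' in the branch name.
workspace = "/var/lib/jenkins/workspace/key/feature%2Fname@script"

print(breaks_as_template(workspace))                  # the raw path fails
print(breaks_as_template(escape_percent(workspace)))  # the escaped path is fine
print("%s" % (workspace,))                            # safest: pass it as an argument
```

Passing the path as an argument to a fixed template avoids the problem entirely, which is usually preferable to escaping.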

           


          Michael Neale added a comment - - edited

          kgiloo is there a full stack trace? And could you possibly put it in a new ticket rather than this old one?

          I wonder if this is fixed by: https://github.com/jenkinsci/bitbucket-branch-source-plugin/pull/82

           

          EDIT: no, it was actually fixed in https://github.com/jenkinsci/bitbucket-branch-source-plugin/pull/83, I think. If you check the UC soon, it will have the latest. Can you try it again with that fix?


          Vivek Pandey added a comment -

          kgiloo it's been fixed and is available in the bitbucket-branch-source 2.2.7 release.


          kgiloo added a comment -

          vivek I checked 2.2.7. Thanks, no EM.

          I just created a PR, but Jenkins is still making a full checkout to read the Jenkinsfile.

          Is there anything I need to tweak to make it work?


          Georgi Hristov added a comment - - edited

          I think there is a problem when trying to do a lightweight checkout of the Jenkinsfile for branches that have '/' in their names, like ft/*:

          ERROR: Could not do lightweight checkout, falling back to heavyweight
          java.io.FileNotFoundException: URL: https://api.bitbucket.org/2.0/repositories/owner/repository/src/ft%2Fci/Jenkinsfile
          	at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getRequestAsInputStream(BitbucketCloudApiClient.java:558)
          	at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getFileContent(BitbucketCloudApiClient.java:719)
          	at com.cloudbees.jenkins.plugins.bitbucket.filesystem.BitbucketSCMFile.content(BitbucketSCMFile.java:81)
          	at jenkins.scm.api.SCMFile.contentAsString(SCMFile.java:338)
          	at org.jenkinsci.plugins.workflow.multibranch.SCMBinder.create(SCMBinder.java:104)
          	at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:263)
          	at hudson.model.ResourceController.execute(ResourceController.java:97)
          	at hudson.model.Executor.run(Executor.java:421)
          

           

          For branches like master, dev, release it works perfectly.

          Thanks
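The URL in the error above contains the branch name ft/ci with its slash percent-encoded as %2F. A minimal sketch of how the two possible encodings of such a branch name differ is shown below. The function name is hypothetical, the owner and repository values are the placeholders from the log, and the sketch makes no claim about which form the Bitbucket API accepts.

```python
from urllib.parse import quote


def src_url(owner, repo, ref, path, encode_slash):
    """Build a 2.0 'src' URL, optionally percent-encoding '/' in the ref."""
    ref_safe = "" if encode_slash else "/"
    return "https://api.bitbucket.org/2.0/repositories/%s/%s/src/%s/%s" % (
        quote(owner, safe=""),
        quote(repo, safe=""),
        quote(ref, safe=ref_safe),
        quote(path, safe="/"),
    )


encoded = src_url("owner", "repository", "ft/ci", "Jenkinsfile", encode_slash=True)
plain = src_url("owner", "repository", "ft/ci", "Jenkinsfile", encode_slash=False)
print(encoded)  # same URL as in the FileNotFoundException above
print(plain)
```

For branches such as master, dev, or release, both variants produce the same URL, which is consistent with those branches being unaffected.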

           


          Michael Neale added a comment -

          ghristov88 it was a problem with versions earlier than 2.2.7. Are you running the latest?


          Michael Neale added a comment -

          kgiloo I expect in some cases with PRs it may need to clone the repo to do a validated merge (the GitHub plugin has this same challenge). Would that be right, vivek?


          Georgi Hristov added a comment -

          michaelneale yes I am running 2.2.7. What can I provide to help you reproduce the issue?

          Michael Neale added a comment - - edited

          ghristov88 perhaps you should put this in a new issue, as it looks a bit distinct from what others are seeing.

          Just reproduction instructions should be doable. I expect it is more of an encoding confusion where it can't find the resource and so falls back to what it knows works, but I assume that when it falls back it does work at least.
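The fallback behavior described here (try the lightweight route, and on a missing resource fall back to the full checkout) can be sketched as follows. This is an illustration, not the plugin's code: the function names and the stand-in fetchers are hypothetical, and Python's FileNotFoundError stands in for the Java FileNotFoundException seen in the log.

```python
def load_jenkinsfile(lightweight, heavyweight, log):
    """Try the lightweight fetch first; fall back to the heavyweight one."""
    try:
        return "lightweight", lightweight()
    except FileNotFoundError as exc:
        log.append(
            "Could not do lightweight checkout, falling back to heavyweight: %s" % exc
        )
        return "heavyweight", heavyweight()


def api_miss():
    # Stand-in for an API request that cannot find the resource.
    raise FileNotFoundError(
        "URL: https://api.bitbucket.org/2.0/repositories/owner/repository/src/ft%2Fci/Jenkinsfile"
    )


def full_clone():
    # Stand-in for cloning the repository and reading the file from disk.
    return "pipeline { }"


log = []
mode, content = load_jenkinsfile(api_miss, full_clone, log)
print(mode, len(log))
```

The build still succeeds after the fallback, but it pays the full clone cost again, so a silent fallback can hide an encoding problem like the one reported above.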


          Georgi Hristov added a comment -

          michaelneale, yes it falls back and it works. I will open a new issue later today.

          NetAppBlueDevil added a comment -

          I am also seeing the same issue as Georgi using plugin version 2.2.7. I'm running Bitbucket Server version 4.11. Is there a minimum Bitbucket version required for the API used here to be available?

          Aaron Lunsford added a comment -

          Is this feature toggleable? We upgraded today from 2.2.3 and it randomly broke pipelines across the board. All of our jobs were configured to use the stash step in the @script directory to take advantage of that initial checkout. After upgrading to 2.2.7, jobs sometimes do the heavy checkout and sometimes the light checkout, with no real distinction as to why.

          Do we now need an explicit step in our pipeline code to check out the repository? I've rolled back to 2.2.3, but I don't want to keep relying on an old version of the plugin. Does anyone have a recommendation on how to proceed?

          Wolff added a comment - - edited

          Is it possible that Atlassian changed some licencing needed for this? We are running into the following error:

          com.atlassian.bitbucket.jenkins.internal.http.HttpRequestExecutorImpl handleErrorBitbucket - did not accept the request
          Failed to retrieve mirroring information for project Test and repo Testing
          com.atlassian.bitbucket.jenkins.internal.client.exception.BadRequestException: - response: 409 with body: '{"errors":[{"context":null,"message":"Mirroring requires a Bitbucket Data Center license.","exceptionName":"com.atlassian.bitbucket.mirroring.upstream.MirroringDisabledException"}]}'
          

          Unfortunately Jenkins output does not specify which request is not accepted, but since we also get

          Lightweight checkout support not available, falling back to full checkout.
          

          I assume the first part is the cause.


          Kalle Niemitalo added a comment - - edited

          wlfbck, the package name com.atlassian.bitbucket.jenkins.internal.http belongs to atlassian-bitbucket-server-integration-plugin, but this issue is for bitbucket-branch-source-plugin. I suggest you file a separate issue on atlassian-bitbucket-server-integration-plugin.

          The "Lightweight checkout support not available" message for atlassian-bitbucket-server-integration-plugin is being addressed in JENKINS-63033.


            vivek Vivek Pandey
            petehayes Peter Hayes
            Votes: 18
            Watchers: 33
