Jenkins / JENKINS-47299

Multibranch pipelines with Jenkinsfile in big svn repositories is unusable


    Details


      Description

      We have a product stored in a normal SVN repository with trunk, branches, tags, etc. Each branch checkout occupies a few gigabytes of disk space. Not ideal, I know, but that's a separate problem. When we create a multibranch pipeline job for this repository, our Jenkins master runs out of disk space after the first branch indexing. The reason is simple: the multibranch job creates one complete checkout per branch, each requiring several gigabytes. This is completely unnecessary, since indexing only needs to load the Jenkinsfile; the complete checkout is done later on the slave when running the actual build.

      However, if we create a normal pipeline job for each branch (loading the Jenkinsfile from SVN), we can specify a checkout depth and limit the number of files checked out. This option is not available for multibranch pipeline jobs.
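      For reference, the per-job depth setting corresponds to the Subversion plugin's depthOption parameter. A minimal sketch of an explicit scripted checkout using it (the repository URL is a placeholder):

```groovy
// Sketch: explicit shallow checkout via the Subversion plugin.
// The remote URL below is a placeholder, not a real repository.
node {
    checkout([
        $class: 'SubversionSCM',
        locations: [[
            remote: 'https://svn.example.com/repo/branches/mybranch',
            local: '.',
            depthOption: 'files'   // top-level files only, e.g. a root Jenkinsfile
        ]]
    ])
}
```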

      Now, I see three solutions to this:

      1. Add the ability to specify a depth for multibranch pipeline jobs. This is probably the simplest solution, and the implementation should be similar to JENKINS-35227 (PR-189). The downside is that it won't work for projects where the Jenkinsfile isn't stored in the root (because you would then need to use infinite depth).
      2. Always use checkout depth "files" when checking out the Jenkinsfile. (This will require some special handling when the Jenkinsfile is located in a subfolder; does anyone foresee a problem with this solution?)
      3. Add proper support for lightweight retrieval of the Jenkinsfile. This was introduced in JENKINS-33273, but support for Subversion was never added. Does anyone know whether there is a blocker for implementing this, or whether it is simply a matter of someone doing it?
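      For context, the sparse-checkout depths that solution 2 relies on can be exercised directly with the svn client; URL and paths below are placeholders:

```shell
# Check out only the top-level files of a branch
# (enough to read a Jenkinsfile stored in the root):
svn checkout --depth files https://svn.example.com/repo/branches/mybranch wc

# If the Jenkinsfile lives in a subfolder, pull in just that one file,
# creating the intermediate directories as needed:
cd wc
svn update --parents subdir/Jenkinsfile
```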

      I'm willing to make an attempt at addressing this issue. However, I would like some input from someone who knows the plugin well and can tell which of the solutions is feasible and preferable.

            Activity

            Corentin Soen added a comment - edited

            One-upping this. It's close to a blocking issue here in our company, where some projects can reach up to hundreds of gigabytes per branch.

            It seems it was once possible to set the depth of the polling checkout using the properties{} step in the Jenkinsfile, with a snippet like [$class: 'SubversionSCM', depthOption: 'immediates'], but properties appears to have been superseded by the much cleaner, yet sadly far less exhaustive, options{} step.

            Is there still a way to override this depthOption server-wide? I see no reason to keep full-depth checkouts on the master anyway. Setting it from the Jenkins script console didn't seem to achieve anything of relevance in our case.
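            One hedged sketch of a partial workaround in declarative pipeline: disable the implicit checkout with skipDefaultCheckout() and run an explicit shallow one. Note this only affects the build checkout on the agent, not the master's branch-indexing checkout that this issue is about; the repository URL is a placeholder.

```groovy
pipeline {
    agent any
    options {
        // Suppress the implicit full-depth checkout at the start of each stage.
        skipDefaultCheckout()
    }
    stages {
        stage('Checkout') {
            steps {
                // Explicit shallow checkout; URL is a placeholder.
                checkout([
                    $class: 'SubversionSCM',
                    locations: [[
                        remote: 'https://svn.example.com/repo/branches/mybranch',
                        local: '.',
                        depthOption: 'immediates'
                    ]]
                ])
            }
        }
    }
}
```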

            James Reynolds added a comment

            This is a blocking issue for us using multibranch pipelines on Subversion as well. A cut-down mirror of the full repository allows the build to complete in a couple of minutes, but the full repository causes the build to take nearly 30 minutes. Our workaround is to use normal pipelines and fiddle with BRANCH parameters. It works, but it is significantly more effort.

            Dane Kantner added a comment - edited

            I have worked around issues like this in my Jenkins build environment by deduplicating the disks in use. In many use cases the changes between branches aren't that substantial, so the many copies deduplicate well at the OS level and your disks go a very long way. You can do this on a Linux machine with a ZFS file system, or it's built into Windows (though it cannot run on the boot drive, so you need to move where your jobs run). One caveat with ZFS is that the RAM required for deduplication correlates directly with your drive volume size, so you're better off starting close to where you need it and scaling as needed. Another option: SAN/NAS systems like NetApp do this beautifully well.
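            A minimal sketch of the ZFS setup described (pool, dataset, and mountpoint names are placeholders; keep the RAM cost of dedup in mind before enabling it):

```shell
# Create a dedicated dataset for Jenkins workspaces and enable deduplication.
# "tank" and the mountpoint are example names only.
zfs create tank/jenkins
zfs set dedup=on tank/jenkins
zfs set mountpoint=/var/lib/jenkins/workspace tank/jenkins

# Inspect how well deduplication is performing (DEDUP ratio column).
zpool list tank
```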


              People

              Assignee:
              Unassigned
              Reporter:
              Jon Sten (jons)
              Votes:
              17
              Watchers:
              18

                Dates

                Created:
                Updated: