[JENKINS-5597] symlinks in archive trees lead to double archiving

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • Component/s: core
    • Labels: None
    • Environment: CentOS 5.4
    • Released As: Jenkins 2.230

      If the tree you are archiving contains an internal symlink, the target files will be archived twice. This can lead to a very large increase in the size of the archived data and consequently, the time it takes to archive it.

      Example:

      /archive-root
          /big-directory
          /symlink -> big-directory

      Then every file in big-directory will be archived twice.

      A fix would be for Hudson to detect internal symlinks and copy them rather than dereference them.
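
      The idea of copying symlinks rather than dereferencing them can be sketched with plain java.nio. This is only an illustration under stated assumptions, not the code Jenkins uses: the class name `SymlinkPreservingCopy` and the method `copyTree` are hypothetical, the sketch recreates every symlink as a link without checking whether its target is inside the tree, and it assumes a platform where creating symlinks is permitted (for example Linux or macOS).

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration; not part of Jenkins.
public class SymlinkPreservingCopy {

    /** Copies the tree under source into target, recreating symlinks as symlinks. */
    static void copyTree(Path source, Path target) throws IOException {
        // walkFileTree does not follow symlinks unless FOLLOW_LINKS is requested,
        // so a symlink to a directory is handed to visitFile instead of being descended into.
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                Files.createDirectories(target.resolve(source.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Path dest = target.resolve(source.relativize(file));
                if (attrs.isSymbolicLink()) {
                    // Recreate the link with the same (possibly relative) target text.
                    Files.createSymbolicLink(dest, Files.readSymbolicLink(file));
                } else {
                    Files.copy(file, dest, StandardCopyOption.REPLACE_EXISTING);
                }
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Rebuild the layout from the example: big-directory plus a link pointing at it.
        Path root = Files.createTempDirectory("archive-root");
        Path big = Files.createDirectory(root.resolve("big-directory"));
        Files.write(big.resolve("data.txt"), "payload".getBytes());
        Files.createSymbolicLink(root.resolve("symlink"), Paths.get("big-directory"));

        Path out = Files.createTempDirectory("archive-out");
        copyTree(root, out);
        System.out.println("symlink preserved: " + Files.isSymbolicLink(out.resolve("symlink")));
    }
}
```

      With this approach the contents of big-directory are written once, and the link is stored as a link.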

          Halvor Lund added a comment -

          I can confirm that this is still an issue. It seems that symlinks to files are archived correctly, whereas symlinks to directories are not: the whole directory is copied instead. Are there any plans to fix this bug?

          Sorin Sbarnea added a comment -

          I am really unhappy to see that, after more than 7 years, nobody is working on a fix for this bug.

          E H added a comment -

          As well as being a size issue, this breaks macOS Frameworks for code signing with a "bundle format is ambiguous (could be app or framework)" message. I found this with stash/unstash; presumably the cause is the same.

          E H added a comment - edited

          Drawing on https://github.com/jenkinsci/pipeline-examples/blob/master/pipeline-examples/unstash-different-dir/unstashDifferentDir.groovy, the attached "JENKINS-5597-example.groovy" pipeline script will demonstrate the problem of symlinks to directories becoming directories.

          If the stash-related issue should be a different Jira issue please let me know and I'll create one.

          Brian J Murrell added a comment -

          To be certain, this is a BUG, not an Improvement. Archiving means to store a copy as is, not to interpret and alter the archive such that it does not reflect what is being archived.

          Why does this bug still exist 8.5 years later?

          Markus Winter added a comment -

          Ran into an issue where a build turned 1,000 directory entries into over 11 million for the Ant DirectoryScanner. The cause was symlinks to directories that in turn contain symlinks in a subfolder, and it happened despite an exclude pattern on the problematic folders.

          archive pattern: gen/**/*log

          exclude pattern: gen/out/modules/*/

          The symlinks were all below gen/out/modules but DirectoryScanner still tried to read everything in before applying the exclude.

          The agent process was started with -Xmx8g and ran out of memory.
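
          For illustration, the difference between scanning everything and pruning excluded directories while walking can be sketched with java.nio. This is not Ant's DirectoryScanner and not the change from the pull request: `PruningScan` and `scan` are hypothetical names, and Java glob syntax only approximates Ant patterns (for example, `gen/**/*log` here does not match a log file directly under gen). The sketch never follows symlinks and skips an excluded directory before reading its contents.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not Ant's DirectoryScanner.
public class PruningScan {

    /** Returns paths (relative to root) of regular files matching includeGlob, skipping excluded directories. */
    static List<Path> scan(Path root, String includeGlob, String excludeDirGlob) throws IOException {
        FileSystem fs = root.getFileSystem();
        PathMatcher include = fs.getPathMatcher("glob:" + includeGlob);
        PathMatcher excludeDir = fs.getPathMatcher("glob:" + excludeDirGlob);
        List<Path> matches = new ArrayList<>();

        // No FOLLOW_LINKS option: symlinks to directories are never descended into.
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Prune before listing the directory, so excluded subtrees cost nothing.
                return excludeDir.matches(root.relativize(dir))
                        ? FileVisitResult.SKIP_SUBTREE
                        : FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                Path rel = root.relativize(file);
                // Symlinks are reported here as links, not regular files, and are ignored.
                if (attrs.isRegularFile() && include.matches(rel)) {
                    matches.add(rel);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("scan-root");
        Files.createDirectories(root.resolve("gen/logs"));
        Files.createDirectories(root.resolve("gen/out/modules/m1"));
        Files.write(root.resolve("gen/logs/build.log"), new byte[0]);
        Files.write(root.resolve("gen/out/modules/m1/huge.log"), new byte[0]);
        // A link back up the tree, which would multiply entries if it were followed.
        Files.createSymbolicLink(root.resolve("gen/out/modules/m1/loop"), root.resolve("gen"));

        System.out.println(scan(root, "gen/**/*log", "gen/out/modules/*"));
    }
}
```

          In this sketch the number of entries visited is bounded by the real tree outside the excluded folders, regardless of how the symlinks inside them are arranged.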

          Markus Winter added a comment -

          Opened a pull request, https://github.com/jenkinsci/jenkins/pull/3947, that makes following symlinks configurable.

          Aakash Sudhanwa added a comment -

          Will this address a similar issue with stash/unstash as well?

          We are using Git to check out sources that contain symbolic links. When the sources are stashed and unstashed, the symbolic links are lost and appear as separate directories. This leads to a series of issues, including bloating of the sandbox size.

          Abhishek Sharma added a comment -

          Hi,

          Can someone please merge the changes into the mainline? We are badly hurt by this issue and are currently building Jenkins manually to get this change.

          There is a pending pull request, https://github.com/jenkinsci/jenkins/pull/3947, that makes following symlinks configurable.

          Thanks,

          Abhishek

          Oleg Nenashev added a comment -

          The change was released in Jenkins 2.230. Thanks to danielbeck wfollonier jthompson for reviews!

            Assignee: Unassigned
            Reporter: pgweiss
            Votes: 23
            Watchers: 29