
Allocate shorter workspace if it will be too long for reasonable use inside build

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • core
    • None

      When using rich matrix axes or a deep folder hierarchy, a substantial part of the maximum path length (255 on Windows, for instance) can be consumed by the path to the workspace, not leaving enough room for the build itself.

      Introduce a threshold (global or per job) that defines the path length expected to be left for the build. If the default allocation algorithm creates a workspace path so long that not enough path length would be left for the build, a hash should be used instead.

      Therefore, if the threshold is configured to 1024, Jenkins will use a hash on Windows all the time, and on Linux only if the workspace path exceeds 3072 characters (assuming 4096 is the maximum path length allowed on Linux).
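The threshold rule described above could be sketched roughly as follows. This is only an illustration in Python, not Jenkins code: the function name `workspace_dir_name`, its parameters, and the choice of a truncated SHA-256 digest are all assumptions made for the example.

```python
import hashlib


def workspace_dir_name(root: str, job_name: str, max_path: int, threshold: int) -> str:
    """Return the directory name to use under `root` for `job_name` (hypothetical helper)."""
    default_path = f"{root}/{job_name}"
    # Longest workspace path that still leaves `threshold` characters for the build.
    budget = max_path - threshold
    if len(default_path) <= budget:
        return job_name
    # Otherwise fall back to a fixed-length digest of the job name.
    # The 16-character truncation is an arbitrary choice for this sketch;
    # shorter digests make deliberate collisions easier.
    return hashlib.sha256(job_name.encode("utf-8")).hexdigest()[:16]


# With max_path=255 and threshold=1024 the budget is negative, so the hash is always used.
print(workspace_dir_name("C:/jenkins/workspace", "my-job", 255, 1024))
# With max_path=4096 and threshold=1024 the job name is kept up to a 3072-character path.
print(workspace_dir_name("/var/jenkins/workspace", "my-job", 4096, 1024))
```

The two calls mirror the Windows and Linux cases from the description: a threshold larger than the platform limit forces the hash unconditionally.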

          [JENKINS-30148] Allocate shorter workspace if it will be too long for reasonable use inside build

          Oliver Gondža added a comment -

          ssbarnea, note that there are several reasons why the path length can bloat and you have not mentioned what is yours. While the reason why the plugin does not work on master is purely technical, it is generally discouraged to use master for building.

          Jesse Glick added a comment -

          Solved for branch projects in JENKINS-34564 because the problem was acute for these (and also these often had job names including dangerous characters). I proposed a more general solution to how Jenkins manages workspaces, which would deal with not just length but special characters and cleanup.


          Oliver Gondža added a comment -

          jglick

          I proposed a more general solution to how Jenkins manages workspaces, which would deal with not just length but special characters and cleanup.

          Can you point us to that work, if it exists already?

          The probability of checksum collisions is not my concern. It is malicious collisions, which are easily constructed with weakened hashes.

          I am not sure I follow this. Will Jenkins not disambiguate this with @2 in case locators offer colliding workspace paths?

          Oliver Gondža added a comment - Jesse has drawn some design I like here: https://issues.jenkins-ci.org/browse/JENKINS-2111?focusedCommentId=270320&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-270320

          Jesse Glick added a comment -

          Will Jenkins not disambiguate this with @2 in case locators offer colliding workspace paths?

          Only for concurrent builds. My concern was about a job deliberately named to produce the same workspace hash as another, then running after the legitimate one in the same workspace, for a system configured with multi-use executors but enforced sandboxing of build steps (e.g. using Docker containers with -v workspace mounts) so that a job is only permitted to read/write files inside its workspace and @tmp directory. Admittedly this security mode is not well supported in Jenkins today and so to have any real enforcement of security between teams you need to use one-shot executors exclusively—safe but inefficient (every build pays slave.jar startup cost).

          Anyway, the proposal I made for a per-node registry of workspaces would just allocate unique directories without possibility of collision, which is both simpler and safer, and would result in shorter workspace paths. In principle this could be done entirely as a plugin (to supersede special-case handling in branch-api), in which case the core behavior can be kept simple and dumb and the very flaky WorkspaceCleanupThread deleted.

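The per-node registry idea mentioned in the comment above might look something like the following minimal sketch. It is not the design from JENKINS-2111: the class `WorkspaceRegistry`, its methods, and the `wsN` naming scheme are hypothetical, and the mapping is kept in memory only, whereas a real registry would need to persist it on the node.

```python
import itertools


class WorkspaceRegistry:
    """Hypothetical per-node registry mapping job names to short, unique directories."""

    def __init__(self, root: str) -> None:
        self.root = root
        self._by_job = {}
        self._counter = itertools.count(1)

    def allocate(self, job_name: str) -> str:
        # Reuse the directory already recorded for this job, if any.
        if job_name not in self._by_job:
            # Names come from a counter, never from the job name, so a job
            # cannot be named to land in another job's directory.
            self._by_job[job_name] = f"ws{next(self._counter)}"
        return f"{self.root}/{self._by_job[job_name]}"

    def release(self, job_name: str) -> None:
        # Forget a job, for example after its workspace has been deleted.
        self._by_job.pop(job_name, None)


registry = WorkspaceRegistry("/j/workspace")
print(registry.allocate("team-a/some/deeply/nested/job"))
print(registry.allocate("team-b/another-job"))
```

Because the directory name is independent of the job name, the path length is bounded by the root plus a short counter, and no hash is involved at all.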

          Sorin Sbarnea added a comment -

          Any chance of having this fixed, ever, at least for normal pipelines?

          Here is one real-life example:
          /home/rhos-ci/jenkins/workspace/DFG-storage-cinder-20_director-rhel-virthost-3cont_2comp-ipv4-vxlan-netapp-nfs-external-ssbarnea
          Note that limiting the job name length is impossible: tens of people create jobs and there is no way to enforce a limit. In addition, an attempt to rename more than 1000 existing jobs would be an epic failure.

          The only place where it may be possible to cut a few characters is the Jenkins slave HOME directory and its subfolder, but nothing beyond that. Even so, changing this would be quite complex too, as it would involve many types of slaves created by various people, and almost certainly some would end up not using the shortened home directory workaround.

          Still, even if I manage to pick the minimum possible home directory, "/j", it will clearly not be enough to make virtualenvs work, because job names will continue to be too long.


          Oliver Gondža added a comment -

          I am unassigning myself, as I do not intend to work on this anytime soon...

          Jesse Glick added a comment -

          Any chance of having this fixed, ever

          Via my proposal in JENKINS-2111, perhaps.


          Sander Flobbe added a comment -

          The description assumes "4096 is the maximal path allowed on linux". However, the actual limit in some cases is much (!!) smaller. As mentioned in an earlier post by ssbarnea, a Python virtualenv generates scripts whose shebang lines include the full path. For example, the bin/pip script created by virtualenv in my workspace has this shebang line:

          #!/usr/share/tomcat/.jenkins/workspace/Waterworks_master-IAYUAOZ7D4F2MCG5DIMHAV3ODEQZACDRSHT3HA6N5UIRQXOCQQIQ/virtualenv/bin/python

          This results in an error in my Multibranch Pipeline job:

          /usr/share/tomcat/.jenkins/workspace/Waterworks_master-IAYUAOZ7D4F2MCG5DIMHAV3: bad interpreter: No such file or directory

          The actual cut-off point appears to be at character 80. This is on a CentOS 7.5 system. Although external documentation suggests 127 as the limit on Linux (and I don't understand why I am hitting a barrier at 80), both of these numbers are far smaller than the 4096 limit assumed in the description of this issue.

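A quick way to check whether a generated script is affected is to measure its shebang line. The sketch below assumes the 127-character limit quoted in the comment above (it is a parameter here, not a verified constant), and `shebang_too_long` is a hypothetical helper name.

```python
def shebang_too_long(first_line: str, limit: int = 127) -> bool:
    """Return True if `first_line` is a shebang longer than `limit` bytes."""
    line = first_line.rstrip("\n")
    # Only lines starting with "#!" are interpreter lines; measure them in bytes.
    return line.startswith("#!") and len(line.encode("utf-8")) > limit


# The shebang line from the bin/pip script quoted in the comment.
shebang = (
    "#!/usr/share/tomcat/.jenkins/workspace/"
    "Waterworks_master-IAYUAOZ7D4F2MCG5DIMHAV3ODEQZACDRSHT3HA6N5UIRQXOCQQIQ"
    "/virtualenv/bin/python"
)
print(len(shebang), shebang_too_long(shebang))
```

Running a check like this over the `bin/` scripts of a freshly created virtualenv would show whether a given workspace path leaves enough room for the interpreter line.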

          Jesse Glick added a comment -

          My proposed patch in JENKINS-2111 fixes or at least greatly improves these issues.


            Assignee: Unassigned
            Reporter: Oliver Gondža (olivergondza)
            Votes: 5
            Watchers: 8