[JENKINS-40999] Recursive symlink causes high resource utilization, termination of slave process in ArchiveArtifact and Publish JUnit results steps.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component/s: core, junit-plugin
    • Labels: None
    • Environment: Jenkins v2.35 in Docker container, SSH build slaves, JUnit plugin 1.19

      Per JENKINS-7780 I am filing a new issue as I was able to reproduce the same issue on a current version of Jenkins.

      I am seeing the same issue affecting my build slaves in situations where I have a symlink that points upwards to a directory above my Jenkins workspace. Each build slave has multiple projects being built on it, and each project has 4 of these symlinks. Each project consists of many files (multiple copies of the Linux kernel, multiple root filesystem images, our application code and our build artifacts). Based on strace-ing the JVM, it appears that Jenkins is recursively stat'ing files through these symlinks. Because the symlinks point to a directory that contains all project directories, this results in a HUGE fanout, and eventually the slave disconnects when the JVM is killed, causing all builds on the slave node to fail.

      By process of elimination I was able to determine that both the ArtifactArchiver and the JUnit Publisher steps cause this problem. Both use org.apache.tools.ant.types.FileSet to handle the pattern match/file collection. While this class does detect recursive loops caused by symlinks, it doesn't do so fast enough to prevent the huge fanout that brings down the build nodes.

      Even more interesting: because the project config page seems to "test" the patterns entered into the UI for ArchiveArtifact and Publish JUnit against the workspace, just opening the config page causes the issue to occur and the slave to eventually disconnect.

      I would be happy if there was a way to disable the following of symlinks. I understand that org.apache.tools.ant.types.AbstractFileSet has a followSymlinks member that defaults to true. If I could set this to false via the Jenkins UI, then corner cases like our horrendous repo would be covered without breaking it for those who rely on the symlink-traversing behavior.
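
To illustrate the effect of the requested switch, here is a minimal sketch using only java.nio.file rather than Ant. It is not how Jenkins or Ant's FileSet is implemented; the class name NoFollowScan, the scan method and its followSymlinks parameter are hypothetical names chosen to mirror the option described above. With followSymlinks set to false, a symlink that points above the workspace is seen as a single entry and never descended into.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoFollowScan {
    /**
     * Lists regular files under root whose names end with suffix.
     * When followSymlinks is false, symlinked directories are not descended
     * into and symlinked files are not reported.
     */
    static List<Path> scan(Path root, String suffix, boolean followSymlinks) throws IOException {
        // Options for the directory walk itself.
        FileVisitOption[] walkOpts = followSymlinks
                ? new FileVisitOption[] {FileVisitOption.FOLLOW_LINKS}
                : new FileVisitOption[0];
        // Options for the per-entry "is this a regular file" check.
        LinkOption[] linkOpts = followSymlinks
                ? new LinkOption[0]
                : new LinkOption[] {LinkOption.NOFOLLOW_LINKS};
        try (Stream<Path> paths = Files.walk(root, walkOpts)) {
            return paths
                    .filter(p -> Files.isRegularFile(p, linkOpts))
                    .filter(p -> p.getFileName().toString().endsWith(suffix))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Calling scan(workspace, ".xml", false) on a workspace that contains an upward-pointing symlink would return only the files physically inside the workspace, which is the behavior this issue asks to be able to select.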

          Daniel Beck added a comment -

          Looks like JENKINS-36559?

          Daniel Beck added a comment -

          Also, (genuine) thanks for reading the comments and not reopening!

          Mike Nicholson added a comment -

          danielbeck: Yes, looks just like JENKINS-36559, although that bug doesn't mention the JUnit plugin, which exhibits the exact same behavior.

          Just for reference for anyone affected by this bug: I was able to use the xUnit plugin instead of JUnit to get around the JUnit issues. I was not able to find a solution for the Archive Artifacts issue besides removing the symlink.

          Daniel Beck added a comment -

          I was wrong. JENKINS-36559 is specifically about the field validation falling over (and could be worked around by e.g. a reverse proxy blocking evil URLs). This one is about the actual steps recursing into oblivion.

          Michael Barker added a comment -

          This issue is also affecting us. Our build tool (Buck) creates symlinks that create loops within the directory structure. I think a general fix for the infinite recursion issue would be to track the canonical path currently being traversed and stop if it is revisited. If I were to get this patched in Ant, would updating the Ant version for Jenkins fix this for all of the plugins that use FileSet-based scanning (TAP Results, Checkstyle, etc.)? What version of Ant does Jenkins need to use? I see that it is currently on 1.9.2. Does it need to stay on the 1.9.x line?
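
The canonical-path idea suggested in this comment could look roughly like the sketch below. It is an illustration only, written against java.nio.file and not a patch to Ant's scanner; CanonicalWalk and listFiles are hypothetical names. It assumes that keeping one global set of already-visited real directories is acceptable, which stops loops and also prevents re-scanning a directory reached through a second symlink.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CanonicalWalk {
    /**
     * Returns the regular files reachable from root, following symlinks,
     * but entering each real directory at most once.
     */
    static List<Path> listFiles(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        walk(root, new HashSet<>(), files);
        return files;
    }

    private static void walk(Path dir, Set<Path> visited, List<Path> files) throws IOException {
        // toRealPath() resolves symlinks, so two routes to one directory compare equal.
        Path real = dir.toRealPath();
        if (!visited.add(real)) {
            return; // already traversed: a loop, or a second link to the same place
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {          // follows symlinks by default
                    walk(entry, visited, files);
                } else if (Files.isRegularFile(entry)) { // broken links match neither branch
                    files.add(entry);
                }
            }
        }
    }
}
```

The trade-off in this design is that a directory reachable by several paths is reported under only the first path that reaches it, so patterns that rely on matching through a particular symlink name may miss files.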

          Dominique Martinet added a comment -

          Looks like infinite loops also break things when adding artifacts in the configuration screen, as Jenkins will then check whether the artifact exists in the given path and will get lost in there.

          Behaviour as of the current version is pretty horrible: Jenkins runs out of memory, Java kills some important threads, and Jenkins ends up using 100% of all available cores until someone restarts it. Removing the bad directory lowered CPU usage, but the harm was done and the service needed restarting to recover.

          I'll fix my job to not create such symlink loops (make modules_install for kernel modules makes one), but it'd be great if this could be handled properly...

            Assignee: Unassigned
            Reporter: Mike Nicholson (mikenicholson)
            Votes: 0
            Watchers: 5