JENKINS-36559: File pattern (e.g. archive) field validation can run forever

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component: core

      Every few days, the Jenkins UI hangs and becomes totally unresponsive, i.e. "Jenkins is not picking up HTTP requests, so all attempts to connect hang and no page is ever rendered."
      It doesn't seem to be connected to any particular job being started.

      I couldn't get a thread dump from the UI or from jstack, only from jstack -F. I could get the jmap -heap output, but not a full heap dump. Both outputs are attached. If I read them correctly, we are in a GC spiral.

      Is there anything I can do to help debug this?

        1. config.xml (1 kB)
        2. jconsole.png (45 kB)
        3. jhat.PNG (225 kB)
        4. jmap.out (1 kB)
        5. jstack.out (170 kB)


          Olaf Lenz added a comment -

          Apparently, this is a heap memory overflow.
          Within a few minutes after start, jmap -heap shows that the Eden space becomes completely filled.

          I have updated to 2.13, but that didn't help.

          Olaf Lenz added a comment -

          It looks as though the memory overflow happens when I start editing a job. The attached screenshot shows the jconsole memory tab.
          Jenkins had already run for about an hour without any trouble.
          At 11:34, I started editing one of the job configurations. After about 2 minutes, memory was full and GC couldn't recover any of it, so Jenkins hung.

          Olaf Lenz added a comment -

          More information: it looks as though the problem only happens when I want to edit one particular job. I have attached the job's XML.

          Olaf Lenz added a comment -

          I have managed to create a heap dump shortly before Jenkins hangs. Unfortunately, the heap dump is rather big (4.6 GB), so I'd rather not attach it here.

          Daniel Beck added a comment -

          This is probably caused by the recursive validation of a file name pattern in the job config, which results in DirectoryScanner.scandir invocations.

          These should be bounded by hudson.FilePath.VALIDATE_ANT_FILE_MASK_BOUND (10k by default), but the bound may not work as expected. Does this issue disappear with a suitably low value configured as described here? https://wiki.jenkins-ci.org/display/JENKINS/Features+controlled+by+system+properties

          Olaf Lenz added a comment -

          I will test that. Now that you mention it, I have seen hundreds of instances of this message in the jenkins log:

          skipping symbolic link /workspaces/socvm459/coverity/ws-r8.0-coverity/opt/93000/src/hp83000/hp83000/hp83000/hp83000/hp83000/test/devices/offline/recursiveLink1/recursiveLink1/recursiveLink1/recursiveLink2/recursiveLinkDir/recursiveLink3/recursiveLink2/recursiveLink2/recursiveLink1/recursiveLink1/recursiveLink2/recursiveLinkDir/recursiveLink3/recursiveLink2/recursiveLinkDir/recursiveLink3/recursiveLink2 -- too many levels of symbolic links.
          

          Olaf Lenz added a comment -

          Setting the property on the command line doesn't help, even when setting it to 100.
          I also tested it with Jenkins 2.7.1, but it has the same problem.

          Olaf Lenz added a comment - - edited

          I have created a simple job that reproduces the behaviour and attached the config.xml. It creates a subdir that contains a recursive link, plus 1000 empty files, and archives the files. This leads to a memory overflow within a few minutes, apparently independent of hudson.FilePath.VALIDATE_ANT_FILE_MASK_BOUND.
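
For illustration, the workspace layout described above could be created as in the following minimal sketch. The attached config.xml is not reproduced here, so the names `make_repro_tree`, `subdir`, `loop`, and the file naming scheme are hypothetical, and a POSIX system where symlink creation is permitted is assumed.

```python
import os
import tempfile


def make_repro_tree(root, n_files=1000):
    """Create a subdir holding a self-referencing symlink plus n_files empty files.

    All names here are hypothetical; only the shape (one recursive link,
    many empty files) follows the description in the comment.
    """
    subdir = os.path.join(root, "subdir")
    os.makedirs(subdir)
    # "loop" points back at its own directory, so subdir/loop/loop/... resolves
    # to subdir again until the OS symlink depth limit is hit.
    os.symlink(".", os.path.join(subdir, "loop"))
    for i in range(n_files):
        # Opening in write mode and closing immediately leaves an empty file.
        open(os.path.join(subdir, f"file{i:04d}.txt"), "w").close()
    return subdir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        subdir = make_repro_tree(root)
        print(len(os.listdir(subdir)), "entries in", subdir)
```

A scanner that follows the `loop` link sees the same files again at every level, which is the situation the job sets up before archiving.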

          It looks as though it is crucial that the directory also contains a lot of files. My diagnosis would be that Jenkins tries to build a catalogue of files recursively. If there are very few files in the directory, memory won't overflow because the limit on scanDir calls is reached quickly, but if there are many files, each scanDir call is large.
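
The hypothesis above can be made concrete with a toy model. This is only a sketch of the reasoning, not of Ant's DirectoryScanner or of Jenkins' validation code: it assumes every directory holds the same files plus some links back to itself, that the scan is bounded by a number of directory listings, and that every listed file path is retained. The function name `simulate_scan` and all of its parameters are hypothetical.

```python
from collections import deque


def simulate_scan(files_per_dir, links_per_dir, max_link_depth, scan_bound):
    """Return (listings performed, file paths collected) for a bounded scan.

    Toy model: each directory contains files_per_dir files and
    links_per_dir symlinks that lead back to the same directory.
    Links are followed until max_link_depth, and the whole scan stops
    after scan_bound directory listings.
    """
    scans = 0
    collected = 0  # number of path strings the scanner would keep in memory
    queue = deque([0])  # link depths of directories waiting to be listed
    while queue and scans < scan_bound:
        depth = queue.popleft()
        scans += 1
        collected += files_per_dir
        if depth < max_link_depth:
            # Every link opens another "directory" one level deeper.
            queue.extend([depth + 1] * links_per_dir)
    return scans, collected


if __name__ == "__main__":
    for files in (2, 1000):
        scans, collected = simulate_scan(files, 3, 40, 10_000)
        print(f"{files} files per dir: {scans} listings, {collected} paths held")
```

In this model the bound caps the number of listings, while the number of retained paths still grows in proportion to the files per directory. It illustrates the hypothesis only; it does not explain why lowering the bound to 100 had no visible effect.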

          Olaf Lenz added a comment -

          Note that the config I have attached doesn't exactly reproduce the issue, as it only crashes Jenkins when building the job. The actual problem is worse in that it crashes Jenkins when trying to configure the job.

          Daniel Beck added a comment -

          It should be enough to remove the archiving, build once to get the files, then configure the job to add the archiving again. Pattern validation will then result in an OOM.

            danielbeck Daniel Beck
            olenz Olaf Lenz