-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
Jenkins 2.263.1
ThinBackup 1.10
Excluding .*xml still backs up some, not all, xml files.
Including .*logs.* does not back up the logs directory nor any files with logs in the name.
robot_acct@jenkins-controller:~/backup/FULL-2020-12-21_05-04$ tree .
.
├── installedPlugins.xml
└── jobs
    ├── adder
    │   └── config.xml
    └── subtractor
        └── config.xml
[JENKINS-64490] ThinBackup include/exclude regex doesn't work
Summary | Original: include/exclude regex doesn't work | New: ThinBackup include/exclude regex doesn't work |
Attachment | New: image-2020-12-20-21-18-00-953.png [ 53701 ] |
Attachment | Original: image-2020-12-20-21-16-35-505.png [ 53700 ] |
I've read through the code and understand the problem now.
The plugin uses org.apache.commons.io.FileUtils.copyDirectory, which works on java.io.File and takes a filter built from the regex that the user supplies. The problem is that copyDirectory does not apply that filter the way I expected.
Suppose this is our source directory:
When I provide the regex (.*\/)?logs\/.*, I expect it to be compared against the absolute path of each file (e.g., /var/jenkins_home/logs/slaves/slave0/slave.log, which would match the regex). That is not at all what copyDirectory does.
copyDirectory first uses java.io.File to list the top-level items and saves them to a list.
Then it compares the name of each item in the list against the filter. Since neither name matches the regex (.*\/)?logs\/.*, nothing is copied.
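To make the mismatch concrete, here is a minimal sketch that uses only java.util.regex. It checks the same regex against the full path quoted above and against the bare top-level names logs and log. The class and method names are made up for illustration and are not part of ThinBackup or Commons IO.

```java
import java.util.regex.Pattern;

// Illustrative only: compares matching on a full path with matching on a bare name.
class NameVersusPath {
    // The regex from the comment above, written as a Java string literal.
    static final Pattern PATTERN = Pattern.compile("(.*\\/)?logs\\/.*");

    // True only if the whole candidate string matches the regex.
    static boolean matches(String candidate) {
        return PATTERN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // Full path: matches, as I expected.
        System.out.println(matches("/var/jenkins_home/logs/slaves/slave0/slave.log")); // true
        // Bare names, which is all a name-based filter gets to see: no match.
        System.out.println(matches("logs")); // false
        System.out.println(matches("log"));  // false
    }
}
```

The bare name logs can never match because the regex requires a trailing "logs/" segment, which only exists in a path.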
Let's suppose we set the regex to logs.* in the hope of capturing everything inside the logs directory.
copyDirectory first lists the top-level items and saves them to a list.
logs matches the regex, so it is saved to a "matched" list. log does not match, so it is not saved to the matched list.
matchedList = ['logs']
Since logs is a directory, the names of its children are saved to evalList.
Neither name matches the regex logs.*, so neither is added to matchedList. Since logs is a directory but none of its children matched, it is treated as an empty directory and is not copied.
As a result, nothing is copied.
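The walk described above can be modelled in a few lines. This is a sketch of the behaviour as I understand it, not the Commons IO source: it uses an in-memory tree instead of real files, and the layout (logs/slaves/slave0/slave.log, an empty logs/tasks, and a top-level log) is an assumption pieced together from the names mentioned in this comment. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative model: the filter sees only the entry NAME at each directory level.
class NameFilterWalk {
    // A directory is a map of child name -> child node; a file is stored as a null value.
    static Map<String, Object> dir() {
        return new LinkedHashMap<>();
    }

    // Assumed sample layout (hypothetical, see the lead-in).
    static Map<String, Object> sampleTree() {
        Map<String, Object> slave0 = dir();
        slave0.put("slave.log", null);
        Map<String, Object> slaves = dir();
        slaves.put("slave0", slave0);
        Map<String, Object> logs = dir();
        logs.put("slaves", slaves);
        logs.put("tasks", dir());
        Map<String, Object> root = dir();
        root.put("logs", logs);
        root.put("log", dir());
        return root;
    }

    @SuppressWarnings("unchecked")
    static void walk(Map<String, Object> directory, String prefix,
                     Pattern filter, List<String> copied) {
        for (Map.Entry<String, Object> child : directory.entrySet()) {
            String name = child.getKey();
            if (!filter.matcher(name).matches()) {
                continue; // rejected by name: a directory rejected here is never descended into
            }
            String path = prefix + name;
            if (child.getValue() == null) {
                copied.add(path); // a file whose name passed the filter
            } else {
                walk((Map<String, Object>) child.getValue(), path + "/", filter, copied);
            }
        }
    }

    // Returns the files that would be copied from the sample tree for the given regex.
    static List<String> copiedFiles(String regex) {
        List<String> copied = new ArrayList<>();
        walk(sampleTree(), "", Pattern.compile(regex), copied);
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(copiedFiles("(.*\\/)?logs\\/.*")); // []
        System.out.println(copiedFiles("logs.*"));            // []
    }
}
```

Both regexes end up with an empty result, for different reasons: the first never gets past the top level, and the second enters logs but rejects every child name.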
To match all files in the logs directory, the regex has to be logs|slave.*|tasks|.*\.log. Besides being unintuitive, the regex has to list every directory name as well as the file patterns in a single pattern. At that point, the regex becomes too broad and starts including unintended files.
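A quick check of how broad that pattern is when it is applied to bare names. The names slave-setup.sh and audit.log are hypothetical examples of files that could live in a directory the pattern reaches; they are not taken from a real Jenkins home.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: shows which bare names the combined pattern accepts.
class BroadPatternCheck {
    static final Pattern PATTERN = Pattern.compile("logs|slave.*|tasks|.*\\.log");

    // Returns the names that fully match the pattern, in input order.
    static List<String> accepted(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (PATTERN.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Intended names along the path logs/slaves/slave0/slave.log: all accepted.
        System.out.println(accepted(List.of("logs", "slaves", "slave0", "slave.log", "tasks")));
        // Hypothetical unrelated names: the first two are accepted as well.
        System.out.println(accepted(List.of("slave-setup.sh", "audit.log", "config.xml", "log")));
    }
}
```

Because the filter only sees names, it cannot tell slave.log under logs/slaves/slave0 apart from any other entry whose name starts with slave or ends with .log.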
I've read through the function signatures in FileUtils, but I don't see any that avoid applying the filter to each file name. I think the only way to make it work the way I, at least, find intuitive would be to move away from the Apache Commons library, which would be a huge task. In fact, underneath it is java.io.File that applies the filter this way, so I don't think file filters can be used to match the regex against the absolute (or even relative) path of the file.
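For what it's worth, here is a rough sketch of the behaviour I find intuitive, written against java.nio.file instead of Commons IO. It walks the whole tree and matches the regex against each file's path relative to the root. It only selects files and does not copy anything, it is not how ThinBackup works today, and the class, method names, and the sample layout in demo() are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: select files by matching the regex against the relative PATH.
class PathRegexWalk {
    // Returns regular files under root whose '/'-separated relative path matches the regex.
    static List<String> matchingFiles(Path root, String regex) {
        Pattern pattern = Pattern.compile(regex);
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    // Normalise separators so the same regex works on Windows too.
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    .filter(rel -> pattern.matcher(rel).matches())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small assumed layout in a temp directory (left behind for inspection).
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("thinbackup-sketch");
            Files.createDirectories(root.resolve("logs/slaves/slave0"));
            Files.createFile(root.resolve("logs/slaves/slave0/slave.log"));
            Files.createDirectories(root.resolve("jobs/adder"));
            Files.createFile(root.resolve("jobs/adder/config.xml"));
            return matchingFiles(root, "(.*\\/)?logs\\/.*");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [logs/slaves/slave0/slave.log]
    }
}
```

With path-based matching, the original regex (.*\/)?logs\/.* selects the log file without having to name every intermediate directory.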