Bug
Resolution: Unresolved
Major
None
Hello, we have a lot of jobs running on our Jenkins, and each job (mostly multi-branch pipelines) uses multiple libraries. The libraries are downloaded from GitHub and are all configured to use the same default version. We now face the issue that every time a build is started, each library seems to be re-downloaded from GitHub into the individual build directory, for example:
/var/jenkins_home/jobs/<name>/branches/main/builds/46/libs/...
/var/jenkins_home/jobs/<name>/branches/main/builds/47/libs/...
...
As some libraries come with hundreds of files, and we have hundreds of jobs that each run hundreds of builds, this results in literally millions of duplicated files on our Jenkins, clogging up the whole file system. This happens even if the version of the library in question does not change between builds: the same library is downloaded all over again for every build.
Is there a way to prevent this massive duplication of files between builds, for example by cloning the library to a central folder on Jenkins instead of into the individual build directories? Thanks!
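To put a number on the duplication described above, a small script can hash every file under the per-build `libs` directories of one job and count how many are byte-identical copies. This is only a minimal sketch in Python: the `builds/<n>/libs/` layout is taken from the paths quoted above, while the function names `file_digest` and `duplicate_report` are hypothetical and not part of Jenkins or any plugin.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_report(job_root):
    """Summarize identical files under <job_root>/builds/<n>/libs/.

    Assumption: every build keeps its library copy in a "libs" directory,
    as in the paths quoted in this issue.
    """
    sizes_by_digest = defaultdict(list)  # content hash -> size of each copy
    builds_dir = os.path.join(job_root, "builds")
    for build in sorted(os.listdir(builds_dir)):
        libs_dir = os.path.join(builds_dir, build, "libs")
        if not os.path.isdir(libs_dir):
            continue  # skip non-build entries and builds without libraries
        for dirpath, _dirnames, filenames in os.walk(libs_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                sizes_by_digest[file_digest(path)].append(os.path.getsize(path))

    all_sizes = list(sizes_by_digest.values())
    return {
        "total_files": sum(len(sizes) for sizes in all_sizes),
        "unique_files": len(all_sizes),
        # Every copy beyond the first one with the same content is redundant.
        "redundant_files": sum(len(sizes) - 1 for sizes in all_sizes),
        "redundant_bytes": sum(sum(sizes[1:]) for sizes in all_sizes),
    }
```

Calling `duplicate_report("/var/jenkins_home/jobs/<name>/branches/main")` with a real job name filled in would return the counts for that branch. The script only reads files; it does not remove or link anything.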
relates to:
- JENKINS-70870 Save libraries as JAR files rather than unpacked (In Review)
- JENKINS-38992 Cache libraries specified by a permanent revision string (Resolved)
For purposes of resuming after a restart, and for Replay, it is necessary to somehow keep a copy of the exact version of the library used in that particular build.
Some filesystems can automatically deduplicate.
Currently the LibraryRetriever interface forces libraries to be downloaded to disk. It would be nice on supported SCMs like GitHub to retrieve source files on demand via HTTP(S) API (“lightweight checkout”), so that the build record would only need to keep the commit hash or similar.
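As a rough illustration of the on-demand idea, the sketch below fetches a single source file at a fixed commit through the GitHub REST "repository contents" endpoint, so that only the commit hash would need to be recorded. It is written in Python purely for illustration and is not how `LibraryRetriever` or any Jenkins plugin works; the helper names `contents_url` and `fetch_source`, and the owner, repository, and path values in the usage note, are hypothetical.

```python
import urllib.parse
import urllib.request

GITHUB_API = "https://api.github.com"


def contents_url(owner, repo, path, commit):
    """Build the contents-endpoint URL for one file at one commit."""
    quoted_path = urllib.parse.quote(path.lstrip("/"))  # keeps "/" separators
    query = urllib.parse.urlencode({"ref": commit})
    return f"{GITHUB_API}/repos/{owner}/{repo}/contents/{quoted_path}?{query}"


def fetch_source(owner, repo, path, commit, token=None):
    """Download the raw text of a file as it existed at the given commit.

    Assumption: the file is UTF-8 text and small enough for this endpoint.
    """
    request = urllib.request.Request(contents_url(owner, repo, path, commit))
    # Ask for the raw file body instead of the default JSON envelope.
    request.add_header("Accept", "application/vnd.github.raw+json")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_source("example-org", "example-lib", "vars/deploy.groovy", commit)` would return that file's text for the recorded commit. Because a commit hash is immutable, the same request yields the same content later, provided the repository and commit are still reachable, which is the trade-off against keeping a local copy.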
JENKINS-38992 introduced caching, but I guess it still makes copies.