-
Improvement
-
Resolution: Unresolved
-
Major
-
Java version: JDK 11
Jenkins version: 2.375.2
Remoting version: 4.13
How can I improve the checkout performance if dynamic Jenkins agent runs inside Kubernetes cluster in poor network?
The git repo is very small, some text files with only one commit. It just takes several seconds on permanent agent, so the checkout time becomes to the issue.
The 'Declarative: Checkout SCM' step is slow if I run Jenkins job in a dynamic Jenkins agent created in a local Kubernetes cluster which has poor network.
Jenkins job configuration is attached.
Component version:
Jenkins version: 2.375.2
Plugins:
git: 5.0.0
git-client: 4.0.0
kubernetes: 3743.v1fa_4c724c3b_7
ssh-agent version: 295.v9ca_a_1c7cc3a_a_
ssh-slave version: 2.854.v7fd446b_337c9
Environment:
Jenkinsfile: https://github.com/zhan9san/test-checkout/blob/main/Jenkinsfile
Jenkins is running on an EC2 server.
Kubernetes cluster-1 is an EKS instance in AWS
Kubernetes cluster-2 is a local Kubernetes cluster in Office
The network between the Jenkins server and cluster-1 is good while Jenkins server to cluster-2 is not good
The same pod template with only one container `jnlp` is configured in Jenkins cloud configuration for both cluster-1 and cluster-2.
One worker node in cluster-02 is configured as a permanent Jenkins agent
They are configured with different labels
How long 'Declarative: Checkout SCM' would take in different test?
Test 1: Run Jenkins job in cluster-1
It takes about less 10 seconds
Test 2: Run Jenkins job in cluster-2 without cached jarCache
It takes about 6 min 10s
Test 3: Run Jenkins job in cluster-2 with cached jarCache
It takes about 4 min 50s
Test 4: Run Jenkins job in permanent Jenkins agent
It takes about 3 min first time(It may retrieve jar file to agent)
It takes about 9 s second time
How the jarCache is cached in cluster?
There is an NFS storing the cached jar files, and a PVC is mounted in jnlp container. And `-jar-cache /home/jenkins/jarCache` is passed to `jenkins-agent` shell script. I use `inotifywait -m -r /home/jenkins/jarCache` to monitor the file change, no `tmp` jar files are created, so I think no retrieving operation occurs.
Here is my conjecture.
The time is spent on calculating the md5 value of jar file and update the last modified time of existing jar files. Would it possible to not update the last modified time of jar files to reduce the time?
If I understand correctly, once the jar files are cached, there is no need to retrieve them again. But `remoting` still update the last modified time via `Files.setLastModifiedTime` in `FileSystemJarCache.java`
Here is what I tested.
I run `inotifywait -m -r /home/jenkins/jarCache` inside jnlp container and notice that there are much more file operations if the pipeline console log is blocked at '[Pipeline] checkout'
/home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/24/ OPEN EB8D09A06BF01720885977D998797A.jar /home/jenkins/jarCache/24/ ATTRIB EB8D09A06BF01720885977D998797A.jar /home/jenkins/jarCache/24/ CLOSE_NOWRITE,CLOSE EB8D09A06BF01720885977D998797A.jar /home/jenkins/jarCache/24/ ACCESS EB8D09A06BF01720885977D998797A.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
But it's weird that if the job runs on the permanent Jenkins agent(one worker node in cluster-2), it is still much faster than it runs in cluster-2.
Besides I tried to run checkout explicitly inside Jenkinsfile as below in a dynamic jenkins agent in local cluster.
`checkout scmGit(branches: [[name: '*/main']], extensions: [], userRemoteConfigs: [[url: 'https://github.com/zhan9san/test-checkout.git']])`
It just takes about 1 min 29 seconds(13:39:49-13:41:18)
Running on tcpdump-kg8sf in /home/jenkins/agent/workspace/Playground/jack-test/test-agent-label [Pipeline] { [Pipeline] timestamps [Pipeline] { [Pipeline] stage [Pipeline] { (Init) [Pipeline] checkout 13:39:49 No credentials specified 13:39:49 Cloning the remote Git repository 13:40:44 Cloning repository https://github.com/zhan9san/test-checkout.git 13:40:45 > git init /home/jenkins/agent/workspace/Playground/jack-test/test-agent-label # timeout=10 13:40:46 Fetching upstream changes from https://github.com/zhan9san/test-checkout.git 13:40:46 > git --version # timeout=10 13:40:46 > git --version # 'git version 2.25.1' 13:40:46 > git fetch --tags --force --progress -- https://github.com/zhan9san/test-checkout.git +refs/heads/*:refs/remotes/origin/* # timeout=10 13:40:47 > git config remote.origin.url https://github.com/zhan9san/test-checkout.git # timeout=10 13:40:47 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 13:41:05 Avoid second fetch 13:41:05 Checking out Revision 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 (refs/remotes/origin/main) 13:41:05 > git rev-parse refs/remotes/origin/main^{commit} # timeout=10 13:41:06 > git config core.sparsecheckout # timeout=10 13:41:06 > git checkout -f 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 # timeout=10 13:41:17 Commit message: "Create Jenkinsfile" 13:41:18 > git rev-list --no-walk 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 # timeout=10 [Pipeline] } [Pipeline] // stage [Pipeline] stage [Pipeline] { (Hello) [Pipeline] script
- relates to
-
JENKINS-67829 JarLoaderImpl is per connection in a controller -causing repeated checking of jars checksums
- Open