Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-71217

Optimize checkout performance issue

XMLWordPrintable

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Major Major
    • kubernetes-plugin
    • Java version: JDK 11
      Jenkins version: 2.375.2
      Remoting version: 4.13

      How can I improve the checkout performance if dynamic Jenkins agent runs inside Kubernetes cluster in poor network?

      The git repo is very small, some text files with only one commit. It just takes several seconds on permanent agent, so the checkout time becomes to the issue.

       

      The 'Declarative: Checkout SCM' step is slow if I run Jenkins job in a dynamic Jenkins agent created in a local Kubernetes cluster which has poor network.

       

      Jenkins job configuration is attached.

       

      Component version:

      Jenkins version: 2.375.2

      Plugins:

      git: 5.0.0

      git-client: 4.0.0

      kubernetes: 3743.v1fa_4c724c3b_7

      ssh-agent version: 295.v9ca_a_1c7cc3a_a_

      ssh-slave version: 2.854.v7fd446b_337c9

       

      Environment:

      Jenkinsfile: https://github.com/zhan9san/test-checkout/blob/main/Jenkinsfile

      Jenkins is running on an EC2 server.

      Kubernetes cluster-1 is an EKS instance in AWS

      Kubernetes cluster-2 is a local Kubernetes cluster in Office

      The network between the Jenkins server and cluster-1 is good while Jenkins server to cluster-2 is not good

      The same pod template with only one container `jnlp` is configured in Jenkins cloud configuration for both cluster-1 and cluster-2.

      One worker node in cluster-02  is configured as a permanent Jenkins agent

      They are configured with different labels

       

      How long 'Declarative: Checkout SCM' would take in different test?

       

      Test 1: Run Jenkins job in cluster-1

      It takes about less 10 seconds

      Test 2: Run Jenkins job in cluster-2 without cached jarCache

      It takes about 6 min 10s

      Test 3: Run Jenkins job in cluster-2 with cached jarCache

      It takes about 4 min 50s

      Test 4: Run Jenkins job in permanent Jenkins agent

      It takes about 3 min first time(It may retrieve jar file to agent)

      It takes about 9 s second time

       

      How the jarCache is cached in cluster?

      There is an NFS storing the cached jar files, and a PVC is mounted in jnlp container. And `-jar-cache /home/jenkins/jarCache` is passed to `jenkins-agent` shell script. I use `inotifywait -m -r /home/jenkins/jarCache` to monitor the file change, no `tmp` jar files are created, so I think no retrieving operation occurs.

       

      Here is my conjecture.

      The time is spent on calculating the md5 value of jar file and update the last modified time of existing jar files. Would it possible to not update the last modified time of jar files to reduce the time?

      If I understand correctly, once the jar files are cached, there is no need to retrieve them again. But `remoting` still update the last modified time via `Files.setLastModifiedTime` in `FileSystemJarCache.java`

       

      Here is what I tested.

      I run `inotifywait -m -r /home/jenkins/jarCache` inside jnlp container and notice that there are much more file operations if the pipeline console log is blocked at '[Pipeline] checkout'

      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/24/ OPEN EB8D09A06BF01720885977D998797A.jar
      /home/jenkins/jarCache/24/ ATTRIB EB8D09A06BF01720885977D998797A.jar
      /home/jenkins/jarCache/24/ CLOSE_NOWRITE,CLOSE EB8D09A06BF01720885977D998797A.jar
      /home/jenkins/jarCache/24/ ACCESS EB8D09A06BF01720885977D998797A.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ATTRIB 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ CLOSE_NOWRITE,CLOSE 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ ACCESS 674E7818836879CAD5D217139F4228.jar
      /home/jenkins/jarCache/B2/ OPEN 674E7818836879CAD5D217139F4228.jar

       

      But it's weird that if the job runs on the permanent Jenkins agent(one worker node in cluster-2), it is still much faster than it runs in cluster-2.

       

      Besides I tried to run checkout explicitly inside Jenkinsfile as below in a dynamic jenkins agent in local cluster.

      `checkout scmGit(branches: [[name: '*/main']], extensions: [], userRemoteConfigs: [[url: 'https://github.com/zhan9san/test-checkout.git']])`

      It just takes about 1 min 29 seconds(13:39:49-13:41:18)

      Running on tcpdump-kg8sf in /home/jenkins/agent/workspace/Playground/jack-test/test-agent-label
      [Pipeline] {
      [Pipeline] timestamps
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (Init)
      [Pipeline] checkout
      13:39:49  No credentials specified
      13:39:49  Cloning the remote Git repository
      13:40:44  Cloning repository https://github.com/zhan9san/test-checkout.git
      13:40:45   > git init /home/jenkins/agent/workspace/Playground/jack-test/test-agent-label # timeout=10
      13:40:46  Fetching upstream changes from https://github.com/zhan9san/test-checkout.git
      13:40:46   > git --version # timeout=10
      13:40:46   > git --version # 'git version 2.25.1'
      13:40:46   > git fetch --tags --force --progress -- https://github.com/zhan9san/test-checkout.git +refs/heads/*:refs/remotes/origin/* # timeout=10
      13:40:47   > git config remote.origin.url https://github.com/zhan9san/test-checkout.git # timeout=10
      13:40:47   > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
      13:41:05  Avoid second fetch
      13:41:05  Checking out Revision 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 (refs/remotes/origin/main)
      13:41:05   > git rev-parse refs/remotes/origin/main^{commit} # timeout=10
      13:41:06   > git config core.sparsecheckout # timeout=10
      13:41:06   > git checkout -f 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 # timeout=10
      13:41:17  Commit message: "Create Jenkinsfile"
      13:41:18   > git rev-list --no-walk 5d4b69dd8f4e9987a0cfe4ff233a63b266b122c0 # timeout=10
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] stage
      [Pipeline] { (Hello)
      [Pipeline] script 

       

            Unassigned Unassigned
            jack_zhang Jack
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated: