Archiving artifacts from a slave node is extremely slow even with 1.509.1+

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: Jenkins 1.515, CentOS 6 64-bit, Java 7 (Oracle)

      I see no change in the speed of archiving artifacts after upgrading to 1.515 (with the slave upgraded to 2.23), which should have applied the fix from JENKINS-7813.
      A 43 MB WAR archives in 102 seconds vs. <2 seconds with SCP. No one on the mailing list confirmed a significant improvement after upgrading. It was suggested that I open a new issue.

          [JENKINS-18276] Archiving artifacts from a slave node is extremely slow even with 1.509.1+

          rggjan added a comment -

          Same problem here; it happens with a Linux Jenkins master and a Mac OS X slave. Archiving ~100 MB of data takes 10 minutes, while copying it manually over the network takes only around 10 seconds...
          Jenkins ver. 1.565.2

          Akihiro KAYAMA added a comment - - edited

          Copying 11 GB of artifacts from a Windows slave to a Linux master takes 30 minutes.
          There is also another critical bug: https://issues.jenkins-ci.org/browse/JENKINS-10629
          I decided to copy artifacts directly to $JENKINS_HOME/jobs/$JOB_NAME/builds/$BUILD_ID/archive/:

          • create a public directory on the Windows slave that allows anonymous read access
          • from a Windows batch script, use ssh (PuTTY's plink) to start a shell script on the Linux master that performs the network copy with smbclient

          Copying 11 GB with smbclient takes only 3 minutes.
          Archive directories in matrix projects have more complicated paths, so this is more bothersome there, but still possible.
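The workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the archive layout is taken from the comment and not verified against Jenkins documentation, the server and share names are hypothetical, and the smbclient invocation (`-N` for no password, `-c` with `recurse`, `prompt`, `lcd`, `mget`) is one plausible way to pull a share anonymously, not the commenter's actual script. Matrix projects are not covered.

```python
from pathlib import PurePosixPath


def archive_dir(env):
    """Build the per-build archive directory on the (Linux) master.

    Layout assumed from the comment above:
    $JENKINS_HOME/jobs/$JOB_NAME/builds/$BUILD_ID/archive/
    """
    return (
        PurePosixPath(env["JENKINS_HOME"])
        / "jobs" / env["JOB_NAME"]
        / "builds" / env["BUILD_ID"]
        / "archive"
    )


def smbclient_fetch_command(server, share, dest):
    """Return an argv list that copies a whole share into dest.

    Assumption: the share allows anonymous read access (-N = no password).
    Paths containing spaces would need extra quoting inside the -c string.
    """
    script = f"recurse ON; prompt OFF; lcd {dest}; mget *"
    return ["smbclient", f"//{server}/{share}", "-N", "-c", script]


if __name__ == "__main__":
    # All values below are hypothetical examples.
    env = {"JENKINS_HOME": "/var/lib/jenkins", "JOB_NAME": "my-job", "BUILD_ID": "42"}
    dest = archive_dir(env)
    print(smbclient_fetch_command("winslave", "artifacts", dest))
```

The script only builds the command; running it on the master (for example via `subprocess.run`) is left out on purpose, since writing into `$JENKINS_HOME` bypasses Jenkins' own archiving.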

          Lars Lykke added a comment -

          I'm seeing the same issue with Jenkins v. 1.581 (Master, Windows 7).
          My slaves are Windows 7/Windows 2012 machines with Jenkins running as a service on them. Retrieving artifacts from an upstream job takes a very long time (approx. 7 min for 750 MB, i.e. ~100 MB/min).
          The slaves are connected using the JNLP agent.
          Archiving artifacts is about the same: 12 min for 2 GB.
          How can I provide additional information which might help in resolving this issue?

          Edgars Batna added a comment -

          I do not even know where to start. On our build machines (similar setup) Jenkins has just finished a build and has been sitting there doing absolutely nothing for the last 1h, so I looked up the issue online. Hope for improvements.

          Daniel Beck added a comment -

          Does this issue still occur on recent Jenkins versions? If so, what kind of slave (SSH, JNLP, …) is affected? I think there was a performance fix a few months ago…

          Lars Lykke added a comment -

          I just upgraded to 1.638 with a Win7 master and Win2012 slaves connected through JNLP. The slaves are running Jenkins as a Windows service.
          All machines run as virtual machines with non-dedicated disks. However, I would still assume the network speed to be the bottleneck no matter what?

          If I archive artifacts (3.95 GB) from my build job, it takes approx. 15 min 30 sec = 4.32 MBps. The artifacts consist of approx. 4,100 files.

          Copying artifacts from the master node (778 MB) takes approx. 8 min 50 sec = 1.47 MBps. The artifacts copied consisted of approx. 2,770 files.

          I've not done any elaborate statistical sampling, but the above is more or less the normal time for copying artifacts.
          If I take a project with few files, I get 350 MB transferred in 17 sec, resulting in 20.6 MBps.

          So the slow transfer times in my case seem to be related to the transfer of many small files, whereas transferring a few large files seems to go sufficiently fast.
          Would it make sense to zip artifacts prior to archiving? That sort of goes against the 'copy artifacts' step, where it's possible to select/deselect files based on a pattern. I don't assume this would be possible if I zip all the files prior to archiving.
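The figures in the comment above can be rechecked with a few lines of arithmetic. This is a minimal sketch, assuming 3.95 GB means 3950 MB (so the first result differs slightly from the 4.32 MBps quoted); the per-file column is an added illustration, not something the commenter computed.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Average transfer rate in MB/s."""
    return megabytes / seconds


def seconds_per_file(seconds, file_count):
    """Average wall-clock time spent per transferred file."""
    return seconds / file_count


# (label, size in MB, duration in seconds, file count or None if not reported)
measurements = [
    ("archive from build job", 3950, 15 * 60 + 30, 4100),
    ("copy from master", 778, 8 * 60 + 50, 2770),
    ("project with few files", 350, 17, None),
]

for label, mb, secs, files in measurements:
    line = f"{label}: {throughput_mb_per_s(mb, secs):.2f} MB/s"
    if files is not None:
        line += f", {seconds_per_file(secs, files) * 1000:.0f} ms per file"
    print(line)
```

Both slow transfers come out at roughly 0.2 seconds per file despite very different MB/s rates, which is consistent with (but does not prove) a fixed per-file overhead dominating the transfer time.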

          Lars Lykke added a comment -

          I've introduced a zip/unzip step in our build process, which has HALVED the time spent on check-out and compile.

          Our steps consist of check-out and archiving of 11,500 files (approx. 350 MB) and subsequent compilation in a separate job. By zipping the files prior to archiving and unzipping them after copying artifacts, we've managed to reduce the time spent checking out and compiling from 10 min (check-out/archiving/fingerprinting) + 30 min (copying, compiling, archiving) to 4 min + approx. 15 min.

          I don't know if this is Windows specific, but it might be relevant to make zipping an option of the copy-artifacts and archiving functionality, selectable through check-boxes, to reduce the number of steps involved. I've used PowerShell to zip and unzip the files in between the jobs.
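The zip-before-archive step described above might look like the following. This is a sketch only: the commenter used PowerShell, and the Python standard library is swapped in here for illustration; the function names and directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def bundle(src_dir, archive_base):
    """Zip everything under src_dir into '<archive_base>.zip'.

    archive_base must live outside src_dir, otherwise the archive
    would try to include itself.
    """
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(src_dir)))


def unbundle(archive_path, dest_dir):
    """Unpack a zip produced by bundle() into dest_dir."""
    shutil.unpack_archive(str(archive_path), str(dest_dir), "zip")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace"
        workspace.mkdir()
        # Simulate a checkout made of many small files.
        for i in range(100):
            (workspace / f"file_{i}.txt").write_text(f"content {i}\n")

        # Upstream job: archive a single zip instead of 100 files.
        zip_path = bundle(workspace, Path(tmp) / "artifacts")

        # Downstream job: unpack after copying the single artifact.
        restored = Path(tmp) / "restored"
        unbundle(zip_path, restored)
        print(len(list(restored.iterdir())), "files restored from", zip_path.name)
```

The trade-off raised earlier in the thread still applies: once everything is in one zip, pattern-based selection in the copy step can only pick the whole archive.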

          Carl van Schaik added a comment - - edited

          With Linux slaves, our builds generate large artifacts (2 GB+ after compression with XZ). Copying these to and from builds obviously takes a while; the main reason seems to be that Jenkins archives via its control channel (e.g. ssh slave, using the Java SSH implementation JSCH). The Java SSH just can't get anywhere near the 1 Gb/s network speed that native SSH manages easily.

          I don't think this plugin will ever be able to provide suitable speeds without a major redesign to support alternative copy methods, e.g. native SCP, SSL, etc., where supported by the slave/master. If going to that effort, I would also highly recommend separating the master node from the archive storage node; e.g. we started with Jenkins in a small VM, and it now requires one with 300+ GB of storage and can only keep the last 48 hours of artifacts.
          Delta compression is useless for us as well; it only works on certain types of data.

          Tim Black added a comment - - edited

          navlrac, can you clarify what you mean by "this plugin will never be able to provide suitable speeds..."? My understanding is that this bug is against Jenkins core. Do you mean the ssh-slaves-plugin (https://github.com/jenkinsci/ssh-slaves-plugin)?

          Also, has anyone achieved a reasonable workaround for archiving artifacts on jenkins agents?

          We're using ssh-slaves-plugin for our Linux build nodes, and experiencing abysmal agent-master throughput, despite their 10Gbps link. We could replace our usage of `archiveArtifacts` steps with a custom groovy call to use system scp to master, but I'd of course rather not have to reinvent this wheel. I found the publish-over-ssh plugin (https://github.com/jenkinsci/publish-over-ssh-plugin), but it doesn't seem to be maintained anymore.

          FWIW, we're using Debian Buster, Jenkins 2.222.3, and this java version on all master/agents:

          jenkins@jenkins-testing-agent-2:~$ java --version
          openjdk 11.0.8 2020-07-14
          OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)
          OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Debian-1deb10u1, mixed mode, sharing)

          Update: I created this Improvement issue over in ssh-slaves-plugin component, in hopes of getting some more ssh-specific eyes on this issue: https://issues.jenkins-ci.org/browse/JENKINS-63517
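The "use system scp to master" idea mentioned in the comment above could be prototyped along these lines. This is a hedged sketch in Python rather than the Groovy call the commenter had in mind; the user, host, and destination directory are hypothetical, and it assumes the agent already has key-based SSH access to the master.

```python
import shlex
import subprocess


def scp_command(local_path, user, host, remote_dir):
    """Build an argv list for the native scp client.

    -r copies directories recursively, -p preserves modification times.
    """
    return ["scp", "-r", "-p", str(local_path), f"{user}@{host}:{remote_dir}"]


def copy_with_scp(local_path, user, host, remote_dir):
    """Run scp and raise CalledProcessError if it exits non-zero."""
    subprocess.run(scp_command(local_path, user, host, remote_dir), check=True)


if __name__ == "__main__":
    # Hypothetical values; print the command instead of running it.
    cmd = scp_command("build/output", "jenkins", "jenkins-master.example.com",
                      "/srv/artifacts/my-job/42/")
    print(shlex.join(cmd))
```

Files copied this way are not registered as build artifacts by Jenkins, so fingerprinting and the Copy Artifact step would not see them.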

          Ranjit Vadakkan added a comment -

          Master: Windows running 2.249.2

          Slave: macOS 10.13.6

          The slave was launched using JNLP.

          While investigating why the 'Copy Artifact' plugin was taking 12 minutes to copy about 250 MB from the master to the slave, I incorrectly assumed that the plugin itself was slow.

          I realized this the hard way, by reconfiguring my jobs to use Azure Storage so that the master uploaded to Azure and the slave downloaded from Azure. Before I embarked on the Azure Storage alternative, I ran a test on the Mac by downloading the necessary file from Azure: 250 MB in about 20 seconds. But when I finally ran the job, the download from Azure took about 13 minutes!

          This is definitely not resolved.

            Assignee: Unassigned
            Reporter: Marcin Zajączkowski (emszpak)
            Votes: 36
            Watchers: 43