• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • ssh-slaves-plugin
    • None
    • Ubuntu Server 10.04 64-bit

      I don't know why this happens, but my slaves have begun to hang when they get to the Archiving Artifacts portion of my job:

      Archiving artifacts
      ERROR: Failed to archive artifacts: dist/**
      hudson.util.IOException2: Failed to extract /mnt/hudsonslave/workspace/simplegeo-puppet-manifests/dist/**
      	at hudson.FilePath.readFromTar(FilePath.java:1577)
      	at hudson.FilePath.copyRecursiveTo(FilePath.java:1491)
      	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:117)
      	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      	at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
      	at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:580)
      	at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:558)
      	at hudson.model.Build$RunnerImpl.post2(Build.java:157)
      	at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:528)
      	at hudson.model.Run.run(Run.java:1303)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      	at hudson.model.ResourceController.execute(ResourceController.java:88)
      	at hudson.model.Executor.run(Executor.java:137)
      Caused by: java.io.IOException
      	at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173)
      	at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
      	at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221)
      	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
      	at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:92)
      	at org.apache.tools.tar.TarBuffer.readBlock(TarBuffer.java:257)
      	at org.apache.tools.tar.TarBuffer.readRecord(TarBuffer.java:223)
      	at hudson.org.apache.tools.tar.TarInputStream.read(TarInputStream.java:345)
      	at java.io.FilterInputStream.read(FilterInputStream.java:90)
      	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1025)
      	at org.apache.commons.io.IOUtils.copy(IOUtils.java:999)
      	at hudson.util.IOUtils.copy(IOUtils.java:33)
      	at hudson.FilePath.readFromTar(FilePath.java:1565)
      	... 12 more
      

      I aborted this job, so I don't know if the error is related or not. The job runs fine, but it hangs during archiving. If I restart the connection to the node (in /computer/; not restarting Hudson or the node itself), jobs will build successfully again for a while.
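The "Caused by" part of the trace shows the master reading the artifacts as a gzip-compressed tar stream (GZIPInputStream feeding TarInputStream inside FilePath.readFromTar). As a rough illustration only, and not a reproduction of the hang, the Python sketch below shows that a compressed tar stream which ends early fails on the reading side, which is roughly what an aborted transfer would look like. The helper names `build_tar_gz` and `can_read_all` and the sample file name are hypothetical.

```python
import io
import tarfile


def build_tar_gz(name: str, payload: bytes) -> bytes:
    """Return an in-memory .tar.gz holding one file (hypothetical sample data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def can_read_all(archive: bytes) -> bool:
    """Try to read every member of a .tar.gz stream; False if reading fails."""
    try:
        with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
            for member in tar:
                extracted = tar.extractfile(member)
                if extracted is not None:
                    extracted.read()  # forces the compressed data to be consumed
        return True
    except (EOFError, OSError, tarfile.TarError):
        return False


payload = bytes(range(256)) * 1024  # 256 KiB of sample data
archive = build_tar_gz("dist/artifact.bin", payload)
print("intact stream readable:   ", can_read_all(archive))
print("truncated stream readable:", can_read_all(archive[: len(archive) // 2]))
```

This only models the reader's view of a stream that stops early; it says nothing about why the agent side stops sending in the first place.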

          [JENKINS-7641] Slaves hang when archiving artifacts

          Nicholas Klopfer-Webber added a comment -

          We have just started experiencing the same problem, and it's killing our build farm. It's happening across platforms, as mentioned before, and does not seem specific to how the connection is launched: both SSH from the master and Web Start from a slave have the same problem.

          Some information that might help: we don't always have artifacts. They are only present if tests fail and leave behind what they were expected to create, which we can inspect manually later. Could the hang happen because the archiver didn't find anything?

          Marco Albanese added a comment -

          I have the same problem using version 1.609.
          I think the problem is somehow related to the size of the files being archived.
          In fact, if I change my job configuration to archive fewer artifacts, the job completes successfully.

          Jason Naylor added a comment -

          I'm running into this now in version 1.651.3 on a Windows slave using SSH.
          I'm not sure what is preventing the archiving step from succeeding, but whatever it is leaves the build hung indefinitely. This build agent has built successfully before and only started having this problem recently.


          Jason Naylor added a comment -

          The most disruptive part of this hang is that it disregards the build timeout. I have 'Abort the build if it is stuck' set with an absolute timeout of 45 minutes configured in the job and the build was stuck at the Archiving step for over 2 and a half hours.


          Joseph John added a comment -

          We were hitting this issue too, but at least in our case I was able to find the workaround below. With my limited experience, I suspect the actual problem lies in how the Jenkins code interacts with DNS.

          In my case, the Jenkins master and the slave where artifact archiving hung were in different network domains.
          For example, here is a simple ping from the Jenkins master:

          ping windows_machine_with_problem
          PING windows_machine_with_problem.DOMAIN1 (172.23.136.85) 56(84) bytes of data.
          64 bytes from windows_machine_with_problem.DOMAIN1 (172.23.136.85): icmp_seq=1 ttl=127 time=0.287 ms
          64 bytes from windows_machine_with_problem.DOMAIN1 (172.23.136.85): icmp_seq=2 ttl=127 time=0.335 ms
          64 bytes from windows_machine_with_problem.DOMAIN1 (172.23.136.85): icmp_seq=3 ttl=127 time=0.377 ms

          ping windows_machine_with_noproblem
          PING windows_machine_with_noproblem.DOMAIN2 (172.28.8.87) 56(84) bytes of data.
          64 bytes from 172.28.8.87: icmp_seq=1 ttl=128 time=0.379 ms
          64 bytes from 172.28.8.87: icmp_seq=2 ttl=128 time=0.519 ms
          64 bytes from 172.28.8.87: icmp_seq=3 ttl=128 time=0.400 ms

          This indicated that any traffic from the master to windows_machine_with_problem passes through some networking entities.
          To take DNS out of the equation, I replaced the "hostname" in the slave configuration with the IP address, and the artifact hang disappeared.

          To summarize: when I used the IP address instead of the DNS name, the artifact hanging issue was resolved, at least in my case.
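To compare a problem agent with a working one before switching to an IP address, the forward and reverse lookups could be timed from the master. The sketch below is a minimal example using only Python's standard `socket` module; the function names `timed` and `check_host` are hypothetical, and it is an assumption (not confirmed in this issue) that slow or inconsistent name resolution is what triggers the hang.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def timed(lookup: Callable[[], object]) -> Tuple[Optional[object], float]:
    """Run one lookup; return (result, or None on a resolver error, elapsed seconds)."""
    start = time.monotonic()
    try:
        result = lookup()
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        result = None
    return result, time.monotonic() - start


def check_host(host: str) -> dict:
    """Time the forward lookup of `host` and the reverse lookup of its first address."""
    addresses, forward_s = timed(lambda: socket.getaddrinfo(host, None))
    report = {"host": host, "forward_seconds": forward_s, "address": None,
              "reverse_name": None, "reverse_seconds": None}
    if addresses:
        ip = addresses[0][4][0]  # sockaddr tuple starts with the address string
        reverse, reverse_s = timed(lambda: socket.gethostbyaddr(ip))
        report.update(address=ip, reverse_seconds=reverse_s,
                      reverse_name=reverse[0] if reverse else None)
    return report
```

Calling `check_host("windows_machine_with_problem")` and `check_host("windows_machine_with_noproblem")` on the master and comparing the two reports would show whether resolution time or the reverse name differs between the agents.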


          Kaveh Vaghefi added a comment -

          This still happens for us in Jenkins 2.21 with a number of Ubuntu 14.04 slaves.


          wgracelee added a comment -

          This also happens for us with Jenkins 2.63 on 64-bit CentOS 5.


          Dennis Jackson added a comment -

          I'm seeing this on Jenkins ver. 2.46.2 on Windows 10.

          Dennis Jackson added a comment -

          Another interesting data point: when the archive step hangs, Jenkins does not honor the build timeout parameter. "Abort the build if it's stuck" was set to 25 minutes, and the build sat there for an hour and 20 minutes.

          Ivan Fernandez Calvo added a comment -

          Could someone try the new option to disable TCP_NODELAY (1.28)? It will use a buffered connection that will improve data transfer for large files.

          See https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/CONFIGURE.md#advanced-settings
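For readers unfamiliar with the option: TCP_NODELAY is a standard socket flag. When it is set, Nagle's algorithm is turned off and small writes are sent immediately; when it is cleared, the kernel may coalesce small writes into fewer, larger segments. The sketch below toggles the flag on a plain Python socket to show what the setting means at the socket level. It is not the plugin's code, and the helper name `set_tcp_nodelay` is hypothetical.

```python
import socket


def set_tcp_nodelay(sock: socket.socket, enabled: bool) -> bool:
    """Set TCP_NODELAY on `sock` and return the value the OS reports back."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)
    return bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))


# An unconnected TCP socket is enough to set and read back the option.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    print("TCP_NODELAY on: ", set_tcp_nodelay(sock, True))
    print("TCP_NODELAY off:", set_tcp_nodelay(sock, False))
```

In the plugin itself the setting is changed through the agent's advanced launcher settings described at the link above, not through code like this.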

            ifernandezcalvo Ivan Fernandez Calvo
            ieure ieure
            Votes: 49
            Watchers: 57