  1. Jenkins
  2. JENKINS-7114

Aborting a download of workspace files makes the slave agent hang

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Component/s: remoting
    • Labels:
      None
    • Environment:
      Hudson ver. 1.368 running on tomcat 6.0.18 running on ubuntu server 9.04 (32bit)
      Description

      I'm having trouble when aborting downloads of files from the workspace of a job.

      Hudson allows downloading files from the workspace of the last build of a job. Simply downloading these files works flawlessly.

      But if I abort a download in my browser, I am no longer able to browse the workspace. This even affects all other workspaces on the same node (connected via the same slave agent).
      On the Manage Nodes page, the node is still listed as connected, and Hudson still schedules new jobs to run on this node. But once they start executing, they hang, because Hudson is not able to run remote commands on the node.

      A disconnect/reconnect of the slave agent is needed to recover from this issue.

      I can reproduce this issue with Windows nodes (slave agent launched via JNLP) and Linux nodes (slave agent launched via SSH).

          Activity

          evernat evernat added a comment -

          Several issues in the remoting layer between master and slave have been fixed over the past year, so this issue may well be fixed.
          Can you reproduce the issue with a recent release of Jenkins?

          jsirex jsirex added a comment -

          I have a similar issue:
          I have a Jenkins master on Debian Linux and a slave on Windows 2008 R2.
          When I try to view or download content from the workspace of a job tied to the Windows slave, the slave also hangs.

          Jenkins slave error log
           hudson.remoting.Channel$ReaderThread run
          SEVERE: Unable to read a command (channel channel)
          java.lang.ClassNotFoundException: winstone.ClientSocketException
                  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
                  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
                  at java.lang.Class.forName0(Native Method)
                  at java.lang.Class.forName(Class.java:247)
                  at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:603)
                  at hudson.remoting.ObjectInputStreamEx.resolveClass(ObjectInputStreamEx.java:50)
                  at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1574)
                  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
                  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
                  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
                  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
                  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
                  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
                  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
                  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
                  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
                  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
                  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
                  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
                  at hudson.remoting.Channel$ReaderThread.run(Channel.java:1031)
          24.08.2011 14:21:45 hudson.remoting.Channel$ReaderThread run
          SEVERE: Failed to execute command null (channel channel)
          java.lang.NullPointerException
                  at hudson.remoting.Channel$ReaderThread.run(Channel.java:1049)
          Exception in thread "Channel reader thread: channel" java.lang.NullPointerException
                  at hudson.remoting.Channel$ReaderThread.run(Channel.java:1052)
          
          jstruck Jes Struck added a comment -

          We have just upgraded from Jenkins 1.490 to 1.491 and are still seeing this issue.
          We are on Ubuntu 12.04 LTS on both master and slaves
          and use the SSH slave connector.

          Has anyone solved this? The exception piles up in the log file quite fast, so the log file grows to 2-3 GB in a couple of hours.

          marscher Martin Scherer added a comment -

          I can confirm this behaviour. My log has grown to 12 GB due to the continuously thrown (and not properly handled) exception.
          Thanks in advance for fixing this soon!

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Capability.java
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/Command.java
          src/main/java/hudson/remoting/MimicException.java
          src/main/java/hudson/remoting/ProxyOutputStream.java
          src/test/java/hudson/remoting/DeadRemoteOutputStreamTest.java
          http://jenkins-ci.org/commit/remoting/d084f5095f11a8bcb80913266d664b9d1f17dc3b
          Log:
          [FIXED JENKINS-7114] fixed the layer confusion in the remoting

          When ProxyOutputStream sends write(byte[]) to the other end and the actual write fails, "NotifyDeadWriter" object comes back and reports back that the write has failed. Without this mechanism, the writer side will keep on going.

          Because the RemoteOutputStream service is a lower layer service that cannot rely on custom classloading service (doing so would create cyclic dependencies), when we send back this exception, the object graph of the exception needs to be deserializable on the receiver side. This is not the case when the exception class is defined by the user (as is the case of winstone.ClientSocketException.)

          This fix addresses this problem by turning an exception class into another class that emulates the output of the original exception.

          To make this change interoperable with earlier versions, we need to introduce a new capability flag. If we send MimicException to the other side and the other side doesn't have this class definition, it will automatically fail.

          Compare: https://github.com/jenkinsci/remoting/compare/2b1ec8ab1528...d084f5095f11
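
          The idea described in the commit log above can be illustrated with a minimal sketch. This is not the code from the remoting commit: the names MimicSketch, ContainerOnlyException, PeerCapability and WireSketch, as well as the bit used for the flag, are hypothetical and chosen only for illustration. Falling back to the original exception for an older peer is likewise an assumption. The sketch simulates a receiver that cannot load the sender's exception class, then shows that a stand-in built from well-known types still deserializes and prints like the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for an exception class that only the sending side can load. */
class ContainerOnlyException extends IOException {
    ContainerOnlyException(String message) { super(message); }
}

/** Illustrative mimic: one well-known class that prints like the exception it replaces. */
class MimicSketch extends Exception {
    private static final long serialVersionUID = 1L;
    private final String originalClassName;

    private MimicSketch(Throwable original) {
        super(original.getMessage());
        this.originalClassName = original.getClass().getName();
        setStackTrace(original.getStackTrace());
        if (original.getCause() != null) {
            initCause(new MimicSketch(original.getCause())); // mimic the cause chain too
        }
    }

    static MimicSketch of(Throwable original) { return new MimicSketch(original); }

    /** Same shape as Throwable.toString(), but reporting the original class name. */
    @Override
    public String toString() {
        String msg = getLocalizedMessage();
        return msg != null ? originalClassName + ": " + msg : originalClassName;
    }
}

/** Hypothetical capability bitmask advertised by the peer during the handshake. */
final class PeerCapability {
    static final long MASK_MIMIC_EXCEPTION = 1L << 3; // assumed bit, for illustration only
    private final long mask;

    PeerCapability(long mask) { this.mask = mask; }

    boolean supportsMimicException() { return (mask & MASK_MIMIC_EXCEPTION) != 0; }
}

final class WireSketch {
    /** Only send the mimic to peers that advertised the flag (assumed fallback: the original). */
    static Throwable prepareForWire(Throwable t, PeerCapability peer) {
        return peer.supportsMimicException() ? MimicSketch.of(t) : t;
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Deserializes as a receiver that cannot load the class named missingClass. */
    static Object receive(byte[] data, String missingClass) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc)
                    throws IOException, ClassNotFoundException {
                if (desc.getName().equals(missingClass)) {
                    throw new ClassNotFoundException(desc.getName());
                }
                return super.resolveClass(desc);
            }
        }) {
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("receiver cannot load " + e.getMessage(), e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

          In this sketch, passing a serialized ContainerOnlyException to WireSketch.receive fails with a wrapped ClassNotFoundException, which mirrors the trace reported earlier in this issue. Passing the result of WireSketch.prepareForWire for a peer that advertises the flag succeeds, because the original class name travels as a plain string rather than as a class descriptor.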

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          pom.xml
          http://jenkins-ci.org/commit/jenkins/3cfa3e55a1f57b6634576a0d6d7ac044714e1022
          Log:
          [FIXED JENKINS-7114]

          The actual change is in the remoting 1.20.

          Compare: https://github.com/jenkinsci/jenkins/compare/75fc9308a9f2...3cfa3e55a1f5

          dogfood dogfood added a comment -

          Integrated in jenkins_main_trunk #2154
          [FIXED JENKINS-7114] (Revision 3cfa3e55a1f57b6634576a0d6d7ac044714e1022)

          Result = SUCCESS
          kohsuke : 3cfa3e55a1f57b6634576a0d6d7ac044714e1022
          Files :

          • changelog.html
          • pom.xml

            People

            Assignee: Unassigned
            Reporter: heiko_bihr
            Votes: 2
            Watchers: 6
