[JENKINS-18879] Collecting findbugs analysis results randomly fails with exception

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: ssh-slaves-plugin
    • Environment: Master/Slave on Linux x64 with Java 1.7.0_21 and Jenkins ver. 1.524-SNAPSHOT (rc-07/16/2013 13:36 GMT-kohsuke)

      Some weeks ago, jobs with long-running findbugs tasks started to fail while collecting the results. Every morning I have to restart a job or two to get everything back to blue.

      This exception seems to be random. It only ever occurs while the findbugs plugin is running and not during any other static code analysis. Any plugin that runs afterwards fails with a similar message that the connection is closed.

      Another strange thing is that afterwards I can't see the slave log anymore. All I get is the wait image on the page but no log ever loads again until a restart of the master. A simple restart of the slave node does not change anything although jobs are running fine.

      ERROR: Publisher hudson.plugins.findbugs.FindBugsPublisher aborted due to exception
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Sorry, this connection is closed.
      	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      	at hudson.remoting.Request.call(Request.java:174)
      	at hudson.remoting.Channel.call(Channel.java:713)
      	at hudson.FilePath.act(FilePath.java:895)
      	at hudson.FilePath.act(FilePath.java:879)
      	at hudson.plugins.findbugs.FindBugsPublisher.perform(FindBugsPublisher.java:161)
      	at hudson.plugins.analysis.core.HealthAwarePublisher.perform(HealthAwarePublisher.java:144)
      	at hudson.plugins.analysis.core.HealthAwareRecorder.perform(HealthAwareRecorder.java:334)
      	at hudson.tasks.BuildStepMonitor$2.perform(BuildStepMonitor.java:27)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:776)
      	at hudson.model.Build$BuildExecution.post2(Build.java:183)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:726)
      	at hudson.model.Run.execute(Run.java:1618)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      	at hudson.model.ResourceController.execute(ResourceController.java:88)
      	at hudson.model.Executor.run(Executor.java:247)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Sorry, this connection is closed.
      	at hudson.remoting.Request.abort(Request.java:299)
      	at hudson.remoting.Channel.terminate(Channel.java:773)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      Caused by: java.io.IOException: Sorry, this connection is closed.
      	at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:642)
      	at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:378)
      	at com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:953)
      	at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58)
      	at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2308)
      	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2321)
      	at java.io.ObjectInputStream$BlockDataInputStream.readUnsignedShort(ObjectInputStream.java:2804)
      	at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2862)
      	at java.io.ObjectInputStream.readString(ObjectInputStream.java:1636)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1339)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
      	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
      	at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:499)
      	at java.lang.Throwable.readObject(Throwable.java:913)
      	at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at hudson.remoting.Command.readFrom(Command.java:92)
      	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:72)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      Caused by: java.io.IOException: Assertion error: sendMessage may never be invoked by the receiver thread!
      	at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:634)
      	at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:378)
      	at com.trilead.ssh2.channel.Channel$Output.write(Channel.java:97)
      	at com.trilead.ssh2.channel.ChannelManager.msgChannelExtendedData(ChannelManager.java:858)
      	at com.trilead.ssh2.channel.ChannelManager.handleMessage(ChannelManager.java:1517)
      	at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:780)
      	at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:475)
      	at java.lang.Thread.run(Thread.java:724)
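
The innermost cause in the trace reads like a thread-identity guard inside the transport. Below is a minimal sketch of what such a guard could look like, assuming a transport that remembers which thread is its receiver. `GuardedTransport`, `runAsReceiver` and `trySend` are hypothetical names used only for illustration; this is not the trilead-ssh2 source.

```java
import java.io.IOException;

/** Hypothetical stand-in for a transport that owns a single receiver thread. */
class GuardedTransport {
    // Thread currently acting as the receiver, or null if none (assumption for this sketch).
    private volatile Thread receiveThread;

    /** Marks the calling thread as the receiver while the handler runs. */
    void runAsReceiver(Runnable handler) {
        receiveThread = Thread.currentThread();
        try {
            handler.run();
        } finally {
            receiveThread = null;
        }
    }

    /** Refuses to send when called from the receiver thread. */
    void sendMessage(byte[] msg) throws IOException {
        if (Thread.currentThread() == receiveThread) {
            throw new IOException(
                "Assertion error: sendMessage may never be invoked by the receiver thread!");
        }
        // A real transport would write msg to the socket here.
    }

    /** Demo helper: returns the error message, or null if the send was accepted. */
    String trySend(byte[] msg) {
        try {
            sendMessage(msg);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

In this sketch a send from any other thread is accepted, while a send issued from inside a receive handler is rejected with the same message seen at the bottom of the trace.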
      

          Stephen Connolly added a comment - Sadly, ssh-slaves 1.4 does not resolve the issue

          Stephen Connolly added a comment - It's a bug in the trilead ssh lib... I think I have the fix

          Code changed in jenkins
          User: Stephen Connolly
          Path:
          src/com/trilead/ssh2/channel/Channel.java
          http://jenkins-ci.org/commit/trilead-ssh2/f1353cc0e0aa1b1e6bc845236e4a2530ea3103fd
          Log:
          [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] remove double call of freeupWindow(len); when using ssh-slaves 0.27+

          • the more performant code path is only followed when using SSH Slaves 0.27+
          • the double call causes the channel to get torn down
          • thus excessive logging to stderr on the slave side of the connection will cause the connection to tear down
          • removing the duplicate call resolves the issue

          Code changed in jenkins
          User: Stephen Connolly
          Path:
          changelog.html
          core/pom.xml
          http://jenkins-ci.org/commit/jenkins/bb265c5e95b0fe39128720b903914236962db41b
          Log:
          [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix

          Stephen Connolly added a comment - Fixed towards Jenkins 1.536

          dogfood added a comment -

          Integrated in jenkins_main_trunk #2938
          [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix (Revision bb265c5e95b0fe39128720b903914236962db41b)

          Result = UNSTABLE
          Stephen Connolly : bb265c5e95b0fe39128720b903914236962db41b
          Files :

          • changelog.html
          • core/pom.xml

          Ramin Baradari added a comment - I upgraded our installation to Jenkins ver. 1.536. I will report back in a few days if it is fixed.

          Ramin Baradari added a comment - The builds have been stable during the last week.

          Code changed in jenkins
          User: Stephen Connolly
          Path:
          core/pom.xml
          http://jenkins-ci.org/commit/jenkins/1bb06ada301496ebed6d212188d1b7c9d006317b
          Log:
          [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix

          (cherry picked from commit bb265c5e95b0fe39128720b903914236962db41b)

          Conflicts:
          changelog.html

          Code changed in jenkins
          User: Stephen Connolly
          Path:
          src/com/trilead/ssh2/channel/Channel.java
          http://jenkins-ci.org/commit/trilead-ssh2/5811ddd7ae15670a4f9ad345352613b3f2f2db97
          Log:
          JENKINS-22938 SSH slave connections die after the slave outputs 4MB of stderr, usually during findbugs analysis

          The fix for JENKINS-18836, JENKINS-18879, JENKINS-19619 was incorrect in its analysis.

          • There is no call to getChannelData() on the new code path, so there cannot be two calls of freeupWindow().
          • The problem with the original call to freeupWindow() is that it is made on the receiver thread. The responsibilities should not be mixed: blocking the receiver thread to send a message will negatively impact performance and connection stability.
          • The correct solution is to push the freeupWindow() call onto the async queue, so that the ACK still gets sent and the receiving thread does nothing but receive.
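
The hand-off described in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `AsyncAckChannel`, `onDataReceived`, `closeAndGetSent` and the `"WINDOW_ADJUST"` strings are hypothetical, and a single-threaded `ExecutorService` stands in for the "async queue" mentioned in the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical channel that acknowledges consumed bytes off the receiver thread. */
class AsyncAckChannel {
    // Single worker thread: acks are sent in submission order, never by the receiver.
    private final ExecutorService ackQueue = Executors.newSingleThreadExecutor();
    private final List<String> sent = new ArrayList<>();          // stands in for the wire
    private final List<String> senderThreads = new ArrayList<>(); // who actually sent

    /** Called on the receiver thread after len bytes of channel data were handled. */
    void onDataReceived(int len) {
        // Do not send from here; only enqueue the window adjustment.
        ackQueue.execute(() -> freeupWindow(len));
    }

    /** Runs on the ack queue's worker thread and records the "sent" adjustment. */
    private void freeupWindow(int len) {
        synchronized (sent) {
            sent.add("WINDOW_ADJUST " + len);
            senderThreads.add(Thread.currentThread().getName());
        }
    }

    /** Drains the queue; returns the sent messages, or null on timeout or interrupt. */
    List<String> closeAndGetSent() {
        ackQueue.shutdown();
        try {
            if (!ackQueue.awaitTermination(5, TimeUnit.SECONDS)) {
                return null;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        synchronized (sent) {
            return new ArrayList<>(sent);
        }
    }

    /** Names of the threads that performed the sends. */
    List<String> senderThreadNames() {
        synchronized (sent) {
            return new ArrayList<>(senderThreads);
        }
    }
}
```

The receiver only enqueues work and returns immediately, so it is never blocked on a send; the acknowledgements still go out, in order, from the worker thread.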

            kohsuke Kohsuke Kawaguchi
            rbaradari Ramin Baradari