Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-7809

Remote Launcher randomly returns no data.

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Labels:
      None
    • Environment:
      Hudson 1.378-1.383
    • Similar Issues:

      Description

      Since 1.378, the remoting functions of hudson have become unstable. When using a Launcher to execute commands remotely on a slave, sometimes it fails to return any data. The following stack trace is found in the hudson log when this occurs.

      Oct 15, 2010 10:12:17 AM hudson.remoting.ProxyOutputStream$Chunk$1 run
      WARNING: Failed to write to stream
      java.io.IOException: Pipe closed
              at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:244)
              at java.io.PipedInputStream.receive(PipedInputStream.java:210)
              at java.io.PipedOutputStream.write(PipedOutputStream.java:132)
              at java.io.OutputStream.write(OutputStream.java:58)
              at hudson.util.DelegatingOutputStream.write(DelegatingOutputStream.java:51)
              at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:185)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
              at java.util.concurrent.FutureTask.run(FutureTask.java:138)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:619)
      

      This is currently causing issues with the perforce plugin (JENKINS-7664), and I've been able to reproduce it using the hudson_main_trunk#331 build on ci.hudson-labs.org as well.

        Attachments

          Issue Links

            Activity

            rpetti Rob Petti created issue -
            rpetti Rob Petti made changes -
            Field Original Value New Value
            Link This issue is blocking JENKINS-7664 [ JENKINS-7664 ]
            rpetti Rob Petti made changes -
            Priority Major [ 3 ] Blocker [ 1 ]
            Hide
            torbent torbent added a comment - - edited

            Oh, so that's what's causing all those warnings?
            I was just about to submit an issue on them
            I'm the submitter of JENKINS-7664 so I can, at least anecdotally, confirm the connection.

            We're still running 1.379, by the way, so it's not limited to 1.380

            Show
            torbent torbent added a comment - - edited Oh, so that's what's causing all those warnings? I was just about to submit an issue on them I'm the submitter of JENKINS-7664 so I can, at least anecdotally, confirm the connection. We're still running 1.379, by the way, so it's not limited to 1.380
            Hide
            rpetti Rob Petti added a comment -

            Updated the description to make it more clear which versions are affected. I've also tested with 1.381, and the problem still exists.

            Show
            rpetti Rob Petti added a comment - Updated the description to make it more clear which versions are affected. I've also tested with 1.381, and the problem still exists.
            rpetti Rob Petti made changes -
            Environment Hudson 1.380 Hudson 1.378-1.381
            Hide
            torbent torbent added a comment -

            I've seen these (or like) messages appear in the log around times where the Cobertura plugin fails. It transfers the XML files from the slave(s) to the master for processing, but somehow one (or more) files end up too short and can no longer be parsed as valid xml.
            The error message explicitly mentions a filename on the master, not the slave. When inspected, the file is indeed incomplete.

            (Just to state that it's not only affecting the perforce plugin)

            Show
            torbent torbent added a comment - I've seen these (or like) messages appear in the log around times where the Cobertura plugin fails. It transfers the XML files from the slave(s) to the master for processing, but somehow one (or more) files end up too short and can no longer be parsed as valid xml. The error message explicitly mentions a filename on the master, not the slave. When inspected, the file is indeed incomplete. (Just to state that it's not only affecting the perforce plugin)
            Hide
            torbent torbent added a comment -

            I believe these are related and mimic my observations on Cobertura.

            Show
            torbent torbent added a comment - I believe these are related and mimic my observations on Cobertura.
            torbent torbent made changes -
            Link This issue is related to JENKINS-7836 [ JENKINS-7836 ]
            torbent torbent made changes -
            Link This issue is related to JENKINS-7871 [ JENKINS-7871 ]
            Hide
            rpetti Rob Petti added a comment -

            Still seeing this behavior in 1.383.

            Show
            rpetti Rob Petti added a comment - Still seeing this behavior in 1.383.
            rpetti Rob Petti made changes -
            Environment Hudson 1.378-1.381 Hudson 1.378-1.383
            Hide
            lkishalmi lkishalmi added a comment -

            This issue seems to be caused by fixing JENKINS-5977.
            However JENKINS-5977 solved our overall Hudson master/slave communications made our Hudson grid stable we are experiencing this bug as well.

            Show
            lkishalmi lkishalmi added a comment - This issue seems to be caused by fixing JENKINS-5977 . However JENKINS-5977 solved our overall Hudson master/slave communications made our Hudson grid stable we are experiencing this bug as well.
            lkishalmi lkishalmi made changes -
            Link This issue is related to JENKINS-5977 [ JENKINS-5977 ]
            ickersep ickersep made changes -
            Link This issue is blocking JENKINS-8064 [ JENKINS-8064 ]
            Hide
            rpetti Rob Petti added a comment -

            We've gotten around our issue by flushing the pipe before closing it, since the new threading model for remote channels seems to close it immediately even when there's still stuff in the pipe. This problem is still affecting other plugins, it seems, so I'll leave the issue open.

            Show
            rpetti Rob Petti added a comment - We've gotten around our issue by flushing the pipe before closing it, since the new threading model for remote channels seems to close it immediately even when there's still stuff in the pipe. This problem is still affecting other plugins, it seems, so I'll leave the issue open.
            Hide
            kohsuke Kohsuke Kawaguchi added a comment - - edited

            @rpetti
            I checked the code and the close doesn't happen immediately — it gets queued in the same way as the rest of the I/O operations. I believe your flush fix refers to JENKINS-7664, but the reporter responded that that didn't work.

            So I doubt if this is caused by close() failing to report data buffered earlier.

            Still looking into the cause.

            Show
            kohsuke Kohsuke Kawaguchi added a comment - - edited @rpetti I checked the code and the close doesn't happen immediately — it gets queued in the same way as the rest of the I/O operations. I believe your flush fix refers to JENKINS-7664 , but the reporter responded that that didn't work. So I doubt if this is caused by close() failing to report data buffered earlier. Still looking into the cause.
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in hudson
            User: : kohsuke
            Path:
            trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java
            http://jenkins-ci.org/commit/37070
            Log:
            added a test in an attempt to reproduce / pin-point JENKINS-7809 but this was not it.

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java http://jenkins-ci.org/commit/37070 Log: added a test in an attempt to reproduce / pin-point JENKINS-7809 but this was not it.
            Hide
            kohsuke Kohsuke Kawaguchi added a comment -

            Is anyone seeing this outside the perforce plugin?

            I notice that the perforce plugin does something bit different with stream remoting.

            Show
            kohsuke Kohsuke Kawaguchi added a comment - Is anyone seeing this outside the perforce plugin? I notice that the perforce plugin does something bit different with stream remoting.
            Hide
            fpavageau fpavageau added a comment -

            Yes, it's happening with the Cobertura plugin as well, where the coverage.xml files end up truncated on master. Cf. JENKINS-7836.

            Show
            fpavageau fpavageau added a comment - Yes, it's happening with the Cobertura plugin as well, where the coverage.xml files end up truncated on master. Cf. JENKINS-7836 .
            Hide
            kohsuke Kohsuke Kawaguchi added a comment -

            JENKINS-7664 is updated with my proposed change.

            Show
            kohsuke Kohsuke Kawaguchi added a comment - JENKINS-7664 is updated with my proposed change.
            Hide
            hquiroz hquiroz added a comment -

            It's happening with Hudson 1.387 as well. We have trouble with the Cobertura report when the build was remote.

            Show
            hquiroz hquiroz added a comment - It's happening with Hudson 1.387 as well. We have trouble with the Cobertura report when the build was remote.
            mindless Alan Harder made changes -
            Link This issue is related to JENKINS-7745 [ JENKINS-7745 ]
            rpetti Rob Petti made changes -
            Link This issue is blocking JENKINS-8245 [ JENKINS-8245 ]
            Hide
            hbjastad hbjastad added a comment -

            It's happening with Hudson 1.388 as well. We have trouble with various parts of the build when the build is remote:

            • The Cobertura report
            • site:deploy
            • Maven site archiving

            Had to revert back to 1.377

            In case it is of any help, here is a stack trace:

            java.lang.NegativeArraySizeException
            	at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175)
            	at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123)
            	at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)
            	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
            	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
            	at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:161)
            	at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:118)
            	at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:72)
            	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
            	at org.apache.tools.tar.TarBuffer.writeBlock(TarBuffer.java:410)
            	at org.apache.tools.tar.TarBuffer.writeRecord(TarBuffer.java:351)
            	at hudson.org.apache.tools.tar.TarOutputStream.writeEOFRecord(TarOutputStream.java:356)
            	at hudson.org.apache.tools.tar.TarOutputStream.finish(TarOutputStream.java:137)
            	at hudson.org.apache.tools.tar.TarOutputStream.close(TarOutputStream.java:149)
            	at hudson.util.io.TarArchiver.close(TarArchiver.java:114)
            	at hudson.FilePath.writeToTar(FilePath.java:1545)
            	at hudson.FilePath.copyRecursiveTo(FilePath.java:1470)
            	at hudson.FilePath.copyRecursiveTo(FilePath.java:1399)
            	at hudson.maven.reporters.MavenSiteArchiver.postExecute(MavenSiteArchiver.java:81)
            	at hudson.maven.MavenModuleSetBuild$Builder.postExecute(MavenModuleSetBuild.java:822)
            	at hudson.maven.MavenBuilder$Adapter.postExecute(MavenBuilder.java:300)
            	at hudson.maven.agent.PluginManagerInterceptor$1MojoIntercepterImpl.callPost(PluginManagerInterceptor.java:170)
            	at hudson.maven.agent.PluginManagerInterceptor.executeMojo(PluginManagerInterceptor.java:183)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:559)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:513)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:483)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:331)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:292)
            	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
            	at org.apache.maven.lifecycle.LifecycleExecutorInterceptor.execute(LifecycleExecutorInterceptor.java:65)
            	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
            	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
            	at org.apache.maven.cli.MavenCli.main(MavenCli.java:301)
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            	at java.lang.reflect.Method.invoke(Method.java:597)
            	at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
            	at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
            	at hudson.maven.agent.Main.launch(Main.java:165)
            	at hudson.maven.MavenBuilder.call(MavenBuilder.java:165)
            	at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:744)
            	at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:688)
            	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
            	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
            	at hudson.remoting.Request$2.run(Request.java:270)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
            	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            	at java.lang.Thread.run(Thread.java:662)
            
            Show
            hbjastad hbjastad added a comment - It's happening with Hudson 1.388 as well. We have trouble with various parts of the build when the build is remote: The Cobertura report site:deploy Maven site archiving Had to revert back to 1.377 In case it is of any help, here is a stack trace: java.lang.NegativeArraySizeException at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175) at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109) at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:161) at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:118) at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:72) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) at org.apache.tools.tar.TarBuffer.writeBlock(TarBuffer.java:410) at org.apache.tools.tar.TarBuffer.writeRecord(TarBuffer.java:351) at hudson.org.apache.tools.tar.TarOutputStream.writeEOFRecord(TarOutputStream.java:356) at hudson.org.apache.tools.tar.TarOutputStream.finish(TarOutputStream.java:137) at hudson.org.apache.tools.tar.TarOutputStream.close(TarOutputStream.java:149) at hudson.util.io.TarArchiver.close(TarArchiver.java:114) at hudson.FilePath.writeToTar(FilePath.java:1545) at hudson.FilePath.copyRecursiveTo(FilePath.java:1470) at hudson.FilePath.copyRecursiveTo(FilePath.java:1399) at hudson.maven.reporters.MavenSiteArchiver.postExecute(MavenSiteArchiver.java:81) at hudson.maven.MavenModuleSetBuild$Builder.postExecute(MavenModuleSetBuild.java:822) at hudson.maven.MavenBuilder$Adapter.postExecute(MavenBuilder.java:300) at hudson.maven.agent.PluginManagerInterceptor$1MojoIntercepterImpl.callPost(PluginManagerInterceptor.java:170) at hudson.maven.agent.PluginManagerInterceptor.executeMojo(PluginManagerInterceptor.java:183) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:559) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:513) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:483) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:331) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:292) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142) at org.apache.maven.lifecycle.LifecycleExecutorInterceptor.execute(LifecycleExecutorInterceptor.java:65) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129) at org.apache.maven.cli.MavenCli.main(MavenCli.java:301) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315) at org.codehaus.classworlds.Launcher.launch(Launcher.java:255) at hudson.maven.agent.Main.launch(Main.java:165) at hudson.maven.MavenBuilder.call(MavenBuilder.java:165) at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:744) at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:688) at hudson.remoting.UserRequest.perform(UserRequest.java:114) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:270) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
            hudsonfsc hudsonfsc made changes -
            Assignee hudsonfsc [ hudsonfsc ]
            hudsonfsc hudsonfsc made changes -
            Assignee hudsonfsc [ hudsonfsc ] kohsuke [ kohsuke ]
            Hide
            kohsuke Kohsuke Kawaguchi added a comment -

            I wonder if this was a different manifestation of JENKINS-7745, which we fixed in 1.397. I encourage people to upgrade to 1.397 and report their results.

            Show
            kohsuke Kohsuke Kawaguchi added a comment - I wonder if this was a different manifestation of JENKINS-7745 , which we fixed in 1.397. I encourage people to upgrade to 1.397 and report their results.
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            remoting/src/main/java/hudson/remoting/PipeWindow.java
            http://jenkins-ci.org/commit/core/7b40e2e091db073df18a27d27e2019a290fa3bee
            Log:
            being more defensive, in seeing JENKINS-7809 and the following stack trace:

            java.lang.NegativeArraySizeException
            at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175)
            at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123)
            at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: remoting/src/main/java/hudson/remoting/PipeWindow.java http://jenkins-ci.org/commit/core/7b40e2e091db073df18a27d27e2019a290fa3bee Log: being more defensive, in seeing JENKINS-7809 and the following stack trace: java.lang.NegativeArraySizeException at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175) at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)
            Hide
            rpetti Rob Petti added a comment -

            I was able to reproduce JENKINS-7664 with that fix, so that doesn't seem to be the cause.

            In the perforce plugin, we are joining the RemoteProc created by the RemoteLauncher, and assuming that all the data has made it through the channel at that point. Unfortunately this doesn't seem to be the case, and occasionally you can run into a race condition where the process is joined, but there are still Chunks on their way through the Channel. Given that we close the stream right after the join, this generates exceptions and missing/truncated data.

            IMO, joining a RemoteProc should ensure that ALL data generated by that process has successfully made it through the channel before terminating. In the meantime, I've reimplemented RemoteLauncher in the perforce plugin to close the pipes on the slave end when it's LocalLauncher is finished. This will ensure that all the data makes it through (since the EOF is the last thing to be sent).

            What I can't figure out is why this doesn't seem to be affecting things like build task logs... We're closing the stream as fast as possible, but perhaps the build listener normally stays open for longer periods?

            Show
            rpetti Rob Petti added a comment - I was able to reproduce JENKINS-7664 with that fix, so that doesn't seem to be the cause. In the perforce plugin, we are joining the RemoteProc created by the RemoteLauncher, and assuming that all the data has made it through the channel at that point. Unfortunately this doesn't seem to be the case, and occasionally you can run into a race condition where the process is joined, but there are still Chunks on their way through the Channel. Given that we close the stream right after the join, this generates exceptions and missing/truncated data. IMO, joining a RemoteProc should ensure that ALL data generated by that process has successfully made it through the channel before terminating. In the meantime, I've reimplemented RemoteLauncher in the perforce plugin to close the pipes on the slave end when it's LocalLauncher is finished. This will ensure that all the data makes it through (since the EOF is the last thing to be sent). What I can't figure out is why this doesn't seem to be affecting things like build task logs... We're closing the stream as fast as possible, but perhaps the build listener normally stays open for longer periods?
            Hide
            kohsuke Kohsuke Kawaguchi added a comment -

            OK, I think I understand what's going on here.

            In Jenkins, the process I/O goes the opposite direction — we provide OutputStream that the process writes its stdout to, as opposed to drain the stream via read. And it doesn't close the stream that it writes to when the process has closed its stdout to indicate EOF. So the plugin like perforce that needs to read from stdout gets no EOF signal in the stream.

            The only way out in the current design is to close the output stream after joining the process, which is what you do in HudsonPipedOutputStream. But this only works if the data has been indeed written to the output stream before the join successfully returns.

            This is the case for local execution, as Proc.join() internally waits for the stream pumping threads to complete, but for remote execution, this isn't a guarantee enough — it merely means that the last bits of data has left the remote end to start its journey, but it still involves additional steps before the data actually gets written to locally exported OutputStream, hence the race condition.

            So the short term fix is to ensure that RemoteProc.join() does make sure that all the data arrived locally before returning. The long term proper fix is to allow code to read directly from stdout/stderr as InputStream, thereby eliminating this pseudo-EOF business.

            Show
            kohsuke Kohsuke Kawaguchi added a comment - OK, I think I understand what's going on here. In Jenkins, the process I/O goes the opposite direction — we provide OutputStream that the process writes its stdout to, as opposed to drain the stream via read. And it doesn't close the stream that it writes to when the process has closed its stdout to indicate EOF. So the plugin like perforce that needs to read from stdout gets no EOF signal in the stream. The only way out in the current design is to close the output stream after joining the process, which is what you do in HudsonPipedOutputStream. But this only works if the data has been indeed written to the output stream before the join successfully returns. This is the case for local execution, as Proc.join() internally waits for the stream pumping threads to complete, but for remote execution, this isn't a guarantee enough — it merely means that the last bits of data has left the remote end to start its journey, but it still involves additional steps before the data actually gets written to locally exported OutputStream, hence the race condition. So the short term fix is to ensure that RemoteProc.join() does make sure that all the data arrived locally before returning. The long term proper fix is to allow code to read directly from stdout/stderr as InputStream, thereby eliminating this pseudo-EOF business.
            Hide
            rpetti Rob Petti added a comment -

            No longer a blocking issue, as a work-around has been found.

            Show
            rpetti Rob Petti added a comment - No longer a blocking issue, as a work-around has been found.
            rpetti Rob Petti made changes -
            Priority Blocker [ 1 ] Major [ 3 ]
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            core/src/main/java/hudson/Launcher.java
            core/src/main/java/hudson/Proc.java
            remoting/src/main/java/hudson/remoting/Channel.java
            test/src/test/java/hudson/ProcTest.java
            http://jenkins-ci.org/commit/core/3a3690b28135287ed78d6a966a5632a16277f05b
            Log:
            [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: changelog.html core/src/main/java/hudson/Launcher.java core/src/main/java/hudson/Proc.java remoting/src/main/java/hudson/remoting/Channel.java test/src/test/java/hudson/ProcTest.java http://jenkins-ci.org/commit/core/3a3690b28135287ed78d6a966a5632a16277f05b Log: [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.
            scm_issue_link SCM/JIRA link daemon made changes -
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]
            Hide
            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #514
            [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.

            Kohsuke Kawaguchi :
            Files :

            • test/src/test/java/hudson/ProcTest.java
            • core/src/main/java/hudson/Proc.java
            • changelog.html
            • core/src/main/java/hudson/Launcher.java
            • remoting/src/main/java/hudson/remoting/Channel.java
            Show
            dogfood dogfood added a comment - Integrated in jenkins_main_trunk #514 [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process. Kohsuke Kawaguchi : Files : test/src/test/java/hudson/ProcTest.java core/src/main/java/hudson/Proc.java changelog.html core/src/main/java/hudson/Launcher.java remoting/src/main/java/hudson/remoting/Channel.java
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            core/src/main/java/hudson/Launcher.java
            core/src/main/java/hudson/Proc.java
            core/src/main/java/hudson/os/windows/WindowsRemoteLauncher.java
            core/src/test/java/hudson/LauncherTest.java
            remoting/src/main/java/hudson/remoting/RemoteInputStream.java
            test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java
            test/src/test/java/hudson/ProcTest.java
            http://jenkins-ci.org/commit/core/96dd84b5d0d37e1765c3615bcba048eb62e38ad6
            Log:
            [FIXED JENKINS-7809] expose child process stdout as InputStream.

            I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments.

            This concludes the fix.

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: changelog.html core/src/main/java/hudson/Launcher.java core/src/main/java/hudson/Proc.java core/src/main/java/hudson/os/windows/WindowsRemoteLauncher.java core/src/test/java/hudson/LauncherTest.java remoting/src/main/java/hudson/remoting/RemoteInputStream.java test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java test/src/test/java/hudson/ProcTest.java http://jenkins-ci.org/commit/core/96dd84b5d0d37e1765c3615bcba048eb62e38ad6 Log: [FIXED JENKINS-7809] expose child process stdout as InputStream. I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments. This concludes the fix.
            Hide
            dogfood dogfood added a comment -
            Show
            dogfood dogfood added a comment - Integrated in jenkins_main_trunk #517
            vlatombe Vincent Latombe made changes -
            Link This issue is blocking JENKINS-7800 [ JENKINS-7800 ]
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java
            http://jenkins-ci.org/commit/jenkins-test-harness/d4fe2dcee80c02264fa5b687f72e2ea0e8af3aeb
            Log:
            [FIXED JENKINS-7809] expose child process stdout as InputStream.

            I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments.

            This concludes the fix.

            Originally-Committed-As: 96dd84b5d0d37e1765c3615bcba048eb62e38ad6

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java http://jenkins-ci.org/commit/jenkins-test-harness/d4fe2dcee80c02264fa5b687f72e2ea0e8af3aeb Log: [FIXED JENKINS-7809] expose child process stdout as InputStream. I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments. This concludes the fix. Originally-Committed-As: 96dd84b5d0d37e1765c3615bcba048eb62e38ad6
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 137911 ] JNJira + In-Review [ 187734 ]

              People

              Assignee:
              kohsuke Kohsuke Kawaguchi
              Reporter:
              rpetti Rob Petti
              Votes:
              37 Vote for this issue
              Watchers:
              39 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: