  1. Jenkins
  2. JENKINS-22816

SocketException on copy artifact after upgrade

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Duplicate
    • Component/s: copyartifact-plugin
    • Labels:
      None
    • Environment:
      Windows Server 2008
      Jenkins ver. 1.561
      Copy Artifact Plugin 1.30

      Description

      Recently, after upgrading both Jenkins and several plugins, we very often get failed matrix builds with the following exception. This totally blocks our CI activities.
      Let me know if you need more info.

      FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      	at hudson.remoting.Request.call(Request.java:174)
      	at hudson.remoting.Channel.call(Channel.java:739)
      	at hudson.FilePath.act(FilePath.java:909)
      	at hudson.FilePath.act(FilePath.java:893)
      	at hudson.FilePath.touch(FilePath.java:1355)
      	at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyOne(FingerprintingCopyMethod.java:90)
      	at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyAll(FingerprintingCopyMethod.java:68)
      	at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:368)
      	at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:306)
      	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:745)
      	at hudson.model.Build$BuildExecution.build(Build.java:198)
      	at hudson.model.Build$BuildExecution.doRun(Build.java:159)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
      	at hudson.model.Run.execute(Run.java:1709)
      	at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
      	at hudson.model.ResourceController.execute(ResourceController.java:88)
      	at hudson.model.Executor.run(Executor.java:231)
      Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
      	at hudson.remoting.Request.abort(Request.java:299)
      	at hudson.remoting.Channel.terminate(Channel.java:802)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      Caused by: java.net.SocketException: Connection reset
      	at java.net.SocketInputStream.read(Unknown Source)
      	at java.io.FilterInputStream.read(Unknown Source)
      	at java.io.BufferedInputStream.fill(Unknown Source)
      	at java.io.BufferedInputStream.read(Unknown Source)
      	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:77)
      	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:67)
      	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:93)
      	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
      	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      

            Activity

            ikedam ikedam added a comment -

            Is there any sign when this happens? For example, does the build hang or take too long?

            This may be caused not by the copyartifact plugin, but by Jenkins core, or maybe by Windows.

            Let me know:

            • Do crashes always happen when running copyartifact? That is, do the stack traces always contain 'copyartifact'?
            • Do crashes always happen in the same project?
            • Do crashes always happen in the same matrix axes combination?
            • What versions did you upgrade from? The versions of Jenkins core and copyartifact would be helpful.
            • Is there any reason you upgraded Jenkins? (For example, you wanted to use a new feature.)
              • I highly recommend using LTS releases in production. You had better downgrade to the latest LTS version if you can.
            • Do you use distributed builds? If so, you should check the logs of the slave nodes. I often see such logs when a slave crashes.
            • Are there any entries in the Windows Event Logs?
            fvaletas Franck Valetas added a comment -

            I have the same trouble on the same config.

            swiniak Andrzej Pasterczyk added a comment -

            Nope, I haven't found anything that would point to the build taking too long.

            • Yes, always copy artifact fails
            • Mostly in one project
            • Different nodes fail randomly
            • Hmmm... good question, any way I can determine that post upgrade? I've upgraded to 1.29 of copy artifact and then almost immediately to 1.30 since it appeared then so I don't have "downgrade to X.YY" shown with the original version that I've started
            • had issues with some other plugin and one led to another
              • unfortunately I went the wrong path

            I'll get back to you on the rest

            Still - it was working fine "few" versions ago so something is clearly wrong... maybe copy artifact plugin could be extended with some retry procedure in case of file copy issues?

            danielbeck Daniel Beck added a comment -

            Looks like a possible duplicate of JENKINS-22734

            swiniak Andrzej Pasterczyk added a comment - - edited

            Some more details:

            • we transfer small files (few MB max) although they sum up to over 100MB, stack traces are different than in JENKINS-22734 (we always get SocketException on copy artifact, nothing else)
            • seems that slave node is dying (not 100% sure if related and what happens first - slave down or copy problem), here's the entry from log file
              Connecting to FS12TEST
              Checking if Java exists
              java -version returned 1.7.0.
              Installing the Jenkins slave service
              Copying jenkins-slave.exe
              Copying slave.jar
              Copying jenkins-slave.xml
              Registering the service
              Starting the service
              Waiting for the service to become ready
              Connecting to port 53,955
              <===[JENKINS REMOTING CAPACITY]===>   Slave.jar version: 2.40
              This is a Windows slave
              Effective SlaveRestarter on FS12TEST: null
              Slave successfully connected and online
              ERROR: Connection terminated
              ha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=java.net.SocketException: Connection reset
              	at java.net.SocketInputStream.read(Unknown Source)
              	at java.io.FilterInputStream.read(Unknown Source)
              	at java.io.BufferedInputStream.fill(Unknown Source)
              	at java.io.BufferedInputStream.read(Unknown Source)
              	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:77)
              	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:67)
              	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:93)
              	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
              	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
              	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
              Stopping the service
              Unregistering the service
              
            danielbeck Daniel Beck added a comment -

            Copy Artifact has a transfer mode that basically copies all files in one step, so chances are it accumulates the data of all files into the buffer.

            It's suspicious that FlightRecorderInputStream is in the stack trace.

            My suggestion is to upgrade to 1.563 ASAP (maybe even the current RC if urgent), or to downgrade to 1.559, and see whether the issue is gone. If so, this is a duplicate.
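
As a rough aid for the suggestion above, here is a minimal Java sketch that flags Jenkins core versions lying strictly between 1.559 and 1.563. Treating exactly that range as suspect is an assumption inferred from the suggested downgrade and upgrade targets, not a confirmed list of affected releases. The class and method names are hypothetical and not part of Jenkins.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Jenkins: flags core versions that fall
// strictly between the two versions suggested in the comment above.
class SuspectVersionCheck {
    // Assumed bounds, taken from the advice to downgrade to 1.559 or upgrade to 1.563.
    static final String LAST_GOOD = "1.559";
    static final String FIRST_FIXED = "1.563";

    // Compares dotted numeric versions segment by segment ("1.561" vs "1.563").
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Splits "1.561" into {1, 561}; non-numeric segments are not handled here.
    static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    static boolean isSuspect(String version) {
        return compare(version, LAST_GOOD) > 0 && compare(version, FIRST_FIXED) < 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.559", "1.561", "1.563"}) {
            System.out.println(v + " suspect: " + isSuspect(v));
        }
    }
}
```

The segments are compared as integers rather than as strings, so "1.10" sorts after "1.9".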

            ikedam ikedam added a comment -

            > * Hmmm... good question, any way I can determine that post upgrade? I've upgraded to 1.29 of copy artifact and then almost immediately to 1.30 since it appeared then so I don't have "downgrade to X.YY" shown with the original version that I've started
            > * had issues with some other plugin and one led to another
            > ** unfortunately I went the wrong path

            As there are cases where you cannot downgrade plugins easily (e.g. the configuration fields have changed), you had better determine whether you need a new version BEFORE upgrading.
            Anyway, this problem seems to be caused by Jenkins core, as Daniel points out, so you should switch your Jenkins core to an LTS version. That might also resolve the problems with other plugins.

            > maybe copy artifact plugin could be extended with some retry procedure in case of file copy issues?

            That feature would not help you, as the slave crashes and the build cannot continue anymore.
            JobRequeue-Plugin might help you. (As I haven't tried that plugin, I recommend you test it before introducing it to the production environment.)

            > seems that slave node is dying (not 100% sure if related and what happens first - slave down or copy problem), here's the entry from log file

            This log file is the one stored on the master, isn't it?
            (8mha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT... is an annotation appended by Jenkins)
            Please check the log files on the slave nodes. They are placed in the JENKINS_HOME you specified in the node configuration pages.
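
When going through the slave-side logs, a small scanner can help spot the relevant lines. The following is a minimal Java sketch under some assumptions: the class name is hypothetical, the marker strings are simply taken from the exceptions quoted in this issue, and the log is assumed to be readable as UTF-8. The log path is passed as an argument because the actual file name is not given in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: scans a log file for lines that hint at a crashed channel.
class AgentLogScan {
    // Markers taken from the traces quoted in this issue.
    static final String[] MARKERS = {
        "java.lang.OutOfMemoryError",
        "java.net.SocketException",
        "Unexpected error in channel"
    };

    // Returns "lineNumber: text" for every line containing one of the markers.
    static List<String> findSuspectLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            for (String marker : MARKERS) {
                if (line.contains(marker)) {
                    hits.add((i + 1) + ": " + line.trim());
                    break; // report each line only once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AgentLogScan <path-to-log>");
            return;
        }
        Path log = Paths.get(args[0]);
        // Assumes the log is UTF-8; other encodings may need a different charset.
        for (String hit : findSuspectLines(Files.readAllLines(log, StandardCharsets.UTF_8))) {
            System.out.println(hit);
        }
    }
}
```

Only the first line of each exception is matched; the stack frames below it still have to be read in the log itself.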

            swiniak Andrzej Pasterczyk added a comment -

            Thanks for explaining all this. You seem to be right about that OutOfMemory issue

            SEVERE: Unexpected error in channel channel
            java.lang.OutOfMemoryError: Requested array size exceeds VM limit
            	at java.util.Arrays.copyOf(Unknown Source)
            	at java.io.ByteArrayOutputStream.grow(Unknown Source)
            	at java.io.ByteArrayOutputStream.ensureCapacity(Unknown Source)
            	at java.io.ByteArrayOutputStream.write(Unknown Source)
            	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:87)
            	at hudson.remoting.ChunkedInputStream.read(ChunkedInputStream.java:46)
            	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:88)
            	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
            	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
            	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
            
            Exception in thread "Channel reader thread: channel" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
            	at java.util.Arrays.copyOf(Unknown Source)
            	at java.io.ByteArrayOutputStream.grow(Unknown Source)
            	at java.io.ByteArrayOutputStream.ensureCapacity(Unknown Source)
            	at java.io.ByteArrayOutputStream.write(Unknown Source)
            	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:87)
            	at hudson.remoting.ChunkedInputStream.read(ChunkedInputStream.java:46)
            	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:88)
            	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
            	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
            	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
            channel stopped
            channel started
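
The trace above shows ByteArrayOutputStream.grow being reached from FlightRecorderInputStream.read, which suggests a diagnostic copy of the stream that grows with the amount of data read. The following Java sketch illustrates the general alternative of recording only a bounded tail of the bytes in a fixed-size ring buffer. It is an illustration of that technique only, not the hudson.remoting code and not the actual fix for JENKINS-22734. The class name and capacity are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: NOT the hudson.remoting implementation.
// Wraps a stream and keeps a copy of the most recent bytes read, for diagnostics.
class BoundedRecordingInputStream extends FilterInputStream {
    private final byte[] ring; // fixed-size buffer, so memory use stays capped
    private int next;          // index where the next recorded byte goes
    private long total;        // total number of bytes recorded so far

    // capacity must be greater than zero
    BoundedRecordingInputStream(InputStream in, int capacity) {
        super(in);
        this.ring = new byte[capacity];
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            record((byte) b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) {
            record(buf[off + i]);
        }
        return n;
    }

    private void record(byte b) {
        ring[next] = b;                  // overwrite the oldest byte once full
        next = (next + 1) % ring.length;
        total++;
    }

    // Returns the recorded tail in the order it was read (at most `capacity` bytes).
    byte[] recorded() {
        int size = (int) Math.min(total, ring.length);
        byte[] out = new byte[size];
        int start = total > ring.length ? next : 0;
        for (int i = 0; i < size; i++) {
            out[i] = ring[(start + i) % ring.length];
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        BoundedRecordingInputStream in =
                new BoundedRecordingInputStream(new ByteArrayInputStream(data), 4);
        while (in.read() >= 0) {
            // drain the stream
        }
        // Only the last 4 bytes are retained, however much was read.
        System.out.println(new String(in.recorded(), StandardCharsets.US_ASCII));
    }
}
```

With a capacity of 4 and ten input bytes, only the final four bytes are kept, so the recorder's memory use does not depend on how much data passes through the channel.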
            
            danielbeck Daniel Beck added a comment -

            General agreement seems to be that this duplicates JENKINS-22734.

            If JENKINS-22734 is confirmed fixed in 1.563 onward and this still occurs, feel free to reopen (with new stack traces etc.).

            swiniak Andrzej Pasterczyk added a comment -

            Fix in 1.563 confirmed. Thanks


              People

              Assignee:
              Unassigned
              Reporter:
              swiniak Andrzej Pasterczyk
              Votes:
              2
              Watchers:
              3
