• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • docker-plugin
    • None

      We have been seeing a consistent "Slave went offline" message when using the docker plugin to schedule our ssh build slaves. The build appears to be working fine. I can attach to the slave container and still see the slave.jar process as well as the build job running inside the container.

      However, when I look at the Nodes page, it looks like the master eventually loses its connection to the slave and starts counting down to failure. The build eventually fails after 15-17 minutes. I have disabled the slave and master ping threads as documented at https://wiki.jenkins-ci.org/display/JENKINS/Ping+Thread

      We started experiencing this issue after jumping from docker-plugin 0.9.* to 0.16.*

          [JENKINS-34043] Docker SSH Slave loses connectivity

          Brandon Raabe added a comment - - edited

          Here's the stack trace:

          16:00:42 ERROR: Connection was broken: java.io.IOException: Sorry, this connection is closed.
          16:00:42 at com.trilead.ssh2.transport.TransportManager.ensureConnected(TransportManager.java:587)
          16:00:42 at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:660)
          16:00:42 at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:407)
          16:00:42 at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:347)
          16:00:42 at com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:943)
          16:00:42 at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58)
          16:00:42 at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:79)
          16:00:42 at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
          16:00:42 at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
          16:00:42 at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
          16:00:42 at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
          16:00:42 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          16:00:42 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          16:00:42 Caused by: java.net.SocketException: Connection timed out
          16:00:42 at java.net.SocketOutputStream.socketWrite0(Native Method)
          16:00:42 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
          16:00:42 at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
          16:00:42 at com.trilead.ssh2.crypto.cipher.CipherOutputStream.flush(CipherOutputStream.java:75)
          16:00:42 at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:193)
          16:00:42 at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
          16:00:42 at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:677)
          16:00:42 at com.trilead.ssh2.channel.ChannelManager.sendData(ChannelManager.java:429)
          16:00:42 at com.trilead.ssh2.channel.ChannelOutputStream.write(ChannelOutputStream.java:63)
          16:00:42 at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:94)
          16:00:42 at hudson.remoting.ChunkedOutputStream.sendBreak(ChunkedOutputStream.java:66)
          16:00:42 at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:46)
          16:00:42 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:45)
          16:00:42 at hudson.remoting.Channel.send(Channel.java:582)
          16:00:42 at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:261)
          16:00:42 at hudson.remoting.PipeWriter$1.run(PipeWriter.java:158)
          16:00:42 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
          16:00:42 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          16:00:42 at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
          16:00:42 at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
          16:00:42 at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
          16:00:42 at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
          16:00:42 at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          16:00:42 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          16:00:42 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          16:00:42 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          16:00:42 at java.lang.Thread.run(Thread.java:745)
          16:00:42
          16:00:42 Build step 'Execute shell' marked build as failure
          16:00:42 FATAL: channel is already closed
          16:00:42 hudson.remoting.ChannelClosedException: channel is already closed
          16:00:42 at hudson.remoting.Channel.send(Channel.java:578)
          16:00:42 at hudson.remoting.Request.call(Request.java:130)
          16:00:42 at hudson.remoting.Channel.call(Channel.java:780)
          16:00:42 at hudson.Launcher$RemoteLauncher.kill(Launcher.java:953)
          16:00:42 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:540)
          16:00:42 at hudson.model.Run.execute(Run.java:1738)
          16:00:42 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          16:00:42 at hudson.model.ResourceController.execute(ResourceController.java:98)
          16:00:42 at hudson.model.Executor.run(Executor.java:410)
          16:00:42 Caused by: java.io.IOException: Sorry, this connection is closed.
          16:00:42 at com.trilead.ssh2.transport.TransportManager.ensureConnected(TransportManager.java:587)
          16:00:42 at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:660)
          16:00:42 at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:407)
          16:00:42 at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:347)
          16:00:42 at com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:943)
          16:00:42 at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58)
          16:00:42 at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:79)
          16:00:42 at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
          16:00:42 at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
          16:00:42 at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
          16:00:42 at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
          16:00:42 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
          16:00:42 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
          16:00:42 Caused by: java.net.SocketException: Connection timed out
          16:00:42 at java.net.SocketOutputStream.socketWrite0(Native Method)
          16:00:42 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
          16:00:42 at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
          16:00:42 at com.trilead.ssh2.crypto.cipher.CipherOutputStream.flush(CipherOutputStream.java:75)
          16:00:42 at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:193)
          16:00:42 at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
          16:00:42 at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:677)
          16:00:42 at com.trilead.ssh2.channel.ChannelManager.sendData(ChannelManager.java:429)
          16:00:42 at com.trilead.ssh2.channel.ChannelOutputStream.write(ChannelOutputStream.java:63)
          16:00:42 at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:94)
          16:00:42 at hudson.remoting.ChunkedOutputStream.sendBreak(ChunkedOutputStream.java:66)
          16:00:42 at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:46)
          16:00:42 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:45)
          16:00:42 at hudson.remoting.Channel.send(Channel.java:582)
          16:00:42 at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:261)
          16:00:42 at hudson.remoting.PipeWriter$1.run(PipeWriter.java:158)
          16:00:42 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
          16:00:42 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          16:00:42 at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
          16:00:42 at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
          16:00:42 at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
          16:00:42 at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
          16:00:42 at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          16:00:42 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          16:00:42 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          16:00:42 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          16:00:42 at java.lang.Thread.run(Thread.java:745)
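
For a trace this long, the exception headlines are usually easier to compare than the individual frames. The following is a minimal sketch, assuming console lines carry an HH:MM:SS prefix as in the log above; the function name `exception_headlines` and the shortened sample text are illustrative and not part of Jenkins or the docker-plugin.

```python
import re

# Optional HH:MM:SS console timestamp at the start of a line (assumed format).
TIMESTAMP = re.compile(r"^\s*\d{2}:\d{2}:\d{2}\s*")
# Lines that introduce an exception, as opposed to "at ..." stack frames.
HEADLINE = re.compile(r"^(ERROR|FATAL|Caused by): (.*)$")

# Shortened excerpt of the console output quoted in this comment.
SAMPLE_LOG = """\
16:00:42 ERROR: Connection was broken: java.io.IOException: Sorry, this connection is closed.
16:00:42 at com.trilead.ssh2.transport.TransportManager.ensureConnected(TransportManager.java:587)
16:00:42 Caused by: java.net.SocketException: Connection timed out
16:00:42 at java.net.SocketOutputStream.socketWrite0(Native Method)
16:00:42
16:00:42 Build step 'Execute shell' marked build as failure
16:00:42 FATAL: channel is already closed
"""


def exception_headlines(log_text):
    """Return (kind, message) pairs for ERROR, FATAL and 'Caused by' lines."""
    headlines = []
    for raw in log_text.splitlines():
        # Drop the timestamp prefix, then any remaining surrounding whitespace.
        line = TIMESTAMP.sub("", raw, count=1).strip()
        match = HEADLINE.match(line)
        if match:
            headlines.append((match.group(1), match.group(2)))
    return headlines


for kind, message in exception_headlines(SAMPLE_LOG):
    print(f"{kind}: {message}")
```

Applied to the full log, the last "Caused by" entry in each chain is the `java.net.SocketException: Connection timed out` raised while writing to the socket.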

          pjdarton added a comment -

          brandocorp Is this still an issue?

          There have been a lot of changes since you raised this, including changes to how SSH connection parameters get passed from the docker-plugin to the SSH slave connection code.

          If it's not still an issue, please let me know.

          pjdarton added a comment -

          Closed due to lack of response.

          Note to anyone stumbling upon this in future: There's a myriad of reasons why a slave (docker or otherwise) might lose its connection to the Jenkins master; without further detail, it's not worth speculating.
          In this particular case, the connection code has undergone a lot of changes between when this report was originally raised and the current day, and given the lack of response I'd guess that these changes have fixed whatever the problem was.

            Assignee: Unassigned
            Reporter: Brandon Raabe (brandocorp)
            Votes: 0
            Watchers: 2
