Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-61409

Websocket connections crash if message size is greater than 64Kb

    • Jenkins 2.229

      Websocket connections fail if the response from a pipeline step is greater than 64kb. 

      Steps to recreate the problem:

      1. Run a jenkins server using the docker container jenkins/jenkins:centos
      2. Create a New Node with webSocket enabled
      3. On another Linux VM/slave machine, run the agent.jar with -webSocket option to connect to the Jenkins Master
      4. Create a pipeline with a step that would output text >64kb. For example, create a text file with size > 64Kb. 'cat' the file as a pipeline step. 
      pipeline {
        agent {
          label 'test-channelclose'
        }  
        stages {
          stage ('Debug Issue') {
            steps {
              sh 'cat build_output.log'
              //archiveArtifacts 'build_output.log'
            }
          }
        }
      }
      

      5. Running the pipeline will break with the following error.

      Mar 07, 2020 2:31:28 PM jenkins.agents.WebSocketAgents$Session error
      WARNING: null
      org.eclipse.jetty.websocket.api.MessageTooLargeException: Binary message size [524494] exceeds maximum size [65536]
              at org.eclipse.jetty.websocket.api.WebSocketPolicy.assertValidBinaryMessageSize(WebSocketPolicy.java:128)
              at org.eclipse.jetty.websocket.common.Parser.assertSanePayloadLength(Parser.java:133)
              at org.eclipse.jetty.websocket.common.Parser.parseFrame(Parser.java:494)
              at org.eclipse.jetty.websocket.common.Parser.parseSingleFrame(Parser.java:253)
              at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:460)
              at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:441)
              at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
              at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
              at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
              at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
              at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
              at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
              at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
              at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388)
              at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
              at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
              at java.lang.Thread.run(Thread.java:748)

      Note: One of the workaround is to write the output to a file and archive the file. However, not all pipeline steps support an easy way to write the output to file.

          [JENKINS-61409] Websocket connections crash if message size is greater than 64Kb

          Using a large value can potentially cause memory issues on the server. Note that the slave connection is crashing anyway.

          I think the size should be configurable so that we can change it based on the environment that we run our Jenkins servers on.

          Karthik Jayaraman added a comment - Using a large value can potentially cause memory issues on the server. Note that the slave connection is crashing anyway. I think the size should be configurable so that we can change it based on the environment that we run our Jenkins servers on.

          Jesse Glick added a comment -

          I do not wish to require any configuration. The Remoting communication should just work, reliably, regardless of workload.

          Jesse Glick added a comment - I do not wish to require any configuration. The Remoting communication should just work, reliably, regardless of workload.

          Jesse Glick added a comment -

          olivergondza remember that if backporting to 2.222.x, we would want to create a component backport so as to get something like remoting.version=4.2.1. Otherwise we would be including https://github.com/jenkinsci/remoting/pull/369 in LTS, which includes a couple of production source code changes, albeit small (Util.mkdirsFiles.createDirectories).

          Jesse Glick added a comment - olivergondza remember that if backporting to 2.222.x, we would want to create a component backport so as to get something like remoting.version=4.2.1 . Otherwise we would be including https://github.com/jenkinsci/remoting/pull/369 in LTS, which includes a couple of production source code changes, albeit small ( Util.mkdirs → Files.createDirectories ).

          Jeff Thompson added a comment -

          Should I create that intermediate version of Remoting, without those minor cleanup changes?

          Jeff Thompson added a comment - Should I create that intermediate version of Remoting, without those minor cleanup changes?

          jglick,jthompson, I chose not to backport this mostly for being too new. Since the unrelated changes are far less scary than the body of the fix, I suggest to target 2.222.3 without special remoting release.

          Note to self: this needs to be backported as well https://github.com/jenkinsci/jenkins/pull/4601

          Oliver Gondža added a comment - jglick , jthompson , I chose not to backport this mostly for being too new. Since the unrelated changes are far less scary than the body of the fix, I suggest to target 2.222.3 without special remoting release. Note to self: this needs to be backported as well https://github.com/jenkinsci/jenkins/pull/4601

          Oleg Nenashev added a comment -

          olivergondza JEP-222 is a pretty important feature. Even if it is experimental, it would be great to have it fixed for LTS users so that we can get more adoption and feedback from early adopters.

          For me backporting 4,3 is a no-go option due to removal of the deprecated code in https://github.com/jenkinsci/remoting/pull/369 . This change also adds serial version IDs, and it may lead to a funny behavior if classes are serialized over the Remoting channel. I would advice against taking the risk even in 2.222.3. CC raihaan who submitted a patch.

          At the same time https://github.com/jenkinsci/remoting/pull/373 would be considerable for 2.222.2/3 if it was released as Remoting 4.2.1. It is still a high-risk change due to patches under the hood of Remoting engine (JNLP4 might be potentially affected by changes), but personally I would give it a try for 2.222.2 assuming that extensive testing of the patch is performance for the JNLP4 mode.

           

           

           

           

          Oleg Nenashev added a comment - olivergondza JEP-222 is a pretty important feature. Even if it is experimental, it would be great to have it fixed for LTS users so that we can get more adoption and feedback from early adopters. For me backporting 4,3 is a no-go option due to removal of the deprecated code in  https://github.com/jenkinsci/remoting/pull/369  . This change also adds serial version IDs, and it may lead to a funny behavior if classes are serialized over the Remoting channel. I would advice against taking the risk even in 2.222.3. CC raihaan who submitted a patch. At the same time  https://github.com/jenkinsci/remoting/pull/373  would be considerable for 2.222.2/3 if it was released as Remoting 4.2.1. It is still a high-risk change due to patches under the hood of Remoting engine (JNLP4 might be potentially affected by changes), but personally I would give it a try for 2.222.2 assuming that extensive testing of the patch is performance for the JNLP4 mode.        

          Jesse Glick added a comment -

          It is still a high-risk change due to patches under the hood of Remoting engine (JNLP4 might be potentially affected by changes)

          I cannot see how. None of the affected code is outside the WebSocket-specific handler method used in the -webSocket mode new in 2.222.x, except for a new method in AbstractByteBufferCommandTransport which is newly called in that modified code.

          Jesse Glick added a comment - It is still a high-risk change due to patches under the hood of Remoting engine (JNLP4 might be potentially affected by changes) I cannot see how. None of the affected code is outside the WebSocket-specific handler method used in the -webSocket mode new in 2.222.x, except for a new method in AbstractByteBufferCommandTransport which is newly called in that modified code.

          Jeff Thompson added a comment -

          I agree with Jesse. I don't see any risk to previously existing functionality for this change to better support WebSockets.

          I'm in the middle of preparing a 4.2.1 patch containing just the WebSockets fix over 4.2. That eliminates any concern over https://github.com/jenkinsci/remoting/pull/369 .

          Jeff Thompson added a comment - I agree with Jesse. I don't see any risk to previously existing functionality for this change to better support WebSockets. I'm in the middle of preparing a 4.2.1 patch containing just the WebSockets fix over 4.2. That eliminates any concern over https://github.com/jenkinsci/remoting/pull/369 .

          Jeff Thompson added a comment -

          The Remoting 4.2.1 release is out and available in repo.jenkins.io. The release notes are at https://github.com/jenkinsci/remoting/releases/tag/remoting-4.2.1 . I put together a PR to integrate that version into the 2.222.x LTS Jenkins branch at https://github.com/jenkinsci/jenkins/pull/4635 .

          Jeff Thompson added a comment - The Remoting 4.2.1 release is out and available in repo.jenkins.io. The release notes are at https://github.com/jenkinsci/remoting/releases/tag/remoting-4.2.1 . I put together a PR to integrate that version into the 2.222.x LTS Jenkins branch at https://github.com/jenkinsci/jenkins/pull/4635 .

            vlatombe Vincent Latombe
            kjayaraman Karthik Jayaraman
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: