Jenkins / JENKINS-12037

CLI - I/O error in channel Chunked connection/Unexpected termination of the channel - still occurring in Jenkins 1.449

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: cli
    • Labels: None
      * Running on SLES9 Linux server with 4 CPUs and plenty of disk space.
      * Tomcat 7.0.22
      * JDK 1.6.0_14
      * Only ONE Master configuration - no slaves are configured
      * 3 Executors - (one less than the max number of CPUs)

      We reported an issue some time back that was also listed as fixed in Jenkins 1.441:
      Log:
      [FIXED JENKINS-11130] SEVERE: I/O error in channel Chunked connection when using jenkins-cli.jar

      Rather than submit another bug, I have added the following comments below the line to the comments section of the original defect JENKINS-11130, in case it can be reviewed/re-opened.

      We did NOT try to make any adjustments to the Tomcat configuration:

      Tomcat Connector connectionUploadTimeout

      but we are also now seeing the same problem with Winstone at this same 1.441 level. We reverted to the 1.438 version of the CLI (leaving the 1.441 WAR running in Winstone), and that is serving as the current workaround.
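As a footnote on the connector setting mentioned above: it lives in Tomcat's conf/server.xml, and a quick grep shows whether any timeout attributes are set on the connectors. A minimal sketch (the helper name is ours; the example path is the install directory described in this report):

```shell
# show_connector_timeouts SERVER_XML
# Print any timeout-related attributes configured on Tomcat connectors.
show_connector_timeouts() {
    grep -E 'connectionUploadTimeout|disableUploadTimeout|connectionTimeout' "$1"
}

# Example (path from this report's environment):
# show_connector_timeouts /opt/apache-tomcat-7.0.22_jenkins/conf/server.xml
```

If the grep prints nothing, the connector is running on Tomcat's built-in defaults for those timeouts.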

      ================================================================================================

      We have downloaded and installed the LATEST 1.441 release, which lists the fix for this problem. Until now we were running 1.438 on Winstone only (with Tomcat 6 or 7 we had experienced the error, but under Winstone it worked OK, so that was our workaround while running 1.438).

      Now with Jenkins 1.441 we are getting the ERROR again, and NOW WITH BOTH the Winstone and Tomcat configurations. We have left the Jenkins 1.441 WAR file in place running on Winstone, and reverted the CLI jar file back to the 1.438 version for now; that appears to work again with Winstone.

      Checked Manifest of CLI jar downloaded with the 1.441 WAR installation:

      Manifest-Version: 1.0
      Archiver-Version: Plexus Archiver
      Created-By: Apache Maven
      Built-By: kohsuke
      Build-Jdk: 1.6.0_26
      Main-Class: hudson.cli.CLI
      Jenkins-CLI-Version: 1.441

      Under Tomcat 7, we get this stacktrace:

      Started by command line
      [workspace] $ /bin/bash -xe /opt/apache-tomcat-7.0.22_jenkins/temp/hudson32817888834817830.sh
      + /opt/Sun/jdk1.6.0_14/bin/java -jar /opt/Sun/jdk1.6.0_14/lib/jenkins-cli.jar -s http://11.22.33.44:8082/jenkins/ build XYZ_Project-SharedLibs -s -p SVN_PATH=trunk
      Dec 5, 2011 12:59:11 PM hudson.remoting.Channel$ReaderThread run
      SEVERE: I/O error in channel Chunked connection to http://11.22.33.44:8082/jenkins/cli
      java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      Exception in thread "main" hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.call(Request.java:149)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
      at $Proxy2.main(Unknown Source)
      at hudson.cli.CLI.execute(CLI.java:200)
      at hudson.cli.CLI._main(CLI.java:330)
      at hudson.cli.CLI.main(CLI.java:245)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.abort(Request.java:273)
      at hudson.remoting.Channel.terminate(Channel.java:732)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1139)
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      Build step 'Execute shell' marked build as failure
      Notifying upstream projects of job completion
      Finished: FAILURE

      Under Winstone, we get this stacktrace - it's somewhat different:

      Started by command line
      [workspace] $ /bin/bash -xe /tmp/hudson10791816374444704.sh
      + /opt/Sun/jdk1.6.0_14/bin/java -jar /opt/Sun/jdk1.6.0_14/lib/jenkins-cli.jar -s http://11.22.33.44:8082/jenkins/ build XYZ_Project-SharedLibs -s -p SVN_PATH=trunk
      Dec 5, 2011 1:18:22 PM hudson.remoting.Channel$ReaderThread run
      SEVERE: I/O error in channel Chunked connection to http://11.22.33.44:8082/jenkins/cli
      java.io.IOException: Premature EOF
      at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:538)
      at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:582)
      at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:669)
      at java.io.FilterInputStream.read(FilterInputStream.java:116)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2504)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2499)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2488)
      at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      Exception in thread "main" hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Premature EOF
      at hudson.remoting.Request.call(Request.java:149)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
      at $Proxy2.main(Unknown Source)
      at hudson.cli.CLI.execute(CLI.java:200)
      at hudson.cli.CLI._main(CLI.java:330)
      at hudson.cli.CLI.main(CLI.java:245)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Premature EOF
      at hudson.remoting.Request.abort(Request.java:273)
      at hudson.remoting.Channel.terminate(Channel.java:732)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1139)
      Caused by: java.io.IOException: Premature EOF
      at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:538)
      at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:582)
      at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:669)
      at java.io.FilterInputStream.read(FilterInputStream.java:116)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2504)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2499)
      at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2488)
      at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
      at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      Build step 'Execute shell' marked build as failure
      Notifying upstream projects of job completion
      Finished: FAILURE

          [JENKINS-12037] CLI - I/O error in channel Chunked connection/Unexpected termination of the channel - still occurring in Jenkins 1.449

          mark streit created issue -

          Richard Mortimer added a comment -

          For completeness can you get the associated logs from the jenkins server side. That should confirm that the server side is closing the downstream socket because it has detected inactivity/timeout on the upstream side.

          How long does the cli command take to fail? A rough (to the nearest second or so) timing is all that is needed. The issue was due to webserver connection timeout before so it is likely to be something like 60, 30, 20 or 15 seconds.
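One way to get that rough timing is a small wrapper that prints whole-second elapsed time; a sketch (the wrapper name is ours, and the commented example reuses the placeholder host and job name from this report):

```shell
# time_to_fail CMD...
# Run CMD and print the elapsed time in whole seconds. Handy for
# checking whether the CLI dies after a suspiciously round interval
# (60/30/20/15 s) that would point at a webserver connection timeout.
time_to_fail() {
    local start end
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example (host and job name are the placeholder values from this report):
# time_to_fail java -jar jenkins-cli.jar \
#     -s http://11.22.33.44:8082/jenkins/ build XYZ_Project-SharedLibs -s
```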

          Are you able to get a packet trace of the network traffic between cli and jenkins? capturing from the cli side with wireshark or tcpdump should be fine. If you are able to get that and can share it then drop me an email at oldelvet@java.net so that we can arrange to transfer it in private.
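A minimal sketch of such a capture from the CLI side (host and port are the placeholder values used in this report; the actual tcpdump line is commented out since it normally needs root, and the resulting pcap can be opened in wireshark):

```shell
# Capture the CLI <-> Jenkins traffic from the CLI side.
# Host and port below are the placeholder values from this report.
JENKINS_HOST=11.22.33.44
JENKINS_PORT=8082
FILTER="host $JENKINS_HOST and tcp port $JENKINS_PORT"
echo "capture filter: $FILTER"
# Uncomment to actually capture (run as root; -s 0 grabs full packets):
# tcpdump -i any -s 0 -w jenkins-cli.pcap $FILTER
```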

          I don't have a lot of time at the moment so if anyone else wants to take a look instead of me please feel free to do so!


          mark streit added a comment -

          I will get logs from the Tomcat server. It also failed with Winstone this time whereas before it worked with Winstone when using the 1.441 CLI. I will check on those logs assuming they are somewhere in the location where we put the WAR. Failure hit within 10 sec each time.

          Thanks


          mark streit added a comment -

          Have attached logs from the Tomcat server where we are running this. We have downloaded the latest 1.449 WAR and the corresponding 1.449 CLI jar. We repeated the test and are seeing this I/O error in the chunked connection again. (Right now only 1.441 running in Winstone works without this error.)

          Console output from the Jenkins web page is also included as is the Shell command text that occupies the Execute Shell section of the build job configuration page.

          This all works with 1.441 as long as the container is Winstone.

          mark streit made changes -
          Attachment New: Tomcat7_Jenkins1449_logs.zip [ 21501 ]
          Summary Original: CLI - I/O error in channel Chunked connection/Unexpected termination of the channel - still occurring in Jenkins 1.441 New: CLI - I/O error in channel Chunked connection/Unexpected termination of the channel - still occurring in Jenkins 1.449

          Richard Mortimer added a comment -

          I'm still max'd out with $dayjob and have not had time to look in much detail at this.

          I just did a quick test on my test system with 1.451 using tomcat 6 and do not see any timeouts there with the tomcat http data upload timeout set to 20 seconds. That rules out any basic brokenness (I hope!).

          Looking at your log files (see below) there is one instance where an error occurs 12 seconds after the previous timeout. That suggests that tomcat 7 is timing out in around 10 seconds. The jenkins channel keepalive ping is set for 15 seconds so it could be that the ping is still not aggressive enough.

          We probably just need to test with a 5 seconds ping time and see if that fixes things.

          First error reported in ConsoleOutput.log

          Feb 6, 2012 2:47:38 PM hudson.remoting.Channel$ReaderThread run
          SEVERE: I/O error in channel Chunked connection to http://99.99.99.94:8082/jenkins/cli
          java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
          Caused by: java.io.EOFException
          

          This corresponds to the following two connections in localhost_access_log.2012-02-06.txt

          10.25.50.94 - - [06/Feb/2012:14:47:38 -0500] "POST /jenkins/cli HTTP/1.1" 200 13935
          10.25.50.94 - - [06/Feb/2012:14:47:38 -0500] "POST /jenkins/cli HTTP/1.1" 200 -
          

          The former is the downstream link and the latter is the upstream (sending from cli to jenkins) hence there being no output in that direction.

          catalina.out (equivalent to standalone jenkins.log) has the other side of the error
          This confirms that the upstream reader sees the input data closed by tomcat.

          Feb 6, 2012 2:47:38 PM hudson.remoting.Channel$ReaderThread run
          SEVERE: I/O error in channel HTTP full-duplex channel b625b58a-d9d7-4a42-856c-287e953edb47
          java.net.SocketTimeoutException: Read timed out
          	at java.net.SocketInputStream.socketRead0(Native Method)
          

          The next error in the console output occurs 12 seconds later.

          Feb 6, 2012 2:48:00 PM hudson.remoting.Channel$ReaderThread run
          SEVERE: I/O error in channel Chunked connection to http://99.99.99.94:8082/jenkins/cli
          java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.Channel$ReaderThread.run(Channel.java:1133)
          
          	
          10.25.50.94 - - [06/Feb/2012:14:48:00 -0500] "POST /jenkins/cli HTTP/1.1" 200 13935
          10.25.50.94 - - [06/Feb/2012:14:48:00 -0500] "POST /jenkins/cli HTTP/1.1" 200 -
          

          The access log has a couple of extra cli commands in the meantime. It isn't clear whether these are for the same job or for something else. Assuming they are for something else that means that the 2nd command timed out roughly 10 seconds after the previous cli command.

          10.25.50.94 - - [06/Feb/2012:14:47:39 -0500] "POST /jenkins/cli HTTP/1.1" 200 10156
          10.25.50.94 - - [06/Feb/2012:14:47:39 -0500] "POST /jenkins/cli HTTP/1.1" 200 -
          
          10.25.50.94 - - [06/Feb/2012:14:47:39 -0500] "POST /jenkins/cli HTTP/1.1" 200 10156
          10.25.50.94 - - [06/Feb/2012:14:47:39 -0500] "POST /jenkins/cli HTTP/1.1" 200 -
          

          I wonder if the http data upload inactivity timeout is 10 seconds in tomcat7.


          Richard Mortimer added a comment - edited

          Oops comment added to wrong issue!


          mark streit added a comment -

          with regard to your question about the cli commands showing in the logs -

          "The access log has a couple of extra cli commands in the meantime. It isn't clear whether these are for the same job or for something else. Assuming they are for something else that means that the 2nd command timed out roughly 10 seconds after the previous cli command."

          This test was run with no other jobs running; the only thing running was the command shell that runs the jobs with the -s parameter. It "appears" to us that the -s parameter gets ignored and other jobs start running before we expect them to.


          Richard Mortimer added a comment -

          If I build a test version for you with a 5 second heartbeat timeout are you able to test it? If so is there any particular Jenkins version that you would be testing with so that I can build a matching cli jar file.

          Regarding the -s parameter being ignored, I vaguely remember a bug reported in JIRA where it does not wait if there is already a queued instance of the same job. From what you describe this may not be applicable to your circumstances, but it could be.

          mark streit made changes -
          Priority Original: Major [ 3 ] New: Critical [ 2 ]

            Assignee: Unassigned
            Reporter: mark streit (mcs13099)
            Votes: 0
            Watchers: 4
