All slaves disconnect and no new slaves can connect due to CancelledKeyException in org.jenkinsci.remoting

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Environment: Enterprise Linux 5.x master, Windows and Linux slaves of varying releases. Slaves are added and removed reasonably frequently in a way similar to the EC2Plugin (although others have reported the problem with snapshot reverting and even with regular slaves)

      We have an issue where we get a CancelledKeyException, 100% of our slaves disconnect, and no new slaves can connect until a restart happens. The issue seems to happen randomly.

      See: https://issues.jenkins-ci.org/browse/JENKINS-22932?focusedCommentId=205983&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-205983#JENKINS-22932 and later for some more context.

      The full error message in the build is:
      FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      at hudson.remoting.Request.call(Request.java:174)
      at hudson.remoting.Channel.call(Channel.java:739)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168)
      at com.sun.proxy.$Proxy83.join(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:956)
      at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:772)
      at hudson.model.Build$BuildExecution.build(Build.java:199)
      at hudson.model.Build$BuildExecution.doRun(Build.java:160)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:535)
      at hudson.model.Run.execute(Run.java:1732)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:234)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      at hudson.remoting.Request.abort(Request.java:299)
      at hudson.remoting.Channel.terminate(Channel.java:802)
      at hudson.remoting.Channel$2.terminate(Channel.java:483)
      at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
      at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:195)
      at org.jenkinsci.remoting.nio.NioChannelHub.abortAll(NioChannelHub.java:618)
      at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:592)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
      Caused by: java.io.IOException: Failed to abort
      ... 9 more
      Caused by: java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
      at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
      at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
      at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:513)
      ... 6 more

          Kevin Browder added a comment -

          JENKINS-24050 was opened since it's actually a different issue than JENKINS-22932 (which probably should be reclosed)

          James Noonan added a comment -

          I was going to raise a second defect, but I think this is similar enough.

          When the problem occurs, the Slaves Console shows 'Connected'. However, the master shows them all disconnected. The only way to recover so far is to restart Jenkins.
          We are running the master on WindowsServer2012, on VMWare. We are running about 70 slaves, a mix of OSX10.9, Win7, and Linux Sled 11 on VMWare. There are some other variants. We are running Jenkins 1.563.

          This issue has occurred three times for us. Two cases are independent; one occurred shortly after the first and the JVM was not restarted, so perhaps recovery between the 1st and 2nd time was not complete. We have not identified a trigger cause for this problem.

          The thread count starts to increase linearly once the problem occurs, but we believe that this is a symptom. In the JavaMelody Monitoring Plugin, the thread count reported for the machine may differ between two places. The graph showed 4000 (it was running but down for 30 hours). However, the thread count below showed 400. I believe that the first figure may be the JVM's count while the second is Jenkins'. In normal operation, we see about 200 threads. (However, we restarted, so I am not 100% sure that this is correct).

          We see the following messages in the error log. The same exception occurs for each of our slaves within a short period of time.

          Jul 31, 2014 5:13:17 AM jenkins.slaves.JnlpSlaveAgentProtocol$Handler$1 onClosed
          WARNING: NioChannelHub keys=86 gen=1625477529: Computer.threadPoolForRemoting 58 for + XXXXXXXX terminated
          java.io.IOException: Failed to abort
          at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:184)
          at org.jenkinsci.remoting.nio.NioChannelHub.abortAll(NioChannelHub.java:599)
          at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:481)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
          Caused by: java.nio.channels.ClosedChannelException
          at sun.nio.ch.SocketChannelImpl.shutdownInput(Unknown Source)
          at sun.nio.ch.SocketAdaptor.shutdownInput(Unknown Source)
          at org.jenkinsci.remoting.nio.Closeables$1.close(Closeables.java:20)
          at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:289)
          at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:226)
          at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:224)
          at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:474)
          ... 6 more

          In the first case, we also saw ping timeouts occur at about the same time as the problem. These were not present in the other case. In the latest case, a single slave lost network connectivity and we saw this exception before the 'crash' happened. However, I believe this to be a coincidence: from time to time, the exception occurs in the logs without all slaves losing connectivity.

          We see other exceptions in the logs. However, these seem to be related to us shutting down idle machines, or the Disk Usage Util plugin, and seem unrelated.

          Last week, we increased the load on our machine from about 40 slaves to 70, and also increased the number of jobs. Before this, we had not seen this problem.

          We are planning to upgrade to take in the (now reopened) fix for 22932.

          Kevin Browder added a comment -

          OK so I think the core issue is that line 513 of org.jenkinsci.remoting.nio.NioChannelHub.java is:
          if (key.isReadable()) {
          whereas I think it should be:
          if (key.isValid() && key.isReadable()) {
          I guess this would fix the issue, assuming that selectedKeys().iterator() is thread safe (I don't really know much about nio). Actually, it probably makes sense just to add a catch to one of the handlers in the same method (I think the one at http://git.io/VtniaQ).

          Basically, my thinking is that isReadable is throwing a CancelledKeyException, which gets caught by the RuntimeException handler (at http://git.io/l-5MhA). That kills the loop and attempts to abort everything, including the selector that's not valid (which gives the message in the description).
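To make the difference between the two forms of the check concrete, here is a minimal standalone sketch. It is not the NioChannelHub code: the class name KeyValidityGuard and the use of a java.nio.channels.Pipe in place of a slave connection are assumptions made purely for illustration. It cancels a key and then evaluates both the unguarded and the guarded condition.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class; not part of Jenkins remoting.
class KeyValidityGuard {
    /** Returns {unguardedThrew, guardedResult} for a key that has been cancelled. */
    static boolean[] probe() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false); // required before register()
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);
                key.cancel(); // the key is invalid from here on

                boolean unguardedThrew = false;
                try {
                    key.isReadable(); // unguarded form of the check
                } catch (CancelledKeyException e) {
                    unguardedThrew = true;
                }

                // Guarded form: isValid() is false, so isReadable() is never called.
                boolean guardedResult = key.isValid() && key.isReadable();
                return new boolean[] {unguardedThrew, guardedResult};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("unguarded threw: " + r[0] + ", guarded result: " + r[1]);
    }
}
```

Note that the guard is not atomic: if another thread could cancel the key between isValid() and isReadable(), the exception would still be possible, which is one argument for also catching it.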

          Kevin Browder added a comment - - edited

          @James: So I think the closed channel exception is actually closer to the JENKINS-22932 bug (if so, you should reopen it; I had reopened it before realizing I had a different root cause, and then closed it again). However, one could argue that the "selector" loop should actually catch all NIO errors and try again, instead of its current behavior of killing the loop entirely, so it might be the case that the fix ends up being the same.

          Additionally I've implemented a patch that implements the key.isValid() check above:
          https://github.com/kbrowder/remoting/commit/d52cef17a789bac0d1478c561c6696a82eb9ab6a
          Additionally I've got another change that captures CancelledKeyExceptions:
          https://github.com/kbrowder/remoting/commit/1dc29075e26c382b593d189a3a04cd1ab859f7c5

          Actually I think with some minor modification you could extend this last approach to catch a number of potential pitfalls

          Jesse Glick added a comment -

          Assuming the purported fix in JENKINS-22932 did in fact correct at least some variants of the bug, it should be left closed; if this issue represents some other variants, then fine—a follow-up fix can close this one, and it can be backported separately if marked lts-candidate.

          James Noonan added a comment -

          We updated to take in fix 22932 today.

          If the issue reoccurs for us, I'll raise a new defect.

          Kevin Browder added a comment -

          I have filed a pull request: https://github.com/jenkinsci/remoting/pull/24
          Some might consider both avoiding the error and catching it a bit paranoid, but I'm not entirely sure about concurrency issues with NIO. Additionally, I still don't have a good test for this. I guess you'd need to cancel the key before calling key.isReadable(), but there's probably a very narrow window there.
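For what it's worth, the window can be forced deterministically outside Jenkins. The sketch below is an illustration only: the class name SelectedThenCancelled and the Pipe standing in for a slave connection are assumptions, and the cancel is issued from the same thread, so it says nothing about what cancels the key in a real installation. It relies on the documented Selector behavior that a cancelled key stays in the selected-key set until the next selection operation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class; not part of Jenkins remoting.
class SelectedThenCancelled {
    /** Returns {stillInSelectedSet, isValid, isReadableThrew}. */
    static boolean[] reproduce() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);

                // Make the read end ready, then wait (bounded) until it is selected.
                sink.write(ByteBuffer.wrap(new byte[] {42}));
                for (int i = 0; i < 50 && selector.selectedKeys().isEmpty(); i++) {
                    selector.select(100);
                }

                // Cancel after select() has returned but before the key is processed.
                key.cancel();

                boolean stillSelected = selector.selectedKeys().contains(key);
                boolean valid = key.isValid();
                boolean threw = false;
                try {
                    key.isReadable();
                } catch (CancelledKeyException e) {
                    threw = true;
                }
                return new boolean[] {stillSelected, valid, threw};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = reproduce();
        System.out.println("in selected set: " + r[0] + ", valid: " + r[1] + ", threw: " + r[2]);
    }
}
```

A unit test along these lines would cover the handling of an already-cancelled key, though not the race that produces one.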

          Kevin Browder added a comment -

          If it wasn't clear (re-reading my last message, I guess it wasn't): yes, this represents a different variant of the issue initially presented in JENKINS-22932. Basically a different exception gets thrown, which I think I've fixed in the pull request above. It's probably possible to refactor the fix for both issues in such a way that other exceptions don't kill the main NIO/select loop in the future, but this specific issue can be fixed without that (it's easier for you guys to review this anyway).

          Kevin Browder added a comment -

          To fill this in: I hear Kohsuke is on break, so a remoting release won't happen until he comes back (I don't know when that is). Is there any way to hack a new remoting jar into our production Jenkins (without building all of Jenkins)? Is this wise (we're not on the latest Jenkins, so are these things cross-compatible, and will updates continue to work)? Basically we're getting crashes 1-3 times a day, so I'm just wondering if there's anything else anyone would recommend so we could get back to normal faster.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/org/jenkinsci/remoting/nio/NioChannelHub.java
          http://jenkins-ci.org/commit/remoting/1083a97145b83f88d9eee0a920a9495e192cd480
          Log:
          [FIXED JENKINS-24050] don't let canceled keys kill the selector thread

          In looking at the proposed PR #24 (https://github.com/jenkinsci/remoting/pull/24), I feel bit
          uneasy to mask the problem like it does.

          The code in question is looping through selected keys and processing it one by one.

          The only code that calls key.cancel() is done from the selector thread that runs this loop.
          So I don't understand how it is possible that the key picked up from selected key set is
          already cancelled here. I wonder if something more is going on.

          Regardless, I agree that this shouldn't kill the selector thread, which breaks all the slaves
          in one go. This change flags and reports the problem, kill the connection related to that key,
          then continue to serve other connections.
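The behavior described in the commit log (report the problem, drop only the affected connection, keep serving the rest) can be sketched as follows. This is not the code from the commit: the class ResilientSelectorLoop, its method names, and the two Pipes standing in for slave connections are all assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Hypothetical demo class; not part of Jenkins remoting.
class ResilientSelectorLoop {
    /** Processes the currently selected keys once; returns {served, dropped}. */
    static int[] processSelected(Selector selector) {
        int served = 0;
        int dropped = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            try {
                if (key.isReadable()) {
                    ((ReadableByteChannel) key.channel()).read(ByteBuffer.allocate(16));
                    served++;
                }
            } catch (CancelledKeyException | IOException e) {
                // Report and drop this one connection; the loop carries on.
                System.err.println("dropping connection: " + e);
                closeQuietly(key);
                dropped++;
            }
        }
        return new int[] {served, dropped};
    }

    static void closeQuietly(SelectionKey key) {
        try {
            key.channel().close();
        } catch (IOException ignored) {
            // nothing more can be done for this connection
        }
    }

    /** Two ready connections, one of which has its key cancelled before processing. */
    static int[] demo() {
        try {
            Pipe a = Pipe.open();
            Pipe b = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SinkChannel sinkA = a.sink();
                 Pipe.SinkChannel sinkB = b.sink();
                 Pipe.SourceChannel srcA = a.source();
                 Pipe.SourceChannel srcB = b.source()) {
                srcA.configureBlocking(false);
                srcB.configureBlocking(false);
                SelectionKey keyA = srcA.register(selector, SelectionKey.OP_READ);
                srcB.register(selector, SelectionKey.OP_READ);
                sinkA.write(ByteBuffer.wrap(new byte[] {1}));
                sinkB.write(ByteBuffer.wrap(new byte[] {2}));
                // Wait (bounded) until both keys are in the selected-key set.
                for (int i = 0; i < 50 && selector.selectedKeys().size() < 2; i++) {
                    selector.select(100);
                }
                keyA.cancel(); // simulate the unexplained cancellation
                return processSelected(selector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("served=" + r[0] + " dropped=" + r[1]);
    }
}
```

The point of placing the try/catch inside the loop body is that one bad key costs one connection rather than ending the thread that serves all of them.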

          Patricia Wright added a comment -

          After this change, the slaves no longer disconnect. Instead, the underlying issue causes the slaves to just stop doing whatever they were doing. Running jobs on those slaves hang forever and cannot be cancelled. The Jenkins server starts to spam this in the logs until the filesystem fills up:

          Sep 11, 2014 10:38:13 AM org.kohsuke.stapler.export.Property writeValue
          WARNING: null
          org.kohsuke.stapler.export.NotExportableException: class hudson.plugins.parameterizedtrigger.CapturedEnvironmentAction doesn't have @ExportedBean so cannot write hudson.model.Actionable.actions
          at org.kohsuke.stapler.export.Model.<init>(Model.java:73)
          at org.kohsuke.stapler.export.ModelBuilder.get(ModelBuilder.java:51)
          at org.kohsuke.stapler.export.Property.writeValue(Property.java:231)
          at org.kohsuke.stapler.export.Property.writeValue(Property.java:187)
          at org.kohsuke.stapler.export.Property.writeValue(Property.java:139)
          at org.kohsuke.stapler.export.Property.writeTo(Property.java:116)
          at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:190)
          at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
          at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
          at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
          at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
          at org.kohsuke.stapler.export.Model.writeTo(Model.java:157)
          at org.kohsuke.stapler.ResponseImpl.serveExposedBean(ResponseImpl.java:267)
          at hudson.model.Api.doPython(Api.java:216)
          at sun.reflect.GeneratedMethodAccessor387.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:622)
          at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
          at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
          at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
          at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:120)
          at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
          at org.kohsuke.stapler.MetaClass$4.doDispatch(MetaClass.java:210)
          at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
          at org.kohsuke.stapler.MetaClass$12.dispatch(MetaClass.java:390)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
          at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:248)
          at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:631)
          at org.kohsuke.stapler.Stapler.service(Stapler.java:225)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
          at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:96)
          at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:58)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:99)
          at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:88)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:48)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
          at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:135)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:86)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
          at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
          at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
          at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
          at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
          at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
          at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
          at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
          at org.eclipse.jetty.server.Server.handle(Server.java:370)
          at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
          at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
          at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
          at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
          at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
          at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
          at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:701)

          Daniel Beck added a comment -

          growflet: The stack traces and log size are a completely unrelated issue (note that this is about HTTP and the Python API), see JENKINS-24458 and issues linked from there.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          pom.xml
          http://jenkins-ci.org/commit/jenkins/8fc609fe0952b285d5b26a59fd5ff4c29704d33d
          Log:
          [JENKINS-23471 JENKINS-24050]

          Integrated the fix in remoting to Jenkins 1.580.

          dogfood added a comment -

          Integrated in jenkins_main_trunk #3685
          [JENKINS-23471 JENKINS-24050] (Revision 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)

          Result = SUCCESS
          kohsuke : 8fc609fe0952b285d5b26a59fd5ff4c29704d33d
          Files :

          • changelog.html
          • pom.xml

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          pom.xml
          http://jenkins-ci.org/commit/jenkins/9c82fc42eb08b89047c544aaa586291ad1485472
          Log:
          [JENKINS-23471 JENKINS-24050]

          Integrated the fix in remoting to Jenkins 1.580.

          (cherry picked from commit 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)

          dogfood added a comment -

          Integrated in jenkins_main_trunk #4292
          [JENKINS-23471 JENKINS-24050] (Revision 9c82fc42eb08b89047c544aaa586291ad1485472)

          Result = UNSTABLE
          ogondza : 9c82fc42eb08b89047c544aaa586291ad1485472
          Files :

          • pom.xml
          • changelog.html

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          pom.xml
          http://jenkins-ci.org/commit/jenkins/91c5551d4c7682d4adba28fe591fa7772eee62e0
          Log:
          [JENKINS-23471 JENKINS-24050]

          Integrated the fix in remoting to Jenkins 1.580.

          (cherry picked from commit 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)

            kohsuke Kohsuke Kawaguchi
            kbrowder Kevin Browder
            Votes: 5
            Watchers: 13