Jenkins / JENKINS-9882

Jenkins runs out of file descriptors (winstone problem)

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: Debian 5, sun-java-jdk (1.6.0_22)
      Jenkins version: 1.414-SNAPSHOT

      Running Jenkins with the embedded Winstone server for a long time
      under constant load conditions causes file descriptor and thread
      leakage.

      What happens:

      After running for about a day, the following appears in the Jenkins
      log file:

      [Winstone 2011/05/27 07:35:03] - WARNING: Request handler pool limit exceeded - waiting for retry

      and a bit later (this starts repeating):

      [Winstone 2011/05/27 07:43:25] - WARNING: Request handler pool limit exceeded - waiting for retry
      [Winstone 2011/05/27 07:43:26] - ERROR: Request ignored because there were no more request handlers available in the pool
      [Winstone 2011/05/27 07:43:36] - WARNING: Request handler pool limit exceeded - waiting for retry
      [Winstone 2011/05/27 07:43:37] - ERROR: Request ignored because there were no more request handlers available in the pool

      Jenkins then stops handling requests successfully - intermittently
      at first, but eventually failing almost all requests.

      Using VisualVM I can see that there are a thousand
      RequestHandlerThread threads in the waiting state, and that over
      1200 file descriptors are currently in use.

      I think the requests start failing because Winstone has this limit:

      private int MAX_REQUEST_HANDLERS_IN_POOL = 1000;

      as it doesn't seem to be running out of available fds (apparently 8192
      is the maximum in this setup).

      When I restart Jenkins, I can watch a slow buildup of threads and
      open file descriptors:

      • 10 minutes after restart: 136 live threads, 256 fds used
      • 20 minutes: 150 threads, 271 fds
      • 30 minutes: 161 threads, 280 fds
      • 110 minutes: 255 threads, 376 fds
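      A buildup like this can also be probed from inside the JVM itself. A
      minimal sketch (my own, not part of Jenkins or Winstone; Linux-only,
      since it reads /proc/self/fd):

```java
import java.io.File;

public class LeakProbe {
    public static void main(String[] args) {
        // On Linux, /proc/self/fd contains one entry per file descriptor
        // currently open in this process.
        File[] fds = new File("/proc/self/fd").listFiles();
        int fdCount = (fds == null) ? -1 : fds.length;

        // Rough count of live threads in the current thread group.
        int threads = Thread.activeCount();

        System.out.println("threads=" + threads + " fds=" + fdCount);
    }
}
```

      Logging this periodically shows the same trend as the VisualVM
      numbers above without attaching a profiler.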

      I've looked at the repository version of Winstone, and there seems
      to be a race condition in the handling of the request handler pool.

      When a request is received by ObjectPool.handleRequest, it looks for
      an available request handler from unusedRequestHandlerThreads and
      calls commenceRequestHandling on the available thread.

      commenceRequestHandling in turn does this.notifyAll() to wake up the
      thread. So far so good. However, when the thread has finished
      processing the request, it calls
      this.objectPool.releaseRequestHandler(this) and then waits. I think
      there's a race condition here, since the object-pool caller (CALL)
      and the request handler thread (RH) can interleave like this:

      1. RH (in RequestHandler.run): this.objectPool.releaseRequestHandler(this)
      2. RH (in ObjectPool.releaseRequestHandler): this.unusedRequestHandlerThreads.add(rh)
      3. CALL (in ObjectPool.handleRequest): take RH from unusedRequestHandlerThreads
      4. CALL (in ObjectPool.handleRequest): rh.commenceRequestHandling(socket, listener);
      5. CALL (in RequestHandler.commenceRequestHandling): this.notifyAll()
      6. RH (in RequestHandler.run): this.wait()

      Since the notify is lost (there are no waiters yet), this.wait() in
      the last step hangs forever. This leaks a file descriptor, since the
      socket handed over for processing is never reclaimed, and threads
      are effectively lost as Winstone then creates more RequestHandlers.
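      The classic fix for this lost-notification pattern is to guard the
      wait with a state variable re-checked in a loop, so a notify that
      arrives before the wait is not lost. A hypothetical simplification
      of the handoff (class and method names are mine, not Winstone's):

```java
class Handler {
    private final Object lock = new Object();
    private Runnable pending; // request handed to this thread, if any

    // CALL side: hand over a request and wake the handler thread.
    void commence(Runnable request) {
        synchronized (lock) {
            pending = request;   // record the handoff before notifying
            lock.notifyAll();
        }
    }

    // RH side: wait until a request has been handed over.
    Runnable awaitRequest() throws InterruptedException {
        synchronized (lock) {
            while (pending == null) { // if commence() already ran,
                lock.wait();          // we never block: no notify is lost
            }
            Runnable r = pending;
            pending = null;
            return r;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Handler h = new Handler();
        // Simulate the racy ordering: the handoff happens *before* the
        // handler starts waiting. With the guard flag this is safe.
        h.commence(() -> System.out.println("request processed"));
        h.awaitRequest().run(); // prints "request processed"
    }
}
```

      Because the handoff is recorded in a field under the same lock, the
      order of commence() and awaitRequest() no longer matters.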

      Now, this is of course a Winstone problem, but its development seems
      to be d-e-a-d, at least judging by its bug tracker. As long as this
      problem affects Jenkins, I'd still classify it as a Jenkins problem
      too.

      I've put this into the winstone tracker too: https://sourceforge.net/tracker/?func=detail&aid=3308285&group_id=98922&atid=622497

      Workaround: use Tomcat instead of the embedded Winstone (that's what
      I'm doing now).

          [JENKINS-9882] Jenkins runs out of file descriptors (winstone problem)

          Anoop Karollil added a comment -

          We are using Jetty, not Tomcat, but we would definitely like to
          have Winstone, as it integrates better - Winstone makes
          upgrading much easier than Jetty.

          James Nord added a comment -

          I haven't seen those log messages in jenkins.log but I do see
          [Winstone 2012/02/22 14:32:12] - Error during HTTP listener init or shutdown
          java.net.SocketException: Too many open files
          at java.net.PlainSocketImpl.socketAccept(Native Method)
          at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:396)
          at java.net.ServerSocket.implAccept(ServerSocket.java:522)
          at java.net.ServerSocket.accept(ServerSocket.java:490)
          at winstone.HttpListener.run(HttpListener.java:134)
          at java.lang.Thread.run(Thread.java:722)

          After which the listener is no longer available: netstat -anpt |
          grep java | grep LISTEN shows that the listening socket has been
          closed. (The HTTPS socket, which gets very few hits, is still
          alive.)

          Possibly related - but possibly not either...

          Happens with 1.424.2.

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/java/winstone/BoundedExecutorService.java
          src/java/winstone/ObjectPool.java
          src/java/winstone/RequestHandlerThread.java
          src/test/winstone/BoundedExecutorServiceTest.java
          http://jenkins-ci.org/commit/winstone/ae511339839f2de25d9d9500496e6a8f5771742f
          Log:
          [FIXED JENKINS-9882] Use Java5 ExecutorService for thread management.

          It is a better implementation of the thread pool, except one catch
          about how it prefers queueing over adding more threads.
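          The "one catch" mentioned here is standard java.util.concurrent
          behavior: ThreadPoolExecutor only creates threads beyond
          corePoolSize when the work queue rejects the offered task, so
          with an unbounded queue the pool never grows past its core
          size. A small demonstration (mine, not from the Winstone patch):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueVsThreads {
    public static void main(String[] args) throws InterruptedException {
        // core=1, max=4, unbounded queue: tasks beyond the first are
        // queued instead of spawning threads up to the maximum.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch block = new CountDownLatch(1);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                try { block.await(); } catch (InterruptedException e) {}
            });
        }
        Thread.sleep(100); // let the pool settle
        System.out.println("pool size: " + pool.getPoolSize()); // prints "pool size: 1"
        block.countDown();
        pool.shutdown();
    }
}
```

          An executor that wants "grow first, queue later" behavior has to
          work around this, which is presumably what the
          BoundedExecutorService added in this commit does.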

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          war/pom.xml
          http://jenkins-ci.org/commit/jenkins/8eb264ceb1bfdea332009499a06b9faa89dc03e9
          Log:
          [FIXED JENKINS-9882] integrated a newer version of Winstone with the fix.

          dogfood added a comment -

          Integrated in jenkins_main_trunk #1590
          [FIXED JENKINS-9882] integrated a newer version of Winstone with the fix. (Revision 8eb264ceb1bfdea332009499a06b9faa89dc03e9)

          Result = UNSTABLE
          Kohsuke Kawaguchi : 8eb264ceb1bfdea332009499a06b9faa89dc03e9
          Files :

          • war/pom.xml
          • changelog.html

          Uwe Stuehler added a comment - edited

          It seems to me that this bug might not have been completely
          fixed, or that we are seeing a different but similar issue.

          After about a day of normal operation, we see HTTP and AJP
          connections being accept()ed, after which the accepting thread
          goes back to poll()...

          Then nothing happens with the new connection - not a single
          read() or anything. New threads aren't created either. We're at
          exactly 200 RequestHandlerThread threads in state RUNNABLE.

          Is 200 a fixed limit on the number of request handler threads?

          Edit: added stack trace

          All RequestHandlerThread threads have this exact same backtrace:

          "RequestHandlerThread[#871]" daemon prio=10 tid=0x00007fee8527a800 nid=0xf2c runnable [0x00007fee7d493000]
             java.lang.Thread.State: RUNNABLE
                  at java.net.SocketInputStream.socketRead0(Native Method)
                  at java.net.SocketInputStream.read(SocketInputStream.java:129)
                  at java.io.DataInputStream.readFully(DataInputStream.java:178)
                  at java.io.DataInputStream.readFully(DataInputStream.java:152)
                  at winstone.ajp13.Ajp13IncomingPacket.<init>(Ajp13IncomingPacket.java:60)
                  at winstone.ajp13.Ajp13Listener.allocateRequestResponse(Ajp13Listener.java:170)
                  at winstone.RequestHandlerThread.run(RequestHandlerThread.java:67)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
                  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                  at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                  at java.lang.Thread.run(Thread.java:662)
          

          Uwe Stuehler added a comment -

          FWIW: The number 200 is indeed the thread pool size limit.

          Marc Günther added a comment -

          Uwe Stuehler wrote:
          > Is 200 a fixed limit on the number of request handler threads?

          It is. From winstone/ObjectPool.java:35:

              private int MAX_REQUEST_HANDLERS_IN_POOL = 200;

          Uwe Stuehler added a comment -

          New issue opened: JENKINS-13275

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          war/pom.xml
          http://jenkins-ci.org/commit/jenkins/5a28f6a7f7b3debf3ea2ac459603af0d8086395e
          Log:
          [FIXED JENKINS-9882] integrated a newer version of Winstone with the fix.
          (cherry picked from commit 8eb264ceb1bfdea332009499a06b9faa89dc03e9)

          Conflicts:

          changelog.html

            Assignee: Unassigned
            Reporter: Santeri Paavolainen (santtu)
            Votes: 2
            Watchers: 5
