Respond to gocept's threadpool suggestion

Bug #1272423 reported by Paul Everitt
Affects: KARL3
Status: Won't Fix
Importance: Low
Assigned to: Chris McDonough

Bug Description

Stefan made some recommendations that need analysis on the tradeoffs. Chris will own that thinking and will make production changes as he sees fit.

"""
Yes, and I'd like to let you know about the experience we have had with the multikarl installation. Setting haproxy to one request per backend instance works best most of the time, but there is a nasty side effect:

haproxy sends health-check requests to verify whether the backends are still running. If a situation arises where all backends are tied up with long-running requests, haproxy can no longer check them, concludes they are broken, and kicks them out of the pool. At that point there are no usable backends left and haproxy serves 503 error messages.

To resolve this for multikarl we increased the number of parallel requests again. The idea is that, even with a few long-running requests in flight, Karl should still be able to answer the haproxy health checks as well as some of the smaller requests coming from outside. For the moment we are getting good results with the following settings:
haproxy:
 maxconn 5
Karl:
 threadpool_workers = 10
 threadpool_spawn_if_under = 5

I hope our experience also helps with the OSF Karl setup.
"""

Revision history for this message
Chris McDonough (chrism-plope) wrote:

The reason we have threadpool_workers set to 1 is to reduce the amount of memory consumed by the appserver. Each thread we add keeps its own in-memory ZODB cache, and we only want one cache; we definitely don't want ten. We might be able to bump threadpool_workers to 2 or so, although that would not be ideal. The OSF Karl instance has a lot more content than the multikarl setup, and therefore a lot more that it needs to keep in cache, so it's important that we try to reduce memory usage. We could also maybe instruct HAProxy to consider an appserver "still up" even if the ping request takes a long time. That would probably be the sanest thing here, although I don't know how to do that.
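
One way that might be done (an assumption, not something confirmed in this thread) is to give the health check its own generous timeout and require several consecutive failures before HAProxy marks a backend down; the check URL and server name below are illustrative:

 # haproxy.cfg (sketch): tolerate slow health-check responses from a busy appserver
 backend karl_backends
     option httpchk GET /ping         # health-check URL is illustrative
     timeout check 30s                # allow the check response itself to take a while
     server karl1 127.0.0.1:6543 check inter 10s fall 5 rise 2 maxconn 1

With fall 5 and a 10-second check interval, a busy backend would have to miss checks for roughly 50 seconds before being removed from the pool, rather than being kicked out after a single slow ping.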

Revision history for this message
Paul Everitt (paul-agendaless) wrote: Re: [Bug 1272423] Re: Respond to gocept's threadpool suggestion

I'd say that's a reasonable verdict. Chris, feel free to mark this as Won't Fix.

--Paul

Changed in karl3:
status: Confirmed → Won't Fix