Respond to gocepts threadpool suggestion
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
KARL3 | Won't Fix | Low | Chris McDonough |
Bug Description
Stefan made some recommendations that need analysis of the tradeoffs. Chris will own that thinking and will make production changes as he sees fit.
"""
Yes, and I'd like to let you know about the experiences we had with the multikarl installation. Setting haproxy to one request per backend instance works best most of the time, but there is a nasty side effect:
haproxy issues health-check requests to verify that the backends are still running. If a situation arises where all backends are occupied by long-running requests, haproxy can no longer probe the backends, concludes they are broken, and kicks them out of the pool. At that point no usable backends remain and haproxy serves 503 error messages.
To work around this for multikarl we increased the number of parallel requests again. The idea is that even if there are a few long-running requests, Karl should still be able to answer the haproxy health checks as well as some smaller requests arriving from outside. For the moment we are getting good results with the following settings:
haproxy:
maxconn 5
Karl:
threadpool_workers = 10
threadpool_
I hope our experiences also help with the OSF Karl setup.
""""
Changed in karl3:
status: Confirmed → Won't Fix
The reason we have threadpool_workers set to 1 is to reduce the amount of memory consumed by the appserver. Each thread we add keeps its own in-memory ZODB cache. We only want one cache to be kept; we definitely don't want 10. We might be able to bump threadpool_workers to 2 or so, although that would not be ideal. The OSF Karl instance has a lot more content than the multikarl setup, and therefore a lot more that it needs to keep in cache, so it's important that we try to reduce memory usage. We could also maybe instruct HAProxy to consider an appserver "still up" even if the ping request takes a long time. That'd probably be the sanest thing here, although I don't know how to do that.
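For what it's worth, haproxy does expose knobs for tolerating slow or missed health checks. A minimal sketch, assuming an HTTP check on a hypothetical backend (all names and timing values below are illustrative, not from this bug):

```
# haproxy.cfg sketch: tolerate slow health checks while keeping maxconn 1
backend karl_backends
    option httpchk GET /
    # allow each health check up to 30s before it counts as failed
    timeout check 30s
    # probe every 5s; require 10 consecutive failures before marking a
    # server down, and 2 successes before bringing it back
    server karl1 127.0.0.1:6543 check inter 5s fall 10 rise 2 maxconn 1
```

Raising `timeout check` and `fall` makes haproxy much slower to declare a single-threaded backend dead while a long-running request is monopolizing it, without adding worker threads (and thus ZODB caches) on the app side.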