Respond to gocepts threadpool suggestion
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
KARL3 | Won't Fix | Low | Chris McDonough |
Bug Description
Stefan made some recommendations that need analysis of the tradeoffs. Chris will own that thinking and will make production changes as he sees fit.
"""
Yes, and I'd like to let you know about the experiences we had with the multikarl installation. Setting haproxy to one request per backend instance works best most of the time, but there is a nasty side effect:
haproxy issues health-check requests to verify that the backends are still running. If a situation arises where all backends are occupied by long-running requests, haproxy can no longer probe the backends, concludes they are broken, and kicks them out of the pool. At that point no usable backends remain and haproxy serves 503 error messages.
To work around this for multikarl we increased the number of parallel requests again. The idea is that even if there are a few long-running requests, Karl should still be able to answer the haproxy health checks as well as some smaller requests arriving from outside. For the moment we are getting good results with the following settings:
haproxy:
maxconn 5
Karl:
threadpool_workers = 10
threadpool_
I hope our experiences also help with the OSF Karl setup.
""""
Changed in karl3:
status: Confirmed → Won't Fix
The reason we have threadpool_workers set to 1 is to reduce the amount of memory consumed by the appserver. Each thread we add keeps its own in-memory ZODB cache. We only want one cache to be kept; we definitely don't want 10. We might be able to bump threadpool_workers to 2 or so, although that would not be ideal. The OSF Karl instance has a lot more content than the multikarl setup, and therefore a lot more that it needs to keep in cache, so it's important that we try to reduce memory usage. We could also maybe instruct HAProxy to consider an appserver "still up" even if the ping request takes a long time. That'd probably be the sanest thing here, although I don't know how to do that.
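For what it's worth, haproxy does expose knobs for tolerating slow or missed health checks. A minimal sketch, assuming an HTTP check on a hypothetical backend (all names and timing values below are illustrative, not from this bug):

```
# haproxy.cfg sketch: tolerate slow health checks while keeping maxconn 1
backend karl_backends
    option httpchk GET /
    # allow each health check up to 30s before it counts as failed
    timeout check 30s
    # probe every 5s; require 10 consecutive failures before marking a
    # server down, and 2 successes before bringing it back
    server karl1 127.0.0.1:6543 check inter 5s fall 10 rise 2 maxconn 1
```

Raising `timeout check` and `fall` makes haproxy much slower to declare a single-threaded backend dead while a long-running request is monopolizing it, without adding worker threads (and thus ZODB caches) on the app side.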