The config documentation says that the memcache_dead_retry option is valid for both the dogpile.cache.memcached and oslo_cache.memcache_pool backends, and that it defaults to 300 (whereas the default in python-memcached is actually 30). I observed that when using dogpile.cache.memcached as the [cache] backend in keystone and then taking down one of the memcached instances, the memcached server objects set their deaduntil value to only 30 seconds in the future.

This is a problem because when a request comes in to an API server with two memcached servers configured, one of which is unroutable, it takes around 15 seconds for each thread the server has created to try both servers and hit the three-second socket timeout every time it encounters the one that is down. By the time the user issues another request (e.g. when openstackclient is ready to start making API requests), the deaduntil value has already expired and we go through the whole cycle again.

The memcache_socket_timeout option also seems not to be respected.
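For reference, a minimal [cache] section exercising the options in question might look like the following. The server addresses are placeholders; the option names are the documented oslo.cache ones:

```ini
[cache]
enabled = true
backend = dogpile.cache.memcached
memcache_servers = 192.0.2.10:11211,192.0.2.11:11211

# Documented default is 300, but with this backend the effective
# value observed was python-memcached's own default of 30.
memcache_dead_retry = 300

# Also appeared not to be honored with this backend.
memcache_socket_timeout = 3.0
```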
I'm not sure if this is a documentation bug or a code bug.
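To illustrate the retry cycle described above, here is a minimal, self-contained sketch of the dead-server bookkeeping. The class name and method names are hypothetical; this only models the general pattern (a failed server is stamped with deaduntil = now + dead_retry and skipped until that moment passes), not python-memcached's actual internals:

```python
import time


class FakeServer:
    """Simplified model of dead-server tracking with a dead_retry window.

    Hypothetical sketch: a connection failure marks the server dead for
    dead_retry seconds; once deaduntil passes, the server is retried.
    """

    def __init__(self, dead_retry=30):
        self.dead_retry = dead_retry  # python-memcached's default is 30s
        self.deaduntil = 0.0

    def mark_dead(self, now=None):
        """Record a failure: skip this server for dead_retry seconds."""
        now = time.time() if now is None else now
        self.deaduntil = now + self.dead_retry

    def is_usable(self, now=None):
        """Return False while inside the dead window, True (and reset) after."""
        now = time.time() if now is None else now
        if self.deaduntil and now < self.deaduntil:
            return False
        self.deaduntil = 0.0
        return True


srv = FakeServer(dead_retry=30)
srv.mark_dead(now=1000.0)
print(srv.is_usable(now=1010.0))  # still within the 30s window
print(srv.is_usable(now=1031.0))  # window elapsed; server is retried
```

With a short window like 30 seconds, each client thread re-discovers the dead server (and pays the socket timeout) soon after the previous attempt, which matches the repeated ~15-second stalls described above.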
Hello,
I implemented changes in dogpile.cache to solve this issue:
https://github.com/sqlalchemy/dogpile.cache/commit/1de93aab14c1274f20c1f44f8adff3b143c864f6
My local tests on oslo.cache, keystone and keystonemiddleware didn't show side effects or issues with those changes; however, downstream tests showed us that some problems exist between keystone and dogpile.cache with this patch:
https://bugzilla.redhat.com/show_bug.cgi?id=2103117