The config documentation says that the memcache_dead_retry option is valid for both the dogpile.cache.memcached and oslo_cache.memcache_pool backends, and that it defaults to 300 (whereas the default in python-memcached is actually 30). I observed that when using dogpile.cache.memcached as the [cache] backend in keystone and then taking down one of the memcached instances, the memcached server objects set their deaduntil value to only 30 seconds in the future.

This is a problem because when a request comes in to an API server with two memcached servers configured, one of which is unroutable, it takes around 15 seconds for each thread the server has created to try both servers and hit the three-second socket timeout every time it encounters the one that is down. By the time the user issues another request (e.g. when openstackclient is ready to start making API requests), the deaduntil value has already expired and we go through the whole cycle again.

The memcache_socket_timeout option also seems not to be respected.
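For reference, a minimal [cache] section exercising the options in question might look like the following. The server addresses are placeholders; the option names are the documented oslo.cache ones:

```ini
[cache]
enabled = true
backend = dogpile.cache.memcached
memcache_servers = 192.0.2.10:11211,192.0.2.11:11211

# Documented default is 300, but with this backend the effective
# value observed was python-memcached's own default of 30.
memcache_dead_retry = 300

# Also appeared not to be honored with this backend.
memcache_socket_timeout = 3.0
```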
I'm not sure if this is a documentation bug or a code bug.
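To illustrate the retry cycle described above, here is a minimal, self-contained sketch of the dead-server bookkeeping. The class name and method names are hypothetical; this only models the general pattern (a failed server is stamped with deaduntil = now + dead_retry and skipped until that moment passes), not python-memcached's actual internals:

```python
import time


class FakeServer:
    """Simplified model of dead-server tracking with a dead_retry window.

    Hypothetical sketch: a connection failure marks the server dead for
    dead_retry seconds; once deaduntil passes, the server is retried.
    """

    def __init__(self, dead_retry=30):
        self.dead_retry = dead_retry  # python-memcached's default is 30s
        self.deaduntil = 0.0

    def mark_dead(self, now=None):
        """Record a failure: skip this server for dead_retry seconds."""
        now = time.time() if now is None else now
        self.deaduntil = now + self.dead_retry

    def is_usable(self, now=None):
        """Return False while inside the dead window, True (and reset) after."""
        now = time.time() if now is None else now
        if self.deaduntil and now < self.deaduntil:
            return False
        self.deaduntil = 0.0
        return True


srv = FakeServer(dead_retry=30)
srv.mark_dead(now=1000.0)
print(srv.is_usable(now=1010.0))  # still within the 30s window
print(srv.is_usable(now=1031.0))  # window elapsed; server is retried
```

With a short window like 30 seconds, each client thread re-discovers the dead server (and pays the socket timeout) soon after the previous attempt, which matches the repeated ~15-second stalls described above.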
Hello,
I implemented changes in dogpile.cache to solve this issue:
https://github.com/sqlalchemy/dogpile.cache/commit/1de93aab14c1274f20c1f44f8adff3b143c864f6
My local tests on oslo.cache, keystone and keystonemiddleware didn't show side effects or issues with those changes; however, downstream tests showed us that some problems exist between keystone and dogpile.cache with this patch:
https://bugzilla.redhat.com/show_bug.cgi?id=2103117