Multiple memcached back-end instances breaks caching

Bug #1743036 reported by Christian Sarrasin
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Identity (keystone)
Invalid
Undecided
Unassigned
oslo.cache
Fix Released
Undecided
Morgan Fainberg

Bug Description

Environment
~~~~~~~~~~~
* Keystone deployed in containers, running on uWSGI (per openstack-ansible defaults)
* Keystone baseline (as provided by OSA 16.0.2): 6a67918f9d5f39564af8eacc57b80cba98242683 # HEAD of "stable/pike" as of 28.09.2017
* openstack-ansible version: 16.0.2
* Target OS: Ubuntu 16.04 Xenial
* keystone.conf: http://paste.openstack.org/show/643865/ ([cache] backend = dogpile.cache.memcached)

Symptom
~~~~~~~
Running keystone against multiple memcached backends (as per OSA standard deployment pattern) results in caching being completely defeated - meaning Keystone's performance is as if caching was disabled.

Switching `backend_argument = url:<cache1ip>,<cache2ip>,<cache3ip>:11211` to `url:<cache1ip>:11211` restores caching (and performance).

Analysis
~~~~~~~~
Having turned on oslo.cache debugging with the following settings:

`
[DEFAULT]
debug = True
default_log_levels = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=INFO,oslo.messaging=INFO,oslo_messaging=INFO,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=DEBUG,dogpile.core.dogpile=INFO

[cache]
debug_cache_backend = True
`
It becomes obvious that every attempt to retrieve a cached value fails. This appears to be because a different key is generated (through hashing?) despite the paylod (eg: token) being identical.

Evidence
~~~~~~~~
The following log excerpts demonstrate the issue:
* http://paste.openstack.org/show/643886/
* http://paste.openstack.org/show/643887/

This is generated through two subsequent attempts to validate the same token (payload shown on the last line of both logs is the same), 3 seconds apart, within the same keystone uWSGI worker process, through the same client invokation:

curl -X GET -H "X-Auth-Token: $ADMIN_TOKEN" -H "X-Subject-Token: $SUBJECT_TOKEN" http://${HOST}:35357/v3/auth/tokens -w "\n\n%{time_total}\n\n"

Yet the cache keys for both requests are different ('5bd08aa07bf8b4bcafeac88469769a4554297df6' vs '27eb98ba5a77f3701547be6ed06f3301e124e853'). Because the keys are different, the 2nd request (for the same token) doesn't hit the cache and the (same) token is stored a 2nd time in the cache with the 2nd key.

Revision history for this message
Lance Bragstad (lbragstad) wrote :

I'm able to recreate this with 3 memcached instances, but only with the following configuration:

backend_argument = url:192.168.122.243,192.168.122.241,192.168.122.144:11211

If I rework that configuration option into using memcache_servers I don't see the issue:

memcache_servers = 192.168.122.243:11211,192.168.122.241:11211,192.168.122.144:11211

I can confirm memcache_servers works as expected by looping a token validation call and seeing traffic get routed to each instance in the ring of memcached servers.

The backend_argument doesn't seem to be working as expected. Looks like you can work around it with memcache_servers for the time being.

Changed in oslo.cache:
status: New → Confirmed
Changed in keystone:
status: New → Invalid
Changed in oslo.cache:
assignee: nobody → Morgan Fainberg (mdrnstm)
Revision history for this message
Christian Sarrasin (sxc731) wrote :

Confirm that with the workaround, the caches are properly hit, resulting in ~5x performance boost on KS token validation benchmarks (~150ms down to ~30ms).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (master)

Fix proposed to branch: master
Review: https://review.openstack.org/611117

Changed in oslo.cache:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (master)

Reviewed: https://review.openstack.org/611117
Committed: https://git.openstack.org/cgit/openstack/oslo.cache/commit/?id=685a05edfef24cb2300df24aa3417ad0ecd2c2c2
Submitter: Zuul
Branch: master

commit 685a05edfef24cb2300df24aa3417ad0ecd2c2c2
Author: Morgan Fainberg <email address hidden>
Date: Tue Oct 16 10:50:33 2018 -0700

    Config dict arguments.url must be a list

    For memcache, arguments.url is assumed to be a list, this means
    that the underlying memcache library could not handle a case where
    a string (comma separated) was passed in. The idea is to force a
    split on ',' in the special case of backend_argument=url:...
    so that a list is passed to the backend.

    Change-Id: I65afb46455423d7f96e17c471330c8d74dc17cac
    Closes-Bug: #1743036

Changed in oslo.cache:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/611369

Revision history for this message
Ben Nemec (bnemec) wrote :

The fix has merged to master, but it looks like this was reported against Pike so we'll want to backport it there.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/rocky)

Reviewed: https://review.openstack.org/611369
Committed: https://git.openstack.org/cgit/openstack/oslo.cache/commit/?id=6e6d3d6b06b5588feeaaff7b23a4afbc0768dbf7
Submitter: Zuul
Branch: stable/rocky

commit 6e6d3d6b06b5588feeaaff7b23a4afbc0768dbf7
Author: Morgan Fainberg <email address hidden>
Date: Tue Oct 16 10:50:33 2018 -0700

    Config dict arguments.url must be a list

    For memcache, arguments.url is assumed to be a list, this means
    that the underlying memcache library could not handle a case where
    a string (comma separated) was passed in. The idea is to force a
    split on ',' in the special case of backend_argument=url:...
    so that a list is passed to the backend.

    Change-Id: I65afb46455423d7f96e17c471330c8d74dc17cac
    Closes-Bug: #1743036
    (cherry picked from commit 685a05edfef24cb2300df24aa3417ad0ecd2c2c2)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.cache 1.31.1

This issue was fixed in the openstack/oslo.cache 1.31.1 release.

Revision history for this message
Eyal B (eyalb1) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.cache 1.30.2

This issue was fixed in the openstack/oslo.cache 1.30.2 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.