keystonemiddleware connections to memcached from neutron-server grow beyond configured values

Bug #1883659 reported by Justinas Balciunas on 2020-06-16
This bug affects 6 people
Affects                       Importance   Assigned to
OpenStack Security Advisory   Undecided    Unassigned
keystonemiddleware            Undecided    Unassigned
oslo.cache                    Undecided    Unassigned

Bug Description

Using: keystone-17.0.0, Ussuri

I've noticed a very odd behaviour of keystone_authtoken with memcached and neutron-server. The connection count to memcached grows over time, ignoring the settings of memcache_pool_maxsize and memcache_pool_unused_timeout. The keystone_authtoken middleware configuration is as follows:

[keystone_authtoken]
www_authenticate_uri = http://keystone_vip:5000
auth_url = http://keystone_vip:35357
auth_type = password
project_domain_id = default
user_domain_id = default
project_name = service
username = neutron
password = neutron_password_here
cafile =
memcache_security_strategy = ENCRYPT
memcache_secret_key = secret_key_here
memcached_servers = memcached_server_1:11211,memcached_server_2:11211,memcached_server_3:11211
memcache_pool_maxsize = 100
memcache_pool_unused_timeout = 600
token_cache_time = 3600

Commenting out the memcached settings under [keystone_authtoken] and restarting neutron-server drops the connection count in memcached to normal levels, i.e. to hundreds rather than the thousands seen when neutron-server is using memcached. The Neutron team (slaweq) suggested this is a Keystone issue because, quote: "Neutron is just using keystonemiddleware as one of the middlewares in the pipeline".

Grafana memcached connection graphs: https://ibb.co/p3TCJqC and https://ibb.co/nmmvvH4

The drops in the graphs correspond to restarts of neutron-server. I am not sure whether this is expected behaviour, a configuration issue, or a bug.
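
For a rough sense of what "normal levels" should look like if the pool settings were honoured, here is a back-of-envelope ceiling. The model (one bounded client pool per worker process, one connection per memcached server per client) and the host/worker counts are assumptions for illustration only; three neutron-server nodes are mentioned later in the thread, and api_workers is a hypothetical value.

# Rough per-node ceiling if memcache_pool_maxsize were respected: each worker
# process keeps at most memcache_pool_maxsize pooled clients, and each client
# holds one connection to each memcached server.
neutron_server_hosts = 3      # assumption: three neutron-server nodes
api_workers_per_host = 8      # hypothetical api_workers value
memcache_pool_maxsize = 100   # from the [keystone_authtoken] config above

ceiling_per_memcached_node = (
    neutron_server_hosts * api_workers_per_host * memcache_pool_maxsize
)
print(ceiling_per_memcached_node)  # 2400; a count that keeps climbing without
                                   # ever plateauing suggests the limit is ignored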

summary:
- keystonemiddleware connections to memcached from neutron-server grows beyond configured values
+ keystonemiddleware connections to memcached from neutron-server grow beyond configured values
Gage Hugo (gagehugo) wrote :

Added keystonemiddleware

Gage Hugo (gagehugo) wrote :

Added oslo.cache, not 100% sure which is affected yet.

no longer affects: keystone

A few additions:
1) the situation is not noticeable immediately, so automated tests do not trigger it; the whole setup (three memcached nodes, three neutron-servers with keystone_authtoken configured to use memcached) needs to run for a while before the memcached connection count exceeds the defined limits;
2) it was also observed that only two of the three memcached nodes are hit by the uncontrolled growth in the number of connections, i.e. one memcached node takes most of the load, the second trails it by 30-40%, and the third serves the usual connection count;
3) the open connection count rises until the limits in the memcached configuration are reached (25k per memcached node in my case); a quick per-node check is sketched below.
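
As a quick way to check point 3 per node without Grafana, the sketch below asks each memcached server for its "stats" output over the plain text protocol and extracts curr_connections. It uses only the Python standard library; the host names are the placeholders from the configuration in the bug description.

import socket

def curr_connections(host, port=11211, timeout=5):
    """Return memcached's curr_connections stat for one node (text protocol)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        while b"END\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    for line in data.decode().splitlines():
        parts = line.split()  # e.g. "STAT curr_connections 24873"
        if len(parts) == 3 and parts[1] == "curr_connections":
            return int(parts[2])
    return None

# Placeholder host names from the bug description:
for host in ("memcached_server_1", "memcached_server_2", "memcached_server_3"):
    print(host, curr_connections(host))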

Pierre Riteau (priteau) wrote :

I can confirm that I am seeing this issue with neutron-server, using three memcached servers through keystonemiddleware. This is with the Train release deployed on CentOS 8 with Kolla, which uses the following RDO packages:

openstack-neutron-15.1.0-1.el8.noarch
python3-keystonemiddleware-7.0.1-2.el8.noarch
python3-oslo-cache-1.37.0-2.el8.noarch

Pierre Riteau (priteau) wrote :

I am able to make the problem go away with this extra setting in neutron.conf:

[keystone_authtoken]
memcache_use_advanced_pool = True

This is the documentation for this setting:

# (Optional) Use the advanced (eventlet safe) memcached client pool. The
# advanced pool will only work under python 2.x. (boolean value)

This description dates from 2016. So far I have not seen any issues enabling this setting under Python 3.
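
For context on why this option matters: the advanced pool keeps a small, bounded set of shared memcached clients, instead of letting each request path end up with its own client and therefore its own sockets. The snippet below only illustrates that bounding idea; it is not keystonemiddleware's or oslo.cache's actual implementation, and it assumes the python-memcached package (imported as memcache) is installed.

import queue
import threading

import memcache  # python-memcached, assumed installed


class BoundedMemcachePool:
    """Illustration only: at most `maxsize` clients (and so at most `maxsize`
    TCP connections per memcached server) ever exist; callers block and reuse
    clients instead of opening new connections per request."""

    def __init__(self, servers, maxsize=100):
        self._servers = servers
        self._maxsize = maxsize
        self._created = 0
        self._lock = threading.Lock()
        self._free = queue.Queue()

    def acquire(self):
        try:
            return self._free.get_nowait()
        except queue.Empty:
            with self._lock:
                if self._created < self._maxsize:
                    self._created += 1
                    return memcache.Client(self._servers)
        return self._free.get()  # pool exhausted: wait for a returned client

    def release(self, client):
        self._free.put(client)


# Usage sketch with a placeholder server name:
pool = BoundedMemcachePool(["memcached_server_1:11211"], maxsize=100)
client = pool.acquire()
try:
    client.get("some-token-cache-key")
finally:
    pool.release(client)

Without such bounding on the non-advanced path, the number of clients, and hence connections, can keep growing with concurrency, which would be consistent with the growth described in this bug.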

Radosław Piliszek (yoctozepto) wrote :

It affects keystonemiddleware, but I guess the fix is needed in oslo.cache.

Changed in keystonemiddleware:
status: New → Confirmed
Changed in oslo.cache:
status: New → Confirmed
Herve Beraud (herveberaud) wrote :

Hello,

If I understood this topic correctly, you are saying that the connections grow beyond what the given config allows, right?

A few weeks ago another bug was opened [1]; it was due to `flush_on_reconnect`, which can cause an exponential rise in the number of connections to memcached servers.

IIRC this option was mostly introduced for keystone's needs.

The submitted patch [1] moves flush_on_reconnect from the code into the oslo.cache config block so that it becomes configurable.

It could be worth following this track a bit: maybe try turning off flush_on_reconnect manually and then observe the behaviour in your context.

So you can either edit the code to remove this option, or try applying the patch [1] to disable it via config.

Please let me know if it helps you.

[1] https://review.opendev.org/#/c/742193/
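
To make this concrete: flush_on_reconnect is an option of the python-memcached Client itself, so disabling it (by patching the code that sets it, or via the config option added in [1]) ultimately comes down to how the underlying client is constructed. A minimal sketch, assuming python-memcached is installed; the server name is a placeholder.

import memcache  # python-memcached

servers = ["memcached_server_1:11211"]

# With flush_on_reconnect enabled, a client that marks a server dead and later
# reconnects issues flush_all against it; across many worker processes this can
# amplify load and connection churn.
noisy_client = memcache.Client(servers, flush_on_reconnect=1)

# With it disabled, a reconnect is just a reconnect and cached data is kept.
quiet_client = memcache.Client(servers, flush_on_reconnect=0)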


Radosław Piliszek (yoctozepto) wrote :

There is something linking this issue to https://bugs.launchpad.net/neutron/+bug/1864418 (neutron unable to run behind mod_wsgi). I sense a threading issue. Could be just me. :-)

Jeremy Stanley (fungi) wrote :

It looks like this may be the same as public security bug 1892852 and bug 1888394.

Changed in ossa:
status: New → Incomplete
information type: Public → Public Security