HA: control plane API performance can be degraded during a controller outage

Bug #1913733 reported by Damien Ciabrini
This bug affects 1 person
Affects: tripleo
Status: In Progress
Importance: High
Assigned to: Unassigned

Bug Description

In the HA control plane, many API services rely on memcached to cache frequently accessed ephemeral data. The API services access memcached directly, so each one is configured with the list of all memcached servers (how they access those servers is out of scope):

   # grep -m1 -B3 '^memcache' /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf
   # Memcache servers in the format of "host:port". (dogpile.cache.memcache and
   # oslo_cache.memcache_pool backends only) (list value)
   #memcache_servers=localhost:11211
   memcache_servers=172.17.1.115:11211,172.17.1.62:11211,172.17.1.60:11211

When a controller node is down, an API service may try to access a memcached server hosted on the unresponsive node. Recovering from the failed memcached access takes time, which degrades the API service's response time.
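
To illustrate why one unresponsive server can slow every request, here is a minimal Python sketch of a client-side server pool. It is not the fix proposed for this bug, and it is not how nova or oslo.cache access memcached. The names `parse_servers`, `tcp_probe`, and `ServerPool` and the `dead_retry` and `timeout` defaults are hypothetical, chosen only for illustration. The idea is that a client that remembers a recent failure pays the connection timeout once per `dead_retry` window instead of on every access. oslo.cache documents options such as `memcache_socket_timeout` and `memcache_dead_retry` for tuning comparable behaviour; the sketch does not read them.

```python
import socket
import time


def parse_servers(value):
    """Split a 'host:port,host:port' list into (host, port) tuples."""
    servers = []
    for item in value.split(","):
        host, _, port = item.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def tcp_probe(server, timeout):
    """Return True if a TCP connection to `server` opens within `timeout` seconds."""
    try:
        with socket.create_connection(server, timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


class ServerPool:
    """Hypothetical pool that skips servers that failed recently."""

    def __init__(self, servers, dead_retry=30.0, timeout=0.5,
                 probe=tcp_probe, clock=time.monotonic):
        self._servers = list(servers)
        self._dead_retry = dead_retry  # seconds to avoid a failed server
        self._timeout = timeout        # seconds to wait for a connection
        self._probe = probe
        self._clock = clock
        self._dead_until = {}          # server -> time it may be retried

    def pick(self):
        """Return the first reachable server, or None if none responds."""
        for server in self._servers:
            if self._dead_until.get(server, float("-inf")) > self._clock():
                continue  # failed recently: skip without paying the timeout again
            if self._probe(server, self._timeout):
                return server
            # Remember the failure so later calls do not wait on this server.
            self._dead_until[server] = self._clock() + self._dead_retry
        return None
```

With the server list from the `nova.conf` excerpt, `ServerPool(parse_servers("172.17.1.115:11211,172.17.1.62:11211,172.17.1.60:11211")).pick()` would wait up to `timeout` seconds on the first address if its node is down, then skip that address on later calls until `dead_retry` seconds have passed.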

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → wallaby-3
tags: added: train-backport-potential
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
milestone: xena-1 → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3
OpenStack Infra (hudson-openstack) wrote: Change abandoned on tripleo-heat-templates (master)

Change abandoned by "James Slagle <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/773094
Reason: Abandoning this patch per the TripleO Patch Abandonment guidelines
(https://specs.openstack.org/openstack/tripleo-specs/specs/policy/patch-abandonment.html).
If you wish to have this restored and cannot do so yourself, please reach out
via #tripleo on OFTC or the OpenStack Dev mailing list.
