[memcache]dead_retry and [cache]memcache_dead_retry should be set back to 300

Bug #1471318 reported by Boris Bobrov
This bug affects 1 person
Affects             Status        Importance  Assigned to        Milestone
Mirantis OpenStack  Fix Released  High        Oleksiy Molchanov
5.1.x               Invalid       High        Alex Ermolov
6.0.x               Invalid       High        Alex Ermolov
6.1.x               In Progress   High        Alex Ermolov

Bug Description

As part of fixing bug 1461036 we set [cache]memcache_dead_retry and [memcache]dead_retry to 30 (https://review.openstack.org/#/c/190248/). That was a big mistake. With this change keystone retries dead memcache servers every 30 seconds, so a request stalls for socket_timeout every 30 seconds. This badly hurts user experience during a node failure, as a recent escalation from one of our big customers showed.
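To make the cost concrete, here is a rough sketch of how often a single keystone worker would stall on a dead memcache server during an outage. The function name `count_stalls`, the single-worker model, and the 3-second `socket_timeout` are illustrative assumptions, not values taken from this bug.

```python
def count_stalls(outage_seconds, dead_retry, socket_timeout):
    """Count requests that block on a dead memcache server during an outage.

    Simplified model (an assumption): one worker with a steady stream of
    requests. A probe of the dead server blocks for socket_timeout, after
    which the server is skipped until dead_retry seconds have passed.
    """
    stalls = 0
    now = 0.0
    while now < outage_seconds:
        stalls += 1            # this request probes the dead server
        now += socket_timeout  # and blocks until the socket times out
        now += dead_retry      # the server is then ignored for dead_retry
    return stalls


# Hypothetical 10-minute outage with a 3-second socket timeout.
stalls_with_30 = count_stalls(600, dead_retry=30, socket_timeout=3)
stalls_with_300 = count_stalls(600, dead_retry=300, socket_timeout=3)
print(stalls_with_30, stalls_with_300)
```

Under these assumptions the worker stalls far more often with dead_retry set to 30 than with 300, which matches the user-visible delays described above.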

Setting dead_retry to 30 does not even solve the original problem, in which keystone could not find a token written by a keystone instance that had just come back up.

Consider the following situation:

1. Topology: keystone-1, keystone-2, keystone-3; memcache-1, memcache-2, memcache-3.
2. All nodes are up.
   keystone-1 marks alive memcache-1, memcache-2, memcache-3.
   keystone-2 marks alive memcache-1, memcache-2, memcache-3.
   keystone-3 marks alive memcache-1, memcache-2, memcache-3.
   haproxy knows about keystone-1, keystone-2, keystone-3.
3. keystone-3 and memcache-3 (being on the same controller) go down.
   keystone-1 marks alive memcache-1, memcache-2; marks memcache-3 dead for N seconds.
   keystone-2 marks alive memcache-1, memcache-2; marks memcache-3 dead for N seconds.
   haproxy knows about keystone-1, keystone-2.
4. keystone-3 and memcache-3 come back up immediately.
   keystone-1 marks alive memcache-1, memcache-2; memcache-3 is still marked dead and stays dead for N seconds.
   keystone-2 marks alive memcache-1, memcache-2; memcache-3 is still marked dead and stays dead for N seconds.
   keystone-3 (haproxy marks it alive 6-7 seconds after it comes up) marks alive memcache-1, memcache-2 AND memcache-3.
   haproxy knows about keystone-1, keystone-2, keystone-3.

Since keystone-3 knows that memcache-3 is alive, it writes its token there.
keystone-1 does not know that memcache-3 is alive. It looks for the requested token in memcache-1 and memcache-2, does not find it, and returns 401 Unauthorized.
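The scenario above can be sketched with a toy model of client-side key hashing. This is not the real memcache client code: `pick_server`, `set_token`, `get_token`, and the CRC-based rehash are hypothetical stand-ins that only illustrate why two keystone instances with different views of memcache-3 disagree about where a token lives.

```python
import zlib

SERVERS = ["memcache-1", "memcache-2", "memcache-3"]
stores = {name: {} for name in SERVERS}  # in-memory stand-ins for memcached


def pick_server(key, dead):
    """Hash key onto SERVERS, rehashing past servers this client thinks are dead."""
    for attempt in range(len(SERVERS) * 2):
        index = zlib.crc32(f"{attempt}{key}".encode()) % len(SERVERS)
        if SERVERS[index] not in dead:
            return SERVERS[index]
    return None  # gave up: every attempt landed on a dead server


def set_token(key, value, dead):
    server = pick_server(key, dead)
    if server is not None:
        stores[server][key] = value
    return server


def get_token(key, dead):
    server = pick_server(key, dead)
    return stores[server].get(key) if server is not None else None


# Pick a token id that hashes to memcache-3 when all servers are considered alive.
token = next(
    f"token-{i}" for i in range(1000)
    if pick_server(f"token-{i}", dead=set()) == "memcache-3"
)

# keystone-3 has just restarted and sees every memcache server as alive.
written_to = set_token(token, "valid", dead=set())
# keystone-1 still has memcache-3 marked dead, so it looks elsewhere.
seen_by_keystone_1 = get_token(token, dead={"memcache-3"})
seen_by_keystone_3 = get_token(token, dead=set())
print(written_to, seen_by_keystone_1, seen_by_keystone_3)
```

In this model keystone-1 never consults memcache-3, so the lookup misses no matter how small N is, as long as keystone-3 starts serving requests before N expires.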

We should set [cache]memcache_dead_retry and [memcache]dead_retry to 300. This increases the time keystone needs to notice that a memcache node has come back up. To prevent the situation described above, I suggest setting the haproxy health-check options for keystone like this:

check inter 10s fastinter 2s downinter 3s rise 150 fall 3

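Note the 150 in `rise`. haproxy requires 150 consecutive successful checks before it marks a server as up, and the shortest check interval in this line is 2 seconds (fastinter). This guarantees that keystone-3 is not marked alive for at least 300 seconds after coming up, which solves our issue.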
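As a quick check of the arithmetic, the sketch below computes how long haproxy would wait before marking keystone-3 up for each interval in the proposed line. Which interval haproxy actually applies while a server is recovering is treated here as an assumption, so all three are shown; `seconds_until_up` is a hypothetical helper.

```python
def seconds_until_up(rise, check_interval_seconds):
    """Time needed for `rise` consecutive successful health checks."""
    return rise * check_interval_seconds


RISE = 150
DEAD_RETRY = 300  # proposed [cache]memcache_dead_retry / [memcache]dead_retry

# Intervals taken from the proposed check line: inter 10s, fastinter 2s, downinter 3s.
delays = {
    name: seconds_until_up(RISE, interval)
    for name, interval in {"inter": 10, "fastinter": 2, "downinter": 3}.items()
}
print(delays)

# Whichever interval applies, keystone-3 stays out of rotation for at least dead_retry.
all_cover_dead_retry = all(delay >= DEAD_RETRY for delay in delays.values())
```

Even with the shortest interval, the delay is no less than dead_retry, so the other keystone instances have retried memcache-3 by the time keystone-3 receives traffic again.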

Tags: ha keystone
Changed in mos:
milestone: 6.1 → 6.1-updates
assignee: Fuel Library Team (fuel-library) → Oleksiy Molchanov (omolchanov)
status: New → Confirmed
Changed in mos:
milestone: 6.1-updates → 7.0
Bogdan Dobrelya (bogdando) wrote :

IIUC, with fernet tokens in Kilo, 7.0 will not require a memcached backend. If so, this issue is not applicable to 7.0.

Changed in mos:
status: Confirmed → Invalid
Boris Bobrov (bbobrov) wrote :

No, it is valid for 7.0. This is also about the cache, which will still be used with fernet tokens. The fix is also generally good for the memcache token backend, if someone still wants to use it.

Changed in mos:
status: Invalid → Incomplete
status: Incomplete → Confirmed
Oleksiy Molchanov (omolchanov) wrote :
Changed in mos:
status: Confirmed → In Progress
Oleksiy Molchanov (omolchanov) wrote :
Changed in mos:
status: In Progress → Fix Committed
OSCI Robot (oscirobot) wrote :

NOTE: Changeset is not merged, created temporary package repository.
RPM package fuel-library6.0 has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0-6.0.0-6206.2.gerrit207360.1.git0fcf43f.noarch.rpm
fuel-library6.0-6.0.0-6206.2.gerrit207360.1.git0fcf43f.noarch.rpm
Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable-LP1471318/centos .

OSCI Robot (oscirobot) wrote :

NOTE: Changeset is not merged, created temporary package repository.
DEB package fuel-library has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0_6.0.0-6206.2.gerrit207360.1.git0fcf43f_all.deb
fuel-library6.0_6.0.0-6206.2.gerrit207360.1.git0fcf43f_all.deb
Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable-LP1471318/ubuntu .

Alex Ermolov (aermolov) wrote :

Marked {5.1, 5.1.1, 6.0}-updates as Invalid because we have no way to deliver such updates to existing installations older than 6.1.

Timur Nurlygayanov (tnurlygayanov) wrote :

Verified on MOS 7.0 ISO #265.

Changed in mos:
status: Fix Committed → Fix Released