[memcache]dead_retry and [cache]memcache_dead_retry should be set back to 300

Bug #1471318 reported by Boris Bobrov on 2015-07-03
This bug affects 1 person
Affects              Importance   Assigned to
Mirantis OpenStack   High         Oleksiy Molchanov
5.1.x                High         Alex Ermolov
6.0.x                High         Alex Ermolov
6.1.x                High         Alex Ermolov

Bug Description

As part of fixing bug 1461036 we set [cache]memcache_dead_retry and [memcache]dead_retry to 30 (https://review.openstack.org/#/c/190248/). It was a big mistake. Because of this change keystone re-checks dead memcache servers every 30 seconds, which causes a socket_timeout delay every 30 seconds. This badly affects user experience during a node failure, as shown by a recent escalation from one of our big customers.
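
To get a feel for the cost, here is a rough back-of-the-envelope sketch in Python. It counts how often one keystone process stalls per hour while a single memcache server stays down. It assumes that every retry of the dead server blocks one request for the full socket_timeout; the function name and the 3-second socket_timeout are illustrative assumptions, not values taken from this deployment.

```python
def stalls_per_hour(dead_retry, socket_timeout=3.0):
    """Return (probes, seconds blocked) per hour for one client process
    while one memcache server stays down.

    Assumption: after each failed probe the server is skipped for
    dead_retry seconds, and the probe itself blocks for socket_timeout.
    """
    cycle = dead_retry + socket_timeout  # one dead period plus one failed probe
    probes = int(3600 // cycle)
    return probes, probes * socket_timeout


for value in (30, 300):
    probes, blocked = stalls_per_hour(value)
    print(f"dead_retry={value}: {probes} stalled requests, "
          f"{blocked:.0f}s blocked per hour")
```

Under these assumptions, raising dead_retry from 30 to 300 cuts the number of stalled requests roughly tenfold.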

Setting dead_retry to 30 does not even solve the original problem, in which one keystone was unable to find a token written by another keystone that had just come back up.

Consider the following situation:

1. The cluster has keystone-1, keystone-2, keystone-3 and memcache-1, memcache-2, memcache-3.
2. Initially everything is up:
   keystone-1 marks memcache-1, memcache-2, memcache-3 alive;
   keystone-2 marks memcache-1, memcache-2, memcache-3 alive;
   keystone-3 marks memcache-1, memcache-2, memcache-3 alive;
   haproxy knows about keystone-1, keystone-2, keystone-3.
3. keystone-3 and memcache-3 (being on the same controller) go down:
   keystone-1 marks memcache-1, memcache-2 alive and marks memcache-3 dead for N seconds;
   keystone-2 marks memcache-1, memcache-2 alive and marks memcache-3 dead for N seconds;
   haproxy knows about keystone-1, keystone-2.
4. keystone-3 and memcache-3 immediately come back up:
   keystone-1 marks memcache-1, memcache-2 alive; memcache-3 is still marked dead and will be considered dead for N seconds;
   keystone-2 marks memcache-1, memcache-2 alive; memcache-3 is still marked dead and will be considered dead for N seconds;
   keystone-3 (haproxy marks it alive 6-7 seconds after it comes up) marks memcache-1, memcache-2 AND memcache-3 alive;
   haproxy knows about keystone-1, keystone-2, keystone-3.

Since keystone-3 knows that memcache-3 is alive, it writes its token there.
keystone-1 does not know that memcache-3 is alive. It will look for the requested token in memcache-1 and memcache-2, will not find it, and will return 401 Unauthorized.
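
The scenario above can be replayed with a small toy model in Python. This is only a sketch: the `Keystone` class, the `run` helper, the timestamps, and the simple "hash, then skip dead servers" selection are all hypothetical simplifications for illustration, not the algorithm the real memcache client uses.

```python
import zlib

SERVERS = ["memcache-1", "memcache-2", "memcache-3"]


class Keystone:
    """Toy keystone process with its own private view of memcache health."""

    def __init__(self, name, stores, dead_retry):
        self.name = name
        self.stores = stores        # shared: server name -> {key: value}
        self.dead_retry = dead_retry
        self.dead_until = {}        # server name -> time until it is skipped

    def mark_dead(self, server, now):
        self.dead_until[server] = now + self.dead_retry

    def pick_server(self, key, now):
        # Hash the key to a server, then walk on past servers considered dead.
        start = zlib.crc32(key.encode()) % len(SERVERS)
        for offset in range(len(SERVERS)):
            server = SERVERS[(start + offset) % len(SERVERS)]
            if self.dead_until.get(server, 0) <= now:
                return server
        return None

    def set(self, key, value, now):
        self.stores[self.pick_server(key, now)][key] = value

    def get(self, key, now):
        server = self.pick_server(key, now)
        return None if server is None else self.stores[server].get(key)


def run(dead_retry):
    """Replay steps 3 and 4; return keystone-1's status codes (early, late)."""
    stores = {name: {} for name in SERVERS}
    k1 = Keystone("keystone-1", stores, dead_retry)
    k1.mark_dead("memcache-3", now=0)                 # step 3: memcache-3 fails
    k3 = Keystone("keystone-3", stores, dead_retry)   # step 4: fresh process
    # Choose a token that hashes to memcache-3, the interesting case.
    token = next(t for t in (f"token-{i}" for i in range(1000))
                 if k3.pick_server(t, now=10) == "memcache-3")
    k3.set(token, "user-data", now=10)
    early = 200 if k1.get(token, now=20) else 401
    late = 200 if k1.get(token, now=dead_retry + 1) else 401
    return early, late


for n in (30, 300):
    print(f"dead_retry={n}: keystone-1 answers {run(n)}")
```

In this model keystone-1 rejects the token for as long as it still considers memcache-3 dead, whatever the value of N; a smaller N only shortens the window.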

We should set [cache]memcache_dead_retry and [memcache]dead_retry to 300. This will increase the time keystone needs to notice that a memcache node has come back up. To prevent the situation described above, I suggest setting the haproxy check options for keystone like this:

check inter 10s fastinter 2s downinter 3s rise 150 fall 3

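Note the 150 in `rise`: haproxy requires 150 consecutive successful checks before it marks a server up, and at the `fastinter` interval of 2 seconds that takes about 300 seconds. This setting guarantees that keystone-3 will not be marked as alive for 300 seconds after going up, solving our issue.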
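
The arithmetic behind the `rise` value can be checked with a short Python sketch. It assumes that haproxy runs checks at the `fastinter` interval while the server is recovering and that every check succeeds, so the result is an approximation; the function name is made up for this example.

```python
def seconds_until_marked_up(rise, fastinter):
    """Approximate time haproxy needs to mark a recovering server up.

    Assumption: checks run every `fastinter` seconds during recovery and
    `rise` consecutive checks must succeed before the server is marked up.
    """
    return rise * fastinter


DEAD_RETRY = 300  # proposed value for both dead_retry options

delay = seconds_until_marked_up(rise=150, fastinter=2)
print(f"keystone-3 receives traffic after about {delay}s")
print("covers dead_retry window:", delay >= DEAD_RETRY)
```

By the time haproxy sends traffic to keystone-3 again, the other keystones have already retried memcache-3, so all three share the same view of the memcache pool.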

Changed in mos:
milestone: 6.1 → 6.1-updates
assignee: Fuel Library Team (fuel-library) → Oleksiy Molchanov (omolchanov)
status: New → Confirmed
Changed in mos:
milestone: 6.1-updates → 7.0
Bogdan Dobrelya (bogdando) wrote :

IIUC, fernet tokens in Kilo will not require a memcached backend in 7.0. If so, this issue does not apply to 7.0.

Changed in mos:
status: Confirmed → Invalid
Boris Bobrov (bbobrov) wrote :

No, it is valid for 7.0. This is also about the cache, which will still be used with fernet tokens. The change is also generally good for the memcache token backend, if someone still wants to use it.

Changed in mos:
status: Invalid → Incomplete
status: Incomplete → Confirmed
Oleksiy Molchanov (omolchanov) wrote :
Changed in mos:
status: Confirmed → In Progress
Changed in mos:
status: In Progress → Fix Committed
OSCI Robot (oscirobot) wrote :

NOTE: Changeset is not merged, created temporary package repository.
RPM package fuel-library6.0 has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0-6.0.0-6206.2.gerrit207360.1.git0fcf43f.noarch.rpm
fuel-library6.0-6.0.0-6206.2.gerrit207360.1.git0fcf43f.noarch.rpm
Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable-LP1471318/centos .

OSCI Robot (oscirobot) wrote :

NOTE: Changeset is not merged, created temporary package repository.
DEB package fuel-library has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0_6.0.0-6206.2.gerrit207360.1.git0fcf43f_all.deb
fuel-library6.0_6.0.0-6206.2.gerrit207360.1.git0fcf43f_all.deb
Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable-LP1471318/ubuntu .

Alex Ermolov (aermolov) wrote :

Marked {5.1, 5.1.1, 6.0}-updates as invalid because we have no way to deliver such updates for existing installations prior to 6.1.

Verified on MOS 7.0 ISO #265.

Changed in mos:
status: Fix Committed → Fix Released