Keystone becomes inoperable if there is no connectivity on br-mgmt

Bug #1438279 reported by Tatyanka
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Invalid
Importance: Critical
Assigned to: MOS Keystone
Milestone: (none)

Bug Description

{"build_id": "2015-03-26_09-08-29", "ostf_sha": "a4cf5f218c6aea98105b10c97a4aed8115c15867", "build_number": "231", "release_versions": {"2014.2-6.1": {"VERSION": {"build_id": "2015-03-26_09-08-29", "ostf_sha": "a4cf5f218c6aea98105b10c97a4aed8115c15867", "build_number": "231", "api": "1.0", "nailgun_sha": "7f0e0af1f54db840230745ee4f7aec6824dac9b9", "production": "docker", "python-fuelclient_sha": "e5e8389d8d481561a4d7107a99daae07c6ec5177", "astute_sha": "631f96d5a09cc48bfbddcbf056b946c8a80438f0", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "320b5f46fc1b2798f9e86ed7df51d3bda1686c10", "fuellib_sha": "345a98b34dd0cd450a45d405ac47a6a9fa48b6d8"}}}, "auth_required": true, "api": "1.0", "nailgun_sha": "7f0e0af1f54db840230745ee4f7aec6824dac9b9", "production": "docker", "python-fuelclient_sha": "e5e8389d8d481561a4d7107a99daae07c6ec5177", "astute_sha": "631f96d5a09cc48bfbddcbf056b946c8a80438f0", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "320b5f46fc1b2798f9e86ed7df51d3bda1686c10", "fuellib_sha": "345a98b34dd0cd450a45d405ac47a6a9fa48b6d8"}

Steps to reproduce:
1. Deploy HA on CentOS with Neutron:
- 3 controllers
- 2 computes
2. When the cluster is ready, run the OSTF HA, smoke and sanity suites
3. As soon as the tests pass, ssh to any controller and block input/output traffic on br-mgmt (one possible way is sketched after these steps)
4. Wait until the cluster recovers after the fail-over (I waited ~30 minutes)
5. Manually check RabbitMQ and Galera health, check crm
6. Try to log in to Horizon
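
For step 3, the report does not show the exact commands; one possible way to block all input/output traffic on br-mgmt with iptables (an assumption, not quoted from the report) is:

[root@node-5 ~]# iptables -I INPUT -i br-mgmt -j DROP
[root@node-5 ~]# iptables -I OUTPUT -o br-mgmt -j DROP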

Actual result:
Authorization failed. SSH to a node and execute ". openrc; nova list": the command fails with a 401 from Keystone.
Execute telnet to memcached on each controller: telnet to the controller where we blocked traffic fails (which is expected), while on the other 2 controllers we can successfully connect to memcached (see the sketch below).
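
The memcached check can be done with plain telnet to the default memcached port 11211 (the port and addresses here are assumptions; the report only says "telnet to memcached"):

[root@node-2 ~]# telnet <node-2-mgmt-ip> 11211   # connects on the healthy controllers
[root@node-2 ~]# telnet <node-5-mgmt-ip> 11211   # times out: br-mgmt traffic is blocked on node-5
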
On the controller where we blocked traffic, check the HAProxy backends for keystone:
[root@node-5 ~]# haproxy-status | grep keystone
2015/03/30 10:17:58 socat[4902] E connect(3, AF=1 "/var/lib/haproxy/stats", 24): Connection refused
[root@node-5 ~]#
Check the HAProxy backends for keystone from a healthy controller:
[root@node-2 ~]# haproxy-status | grep keystone
keystone-1 FRONTEND Status: OPEN Sessions: 0 Rate: 0
keystone-1 node-2 Status: UP/L7OK Sessions: 0 Rate: 0
keystone-1 node-4 Status: UP/L7OK Sessions: 0 Rate: 0
keystone-1 node-5 Status: DOWN/L4TOUT Sessions: 0 Rate: 0
keystone-1 BACKEND Status: UP Sessions: 0 Rate: 0
keystone-2 FRONTEND Status: OPEN Sessions: 0 Rate: 0
keystone-2 node-2 Status: UP/L7OK Sessions: 0 Rate: 0
keystone-2 node-4 Status: UP/L7OK Sessions: 0 Rate: 0
keystone-2 node-5 Status: DOWN/L4TOUT Sessions: 0 Rate: 0
keystone-2 BACKEND Status: UP Sessions: 0 Rate: 0
[root@node-2 ~]#
(node-5 is the node with blocked traffic)

Run ". openrc; nova list" one more time from a healthy controller: the result is a 401 from Keystone.

Edit keystone.conf on both healthy controllers: remove the node with the failed memcached from the [memcache] and [cache] sections (sketched below), restart Keystone on both controllers, then run ". openrc; keystone token-get" - it passes.
Run ". openrc; nova list" - it fails with a 401 error from Keystone, the user cannot pass authorization in Horizon, and services also fail to communicate (apparently Keystone returns 401 errors all the time):
http://paste.openstack.org/show/197570/
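
The keystone.conf edit described above looks roughly like this (a sketch with placeholder addresses; the [memcache]/[cache] option names are the stock Keystone ones and are not quoted in the report):

# before, on node-2 and node-4:
[memcache]
servers = <node-2-mgmt-ip>:11211,<node-4-mgmt-ip>:11211,<node-5-mgmt-ip>:11211
[cache]
memcache_servers = <node-2-mgmt-ip>:11211,<node-4-mgmt-ip>:11211,<node-5-mgmt-ip>:11211

# after removing node-5 (the node with blocked traffic), followed by a keystone restart:
[memcache]
servers = <node-2-mgmt-ip>:11211,<node-4-mgmt-ip>:11211
[cache]
memcache_servers = <node-2-mgmt-ip>:11211,<node-4-mgmt-ip>:11211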

Tags: ha keystone
Boris Bobrov (bbobrov) wrote :

Why is it "critical"?

Alexander Makarov (amakarov) wrote :

At first glance it looks like tokens are getting lost

Boris Bobrov (bbobrov)
Changed in mos:
assignee: MOS Keystone (mos-keystone) → Boris Bobrov (bbobrov)
Boris Bobrov (bbobrov) wrote :

driver=keystone.token.backends.memcache.Token is in keystone.conf.
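
That driver means tokens are kept only in memcached, which is consistent with tokens "getting lost" when one memcached node becomes unreachable; in keystone.conf it corresponds to something like (only the driver line is quoted above, the section layout is an assumption):

[token]
driver = keystone.token.backends.memcache.Token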

This is a possible duplicate of bug https://bugs.launchpad.net/fuel/6.0.x/+bug/1405549 .

Boris Bobrov (bbobrov)
Changed in mos:
assignee: Boris Bobrov (bbobrov) → MOS Keystone (mos-keystone)
Tatyanka (tatyana-leontovich) wrote :

After a discussion with Alex M, moving the issue to Incomplete to reproduce on a fresh ISO.

Changed in mos:
status: New → Incomplete
Tatyanka (tatyana-leontovich) wrote :

{"build_id": "2015-03-30_12-53-07", "ostf_sha": "674a08e57a451c902b9ad27edd7e57a6b8f36d4a", "build_number": "249", "release_versions": {"2014.2-6.1": {"VERSION": {"build_id": "2015-03-30_12-53-07", "ostf_sha": "674a08e57a451c902b9ad27edd7e57a6b8f36d4a", "build_number": "249", "api": "1.0", "nailgun_sha": "8bc89eee197089ae38a023dd0215caae219f24b1", "production": "docker", "python-fuelclient_sha": "05ec53f94206decdce19bb9373523022e5616b83", "astute_sha": "f595715750a2c4820722a96e0236f5c89ca6521c", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "320b5f46fc1b2798f9e86ed7df51d3bda1686c10", "fuellib_sha": "6d366b4e7d2d6722c245c4691a6605e2e3bc3b4a"}}}, "auth_required": true, "api": "1.0", "nailgun_sha": "8bc89eee197089ae38a023dd0215caae219f24b1", "production": "docker", "python-fuelclient_sha": "05ec53f94206decdce19bb9373523022e5616b83", "astute_sha": "f595715750a2c4820722a96e0236f5c89ca6521c", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "320b5f46fc1b2798f9e86ed7df51d3bda1686c10", "fuellib_sha": "6d366b4e7d2d6722c245c4691a6605e2e3bc3b4a"}

Looked at the openrc file and found the reason: each Keystone instance works over its own internal IP (not the public VIP), which leads to this strange behavior (sketched below).
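
A sketch of what that means in the openrc file (the values here are placeholders; only the internal-IP-instead-of-public-VIP fact is stated above):

# found on each controller:
export OS_AUTH_URL="http://<controller-own-internal-ip>:5000/v2.0/"
# expected:
export OS_AUTH_URL="http://<public-vip>:5000/v2.0/"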

Changed in mos:
status: Incomplete → Invalid