Ubuntu HA with neutron, rabbit cluster is completely broken after failovers
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | New | Critical | Fuel Library (Deprecated) |
Bug Description
{"build_id": "2014-12-
Steps:
1. Deploy Ubuntu HA with neutron: 3 controllers, 2 computes
2. When the cluster is ready, run OSTF (it passed)
3. Turn off a non-primary controller, wait while the cluster recovers, and run OSTF (it passed)
4. Turn on the non-primary controller, sync time on it (if needed), wait while the cluster recovers, and run OSTF (it passed again)
5. Turn off the primary controller
6. Wait 20 minutes
7. Run the OSTF HA suite
Actual result:
The test on rabbit failed.
crm_mon -1 reports the following:
Online: [ node-4 node-5 ]
OFFLINE: [ node-1 ]
vip__public (ocf::mirantis:
Clone Set: clone_ping_
Started: [ node-4 node-5 ]
vip__management (ocf::mirantis:
Clone Set: clone_p_heat-engine [p_heat-engine]
Started: [ node-4 node-5 ]
Master/Slave Set: master_
Slaves: [ node-4 node-5 ]
Clone Set: clone_p_
Started: [ node-4 node-5 ]
p_neutron-
Clone Set: clone_p_
Started: [ node-4 node-5 ]
Clone Set: clone_p_
Started: [ node-4 node-5 ]
Clone Set: clone_p_mysql [p_mysql]
Started: [ node-4 node-5 ]
Clone Set: clone_p_haproxy [p_haproxy]
Started: [ node-4 node-5 ]
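When repeating the failover steps, it can help to pull the offline nodes out of the `crm_mon -1` output automatically rather than reading it by eye. The following is a minimal sketch, assuming the `OFFLINE: [ ... ]` line format shown in the output above; the function name `offline_nodes` is hypothetical and not part of Pacemaker or Fuel.

```shell
#!/bin/sh
# Hypothetical helper: print the nodes Pacemaker reports as offline, one per line.
# Reads `crm_mon -1` output on stdin and assumes the "OFFLINE: [ node ... ]"
# line format seen in this report.
offline_nodes() {
    awk '/^[[:space:]]*OFFLINE:/ {
        for (i = 1; i <= NF; i++)
            if ($i != "OFFLINE:" && $i != "[" && $i != "]")
                print $i
    }'
}
```

On a controller it could be used as `crm_mon -1 | offline_nodes`; empty output would mean that no node is listed as offline.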
root@node-
Cluster status of node 'rabbit@node-5' ...
Error: unable to connect to node 'rabbit@node-5': nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@node-5']
rabbit@node-5:
* connected to epmd (port 4369) on node-5
* epmd reports: node 'rabbit' not running at all
* suggestion: start the node
current node details:
- node name: 'rabbitmqctl183
- home dir: /var/lib/rabbitmq
- cookie hash: soeIWU2jk2YNseT
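The `nodedown` error above is easy to check for in a script when polling several controllers. This is only a sketch, assuming the error line has exactly the wording shown in this report; the function name `rabbit_node_down` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the rabbitmqctl output on stdin
# contains the "nodedown" connection error seen in this report.
rabbit_node_down() {
    grep -q "^Error: unable to connect to node .*: nodedown"
}
```

For example, `rabbitmqctl cluster_status 2>&1 | rabbit_node_down && echo "rabbit is down"` would flag a node in the state shown for node-5.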
Status on node-4:
[root@nailgun ~]# ssh node-4
Warning: Permanently added 'node-4' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-40-generic x86_64)
* Documentation: https:/
New release '14.04.1 LTS' available.
Run 'do-release-
Last login: Fri Dec 12 14:42:32 2014 from 10.120.0.2
root@node-4:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-4' ...
[{nodes,
{running_
{cluster_
{partitions,[]}]
...done.
root@node-4:~#
This bug is not a dup of https://bugs.launchpad.net/bugs/1394635. I'm working on this case as well as on related fixes for https://bugs.launchpad.net/fuel/+bug/1396946, so it is probably a dup of https://bugs.launchpad.net/fuel/+bug/1396946.