Node doesn't come back online after br-mgmt interface shutdown and recovery

Bug #1367298 reported by Kirill Omelchenko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: New
Importance: High
Assigned to: Fuel Library (Deprecated)
Milestone: (none listed)

Bug Description

{
    "build_id": "2014-09-09_00-01-11",
    "ostf_sha": "f7b5d9d0d1cfaba5f1fe1e2c634493e92bce11db",
    "build_number": "505",
    "auth_required": true,
    "api": "1.0",
    "nailgun_sha": "7683df5722975c1cae48a1a3efdad872b4aaace6",
    "production": "docker",
    "fuelmain_sha": "6cdd8c3deaa5e52806a5c75c4177f3b41d157a22",
    "astute_sha": "b622d9b36dbdd1e03b282b9ee5b7435ba649e711",
    "feature_groups": [
        "mirantis"
    ],
    "release": "5.1",
    "fuellib_sha": "203ef3179007cffe3236032e61ecbaf1cd20605f"
}

Steps to reproduce:
1. Deploy an HA environment on CentOS with Neutron VLAN and 3 controllers.
2. When the deployment finishes successfully, ssh to a controller and check where the VIPs are running (crm_mon -1).
3. ssh to the node where vip__management is running (controller A).
4. Shut down the br-mgmt interface on controller A.
5. Run crm_mon -1 on one of the two other controllers (controller B) to check whether vip__management has recovered.
6. Visit the Horizon dashboard. The cluster seems to operate well (volumes, instances, etc. can be created).
7. Try to run OSTF.
  * Every test fails with 'Keystone client is not available.'
  * The following errors appear in /var/log/keystone-all.log: http://paste.openstack.org/show/108906/
8. Bring the br-mgmt interface back up.
9. Run crm_mon -1 on controller A.

Expected:
The crm_mon status is shown.

Actual output:
Could not establish cib_ro connection: Connection refused (111)

Connection to cluster failed: Transport endpoint is not connected

10. On controller B, run crm_mon -1.

Expected:
The status output shows that all three controllers are online.

Actual: http://paste.openstack.org/show/108904/
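
The checks in steps 9 and 10 can be automated. The following is a minimal sketch, not part of the original report: it assumes that crm_mon -1 prints node membership on lines of the form "Online: [ ... ]" and "OFFLINE: [ ... ]", and the node names used in the usage note are hypothetical, since the pasted outputs are not reproduced here.

```python
import re
import subprocess


def parse_node_states(crm_mon_text):
    """Return (online, offline) sets of node names found in `crm_mon -1` text.

    Assumes membership lines look like "Online: [ a b ]" / "OFFLINE: [ c ]".
    """
    online, offline = set(), set()
    for line in crm_mon_text.splitlines():
        match = re.match(r"\s*(Online|OFFLINE):\s*\[(.*?)\]", line)
        if not match:
            continue
        names = match.group(2).split()
        if match.group(1) == "Online":
            online.update(names)
        else:
            offline.update(names)
    return online, offline


def missing_nodes(crm_mon_text, expected):
    """Return the expected node names that are not reported as online."""
    online, _ = parse_node_states(crm_mon_text)
    return set(expected) - online


def run_crm_mon():
    """Run `crm_mon -1` locally and return (exit code, combined output)."""
    result = subprocess.run(
        ["crm_mon", "-1"], capture_output=True, text=True, check=False
    )
    return result.returncode, result.stdout + result.stderr
```

On controller B, passing the output of run_crm_mon() to missing_nodes() together with the three controller names (for example the hypothetical node-1, node-2, node-3) should return an empty set once controller A has rejoined. On controller A, a non-zero exit code from run_crm_mon() would correspond to the connection failure shown in step 9.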

Kirill Omelchenko (komelchenko) wrote :
description: updated
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Vladimir Kuklin (vkuklin) wrote :

This is an invalid test case, as Corosync does not really handle an ifconfig down of its interfaces. Please simulate failover by shutting down the interfaces on the switch or on the host system.
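
For a virtualized lab, one way to follow this advice is to cut the link from the hypervisor instead of inside the guest. The following is a minimal sketch under a loud assumption that the report does not confirm: the controllers are libvirt guests managed with virsh. The domain name and interface target in the usage note are hypothetical.

```python
import subprocess


def domif_setlink_cmd(domain, interface, state):
    """Build a `virsh domif-setlink` command that sets a guest NIC link state.

    `domain` is the libvirt guest name and `interface` is the guest's
    interface target device or MAC address; both are placeholders here.
    """
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    return ["virsh", "domif-setlink", domain, interface, state]


def set_guest_link(domain, interface, state):
    """Run the command on the hypervisor; raises CalledProcessError on failure."""
    subprocess.run(domif_setlink_cmd(domain, interface, state), check=True)
```

Calling set_guest_link("controller-a", "vnet1", "down") on the hypervisor would drop the link as seen by the guest, and the same call with "up" would restore it. On physical hardware the equivalent is disabling the port on the switch, as the comment suggests.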
