Request for HA deployment

Bug #1284002 reported by Nastya Urlapova
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Invalid | High | Nikolay Fedotov |

Bug Description

Nikolay,
please check the following scenarios on ISO #181 in an HA deployment with Neutron:
- reboot primary controller
- reboot non-primary controller

Changed in fuel:
milestone: none → 4.1
Mike Scherbakov (mihgen)
Changed in fuel:
status: New → Confirmed
information type: Private Security → Public
Nikolay Fedotov (nfedotov) wrote :

Below you can find the scenarios and test results. They were carried out in the order listed and may depend on each other.
Preconditions:
- Deployed environment: CentOS, HA mode, Neutron with GRE segmentation, 3 controllers, 2 compute, 1 cinder

#1.
- Deploy environment

Result:
The L3 agent is not started on the second controller.

#2.
- Launch instance with both net04 and net04_ext attached to it.
- Try to ping the instance from controller (controller has configured interfaces in network namespace) with "ip netns exec qrouter-XXX ping <instance ip>"

Result:
No response was received, but the OSTF test "Check that VM is accessible via floating IP address" passed.
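
The check in scenario #2 could be scripted roughly as follows. This is only a sketch: `build_netns_ping` and `ping_from_router_namespace` are hypothetical helper names, and the router ID and instance IP are placeholders, not values from this environment.

```python
import subprocess


def build_netns_ping(router_id, instance_ip, count=3):
    """Return the argv for pinging an instance from inside a qrouter namespace."""
    return ["ip", "netns", "exec", f"qrouter-{router_id}",
            "ping", "-c", str(count), instance_ip]


def ping_from_router_namespace(router_id, instance_ip, count=3):
    """Run the ping on the controller (needs root); True if ping exits with status 0."""
    result = subprocess.run(build_netns_ping(router_id, instance_ip, count),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Only `build_netns_ping` is safe to call anywhere; `ping_from_router_namespace` has to run on the controller that holds the namespace.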

#3.
- Turn off the controller (controller has configured interfaces in network namespace). It is assumed it is possible to ping an instance from the controller

Result:
The interfaces of the virtual router were not configured on / migrated to another controller.

#4.
- Turn on the controller.

Result:
There is no qdhcp-XXX network namespace.
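
A minimal sketch of how the missing namespace in #4 could be detected from the text output of `ip netns list`. The helper name `namespaces_with_prefix` and the sample namespace names are made up for illustration.

```python
def namespaces_with_prefix(netns_list_output, prefix):
    """Return namespace names from `ip netns list` output that start with prefix."""
    names = []
    for line in netns_list_output.splitlines():
        parts = line.split()
        # Newer iproute2 versions append "(id: N)" after the name; keep the name only.
        if parts and parts[0].startswith(prefix):
            names.append(parts[0])
    return names


# Hypothetical output captured on a controller that has a router but no DHCP namespace.
sample = "qrouter-aaa\n"
missing_dhcp = not namespaces_with_prefix(sample, "qdhcp-")
```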

Nikolay Fedotov (nfedotov) wrote :
Vladimir Kuklin (vkuklin) wrote :

1. The L3 agent may not start on the second controller right after deployment, as it is not strictly required there. This is expected behaviour.
2. This is the expected behaviour of Neutron.
3. We need to see the logs and which actions were taken to verify that the interfaces are not up.
4. The qdhcp namespace is created only after the dhcp-agent is started on the controller, and by default Pacemaker will not migrate resources to the newer controller.

I am closing this bug as invalid - please pick only the 3rd case and provide all the required bug information.

Changed in fuel:
status: Confirmed → Invalid
Mike Scherbakov (mihgen)
Changed in fuel:
status: Invalid → Incomplete
importance: Critical → High
Mike Scherbakov (mihgen) wrote :

Nick, please try the following instead:
1) deploy HA w/Neutron, run OSTF. make sure all green. Provision VM, start pinging Internet.
2) reboot primary controller (find one with router), and run OSTF at the same time. should be green, at least for a second run (after certain timeout). ping from VM should not be interrupted for more than 30 sec (Vladimir, how long does it take to switch router?)
3) repeat #1, #2 action with non-primary controller.
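
The 30-second criterion in step 2 could be checked from a saved ping log along these lines. This is a sketch under assumptions: the ping runs at the default one-second interval, the log uses the usual `icmp_seq=N` (or busybox-style `seq=N`) reply lines, and `longest_gap_seconds` is a hypothetical helper name. Sequence-number wraparound is ignored.

```python
import re

# Matches both "icmp_seq=12" (iputils ping) and "seq=12" (busybox ping).
SEQ_RE = re.compile(r"(?:icmp_)?seq=(\d+)")


def longest_gap_seconds(ping_output, interval=1.0):
    """Estimate the longest run of lost replies, in seconds, from ping output."""
    seqs = []
    for line in ping_output.splitlines():
        # Count successful replies only; error lines may also carry a sequence number.
        if "bytes from" not in line:
            continue
        match = SEQ_RE.search(line)
        if match:
            seqs.append(int(match.group(1)))
    longest = 0
    for prev, cur in zip(seqs, seqs[1:]):
        longest = max(longest, cur - prev - 1)
    return longest * interval
```

A result above 30 would mean the interruption exceeded the limit proposed above. An outage that never recovers, as in a log whose replies simply stop, has to be detected separately because there is no later reply to close the gap.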

Nikolay Fedotov (nfedotov) wrote :

Test results:
>> 1) deploy HA w/Neutron, run OSTF. make sure all green. Provision VM, start pinging Internet.
All OSTF tests are green. It is possible to ping the Internet from a VM.

>> 2) reboot primary controller (find one with router), and run OSTF
All OSTF tests are red / failed
>> ping from VM should not be interrupted for more than 30 sec
Instance can not ping internet. "ping: bad address"

Nikolay Fedotov (nfedotov) wrote :
Mike Scherbakov (mihgen) wrote :

Nikolay, we need to go deeper :) We need to localize where the issues happen. https://bugs.launchpad.net/fuel/+bug/1285449 could be the cause of the failure.

Changed in fuel:
assignee: Nikolay Fedotov (nfedotov) → Sergii Golovatiuk (sgolovatiuk)
assignee: Sergii Golovatiuk (sgolovatiuk) → Nikolay Fedotov (nfedotov)
Changed in fuel:
status: Incomplete → Invalid