[doc] HA reference architecture should be revised in docs

Bug #1415398 reported by Bogdan Dobrelya
This bug affects 1 person
Affects             Status     Importance  Assigned to              Milestone
Fuel for OpenStack  Won't Fix  High        Fuel Documentation Team
6.0.x               Won't Fix  High        Fuel Documentation Team
6.1.x               Won't Fix  High        Fuel Documentation Team
7.0.x               Won't Fix  High        Fuel Documentation Team
8.0.x               Won't Fix  Medium      Fuel Documentation Team
Future              Won't Fix  High        Fuel Documentation Team
Mitaka              Won't Fix  High        Fuel Documentation Team

Bug Description

This issue is related to https://bugs.launchpad.net/fuel/+bug/1326605 but it is broader and more specific.
Please note that the most important missing part is the HA reference architecture for SDN deployed by Fuel, so I'd like this bug to have high priority.

The HA reference architecture is obsolete and should be revised:

0) The documentation should:
* explain why the current HA reference architecture requires at least 3 controller nodes
* explain the downsides of the 2 nodes + arbitrator node case
* reference the OpenStack HA guide, which explains what Active/Active (A/A) and Active/Passive mean: http://docs.openstack.org/high-availability-guide/content/index.html
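
The first two points largely come down to majority-quorum arithmetic. A minimal sketch, assuming the cluster components use a simple majority rule (the function names are illustrative and do not come from Fuel):

```python
def quorum_size(nodes: int) -> int:
    """Smallest strict majority of a cluster with `nodes` voting members."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Number of members that can fail while the rest still hold a majority."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"nodes={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With 2 voting members the loss of either one loses the majority, while 3 members tolerate one failure. An arbitrator would add a third vote, but presumably not a third copy of the data or a third set of services, which is the kind of downside the docs should spell out.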

1) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#multi-node-with-ha-deployment
* The drawing http://docs.mirantis.com/openstack/fuel/fuel-6.0/_images/deployment-ha-compact.svg is obsolete:
* The MySQL cluster should be depicted as active/standby, and notes about the OpenStack support status for multi-master writes should be provided in comments as well.
* RabbitMQ cluster should be depicted as active/active

2) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#how-fuel-deploys-ha
* The info about the RabbitMQ AMQP cluster HA reference architecture is missing. I guess it should be somewhere near http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#mysql-and-galera

3) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#ha-logical-setup
* The drawing http://docs.mirantis.com/openstack/fuel/fuel-6.0/_images/logical-diagram-controller.svg should be done for a 3-controller cluster instead of 2, if that would not overcomplicate the drawing too much.
* The depicted MySQL cluster should be changed to active/standby, and notes about the OpenStack support status for multi-master writes should be provided in comments as well.
* Details are needed about HA for Neutron agents (some details can be found here: https://github.com/stackforge/fuel-docs/blob/master/pages/frequently-asked-questions/0300-other-questions.rst but they are obsolete).
This issue looks critical, as HA for SDN is the most important part from the ops perspective. For example, when the DB and messaging clusters are completely shut down and the cloud cannot process any requests, running instances must at least still be able to use L3 services such as routing. Notes about HA for agents could also be provided here: http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#network-architecture

4) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#details-of-multi-node-with-ha-deployment
* The drawing http://docs.mirantis.com/openstack/fuel/fuel-6.0/_images/ha-overview.svg is obsolete:
* Most OpenStack services do not use haproxy for AMQP anymore and are directly connected to the controllers as a shifted many-to-many mesh
* Memcached should be depicted as well as it plays an important role for caching requests and tokens
* Ceilometer with Mongo cluster should be also depicted
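
One possible reading of the "shifted many-to-many mesh" for AMQP is that every service is given the full list of RabbitMQ nodes, but each client starts from a different position in that list. A minimal sketch under that assumption (the host names and the `shifted_hosts` helper are hypothetical, not what Fuel actually generates):

```python
def shifted_hosts(hosts, client_index):
    """Rotate the broker list so that each client starts from a different broker.

    Every client still knows all brokers (many-to-many), only the order differs.
    """
    if not hosts:
        return []
    k = client_index % len(hosts)
    return hosts[k:] + hosts[:k]


if __name__ == "__main__":
    # Hypothetical controller names; 5672 is the default AMQP port.
    brokers = ["node-1:5672", "node-2:5672", "node-3:5672"]
    for idx in range(4):
        print(idx, ",".join(shifted_hosts(brokers, idx)))
```

Spreading the starting broker this way keeps clients from all connecting to the same node first, while each of them can still fail over to any other node in its list.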

5) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#how-ha-with-pacemaker-and-corosync-works
- The Pacemaker info is obsolete and should be revised: details are needed about multi mode for Neutron agents and about RabbitMQ, and the description of Pacemaker resources is inaccurate.
- Corosync info:
* Details about the Pacemaker startup mode (as a plugin for Corosync, ver: 0) should be added.
* Details are needed about the Fuel Astute orchestration hooks for scaling the Corosync cluster's node list

6) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#neutron-with-gre-segmentation-and-ovs
Perhaps some tuning notes, such as GRO (generic receive offload) on physical NICs when using the Neutron GRE network topology, could be put here as well.
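
As an illustration of what such a tuning note might show, GRO can be toggled per interface with `ethtool -K`. The sketch below only builds the command line. The interface name is hypothetical, and whether GRO should be on or off for a given GRE setup is left open here, so the helper takes it as a parameter:

```python
def gro_command(device: str, enabled: bool) -> list:
    """Build the argv that sets generic receive offload on a NIC via ethtool."""
    if not device:
        raise ValueError("device name must not be empty")
    return ["ethtool", "-K", device, "gro", "on" if enabled else "off"]


if __name__ == "__main__":
    # "eth0" is a placeholder for the physical NIC carrying the GRE traffic.
    print(" ".join(gro_command("eth0", enabled=False)))
```

The resulting list can be passed to `subprocess.run` on the node; the current setting can be inspected with `ethtool -k <device>`.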

7) http://docs.mirantis.com/openstack/fuel/fuel-6.0/reference-architecture.html#advanced-network-configuration-using-open-vswitch
Perhaps some generic notes about how to integrate Fuel with existing userspace OVS acceleration solutions could be put here as well.

8) The HA failover "SLA" should be described ( https://bugs.launchpad.net/fuel/+bug/1384285 ), for example:
- For Neutron agents (which directly impact instance connectivity), VIPs, MySQL, RabbitMQ, API endpoints:
* what is the impact on the cloud control plane (such as full downtime), and what is the impact on the data plane (such as temporary L3+ isolation of all instances)?
* how long could it take? (30 seconds for API endpoints, for example)
* how many cascading failures of controller nodes could the cluster survive without a complete downtime?
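
A figure such as "30 seconds for API endpoints" could be backed by probing the endpoint during a failover test and measuring the longest gap in availability. A minimal sketch of that calculation, assuming the probe results were already collected as (timestamp, ok) pairs (the helper name and the data layout are hypothetical):

```python
def longest_outage(samples):
    """Return the longest outage, in seconds, seen in a series of probes.

    `samples` is an iterable of (timestamp_seconds, ok) pairs sorted by time.
    An outage runs from the first failed probe to the next successful one;
    an outage that never recovers within the samples is not counted.
    """
    longest = 0.0
    outage_start = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None
    return longest


if __name__ == "__main__":
    probes = [(0, True), (5, False), (10, False), (35, True), (40, True)]
    print(longest_outage(probes))
```

The same calculation could be applied per component (VIPs, MySQL, RabbitMQ, Neutron agents), with control plane and data plane probes kept separate.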

9) Networking configuration section should explain all namespaces, bridges and interfaces Fuel configures at the networking deployment stage, such as:
* why does br-ex-hapr have the same IP as br-ex
* how traffic flow works from outside to the haproxy

actual result
version
expected result
steps to reproduce
free
no sms and registration

Changed in fuel:
milestone: none → 6.0.1
assignee: nobody → Fuel Documentation Team (fuel-docs)
importance: Undecided → High
description: updated
Meg McRoberts (dreidellhasa) wrote : Re: [Bug 1415398] [NEW] HA reference architecture should be revised in docs

Thanks! I'll study this more carefully tomorrow. But this should give me a
very good start at fixing this mess!


Changed in fuel:
status: New → Confirmed
description: updated
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-docs (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/153283

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/154130

tags: added: ha-guide
Changed in fuel:
milestone: 6.0.1 → 6.1
status: Confirmed → New
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-docs (master)

Reviewed: https://review.openstack.org/154130
Committed: https://git.openstack.org/cgit/stackforge/fuel-docs/commit/?id=27625571f30f6660db7ebf54730ec72c4cd88048
Submitter: Jenkins
Branch: master

commit 27625571f30f6660db7ebf54730ec72c4cd88048
Author: Bogdan Dobrelya <email address hidden>
Date: Mon Feb 9 18:08:35 2015 +0100

    Add HA deployment for Networking section

    Add basic content for Networking HA section.
    Provide decription of haproxy namespace and VIP
    HA failover details.
    TODO - provide details for Neutron agents HA, failover,
    cleanup and reschedule procedures.

    Related bug: #1415398

    Change-Id: I4485da25bdbab98c9496e7c13598c2dcd1ea2eb6
    Signed-off-by: Bogdan Dobrelya <email address hidden>

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/153283
Committed: https://git.openstack.org/cgit/stackforge/fuel-docs/commit/?id=70d412ce5f256f4091b2478956107f719e547c7d
Submitter: Jenkins
Branch: master

commit 70d412ce5f256f4091b2478956107f719e547c7d
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Feb 5 17:24:12 2015 +0100

    Describe HA controller role more precisely

    * Heat is a core component and always installed.
    * Describe which HA components Fuel configures as
      Active/Active and which ones as Active/Passive.
    * Fix controller node HA deployment details for RabbitMQ, MySQL,
      HAproxy, Neutron agents.
    * Describe why it is required >=3 controller nodes for HA
    * Provide a reference to HA-guide as we don't want to duplicate
      the description of basic HA concepts in Fuel docs.
    * Fix terminology for HA stanza, add missing term references
    * Remove duplicating sections for controller node description and
      terms

    Related bug: #1415398

    Change-Id: I5b1d6830b43de871120157448c6df68ccd327c9d
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in fuel:
status: New → Confirmed
description: updated
description: updated
tags: added: ha
Changed in fuel:
status: Confirmed → Triaged
description: updated
description: updated
tags: added: docs
summary: - HA reference architecture should be revised in docs
+ [doc] HA reference architecture should be revised in docs
description: updated
Sheena Conant (sheena-conant) wrote :

Is there any reason we should be backporting these updates to 6.0/6.1? I would recommend leaving the old versions as they are and updating docs in time for 8.0.

Changed in fuel:
milestone: 6.1 → 9.0
status: Triaged → New
tags: added: area-docs
Michele Fagan (michelefagan) wrote :

Request for new and rewritten content. Moved to Medium priority. Will be addressed in 9.0.

Bug Checker Bot (esikachev-l) wrote : Autochecker

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
description: updated
tags: removed: need-info
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 9.0 → 10.0
Changed in fuel:
status: Confirmed → Won't Fix