ovn vip should not be tied to haproxy

Bug #1841811 reported by Michele Baldessari
This bug affects 1 person
Affects   Status         Importance   Assigned to          Milestone
tripleo   Fix Released   High         Michele Baldessari

Bug Description

At the time we merged the OVN work in tripleo we thought that there was value in just leveraging the haproxy VIP code bits.

This means that OVS-DB will simply reuse the VIP that is being created for the network on which it is running (typically internal_api).

This was largely a mistake and we should not have done it, because that VIP is bound to the haproxy service (i.e. that VIP can only live on the role where haproxy runs, aka the controller). That means that 1) we currently cannot split off ovn-dbs to a separate role/node and 2) we have an extra, unneeded constraint tying this VIP to haproxy, which is something we probably do not want.

This was actually caught as a real-world problem in rhbz#1728118: during a deploy, when the OVN resource gets created at step 3, it forces a move of the internal VIP, which then makes all connections to mysql (amongst others) fail.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/679244

Changed in tripleo:
status: Triaged → In Progress
milestone: train-3 → ussuri-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/678168
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=139334c200cc2a887fadf2daf0d24aa0bb3d1a29
Submitter: Zuul
Branch: master

commit 139334c200cc2a887fadf2daf0d24aa0bb3d1a29
Author: Michele Baldessari <email address hidden>
Date: Tue Sep 10 09:16:06 2019 +0200

    Move OVN VIP to all_nodes and consider if OVN is configured for a separate VIP

    In the same vein as I7ca94dff4acf0816708110b9fe6f78d19dcc7b4d
    (Move redis_vip to all_nodes.j2) we want to have the ovn_dbs_vip moved
    to all nodes, because we ultimately want to revert
    I0d9eb663405d1113ea84e3c12651a3f0dbdfc75d (Add OvnDbInternal to EndpointMap
    and use it for ovn_db_host) as that forces all connections to OVN to go
    through the haproxy vip.

    We set ovn_dbs_vip to:
    A) the VIP of the network that ovn_dbs runs on if net_vip_map.ovn_dbs is not defined
    B) the separate VIP if net_vip_map.ovn_dbs is defined

    The THT patch that makes the OVN VIP separate is
    in I620e37117c26b5b51bf9e1eda91daeb00fdf0f43.

    We choose this approach of setting the ovn_dbs_vip in two different
    cases because it makes landing all the patches to have OVN live with
    a separate VIP a lot simpler (fewer dependencies).
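
The two cases described above can be sketched in Python as follows. This is only an illustration of the selection rule, not the actual tripleo-ansible template: the function name `pick_ovn_dbs_vip`, the `service_net_map` lookup key `ovn_dbs_network`, and the sample addresses are assumptions. Only `net_vip_map.ovn_dbs` and `ovn_dbs_vip` are named in the commit message.

```python
def pick_ovn_dbs_vip(net_vip_map, service_net_map):
    """Return the VIP that OVN DBS clients should connect to."""
    # Case B: a dedicated OVN VIP has been defined, so use it.
    if "ovn_dbs" in net_vip_map:
        return net_vip_map["ovn_dbs"]
    # Case A: fall back to the VIP of the network that ovn_dbs runs on.
    # The key name "ovn_dbs_network" is an assumption for this sketch.
    network = service_net_map["ovn_dbs_network"]
    return net_vip_map[network]


# Illustrative data only; the addresses are sample values.
service_net_map = {"ovn_dbs_network": "internal_api"}
shared = {"internal_api": "172.16.2.168"}
separate = {"internal_api": "172.16.2.168", "ovn_dbs": "172.16.2.4"}

print(pick_ovn_dbs_vip(shared, service_net_map))
print(pick_ovn_dbs_vip(separate, service_net_map))
```

Handling both cases in one place is what lets the dependent patches land in any order: until the separate VIP exists, consumers simply keep seeing the per-network VIP.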

    Tested as follows:
    1) Deployed a vanilla tripleo master deployment with this patch and
    observed that the OVN VIP is set to the VIP on the internal_api network:
    [root@overcloud-controller-2 ~]# pcs resource show ovn-dbs-bundle |grep $(hiera -c /etc/puppet/hiera.yaml ovn_dbs_vip)
       Attributes: inactive_probe_interval=180000 manage_northd=yes master_ip=172.16.2.168 nb_master_port=6641 sb_master_port=6642
    [root@overcloud-controller-2 ~]# pcs resource show ovn-dbs-bundle |grep $(hiera -c /etc/puppet/hiera.yaml internal_api_virtual_ip)
       Attributes: inactive_probe_interval=180000 manage_northd=yes master_ip=172.16.2.168 nb_master_port=6641 sb_master_port=6642

    2) Also applied I4e4bf0a91751fb4f9e4c7233242cdc5649c421f8 "Revert Add
    OvnDbInternal to EndpointMap and use it for ovn_db_host" and observed
    that the deploy completes fine and that the OVN VIP is still pointing to
    the internal_api one:
    [root@overcloud-controller-2 ~]# pcs resource show ovn-dbs-bundle |grep $(hiera -c /etc/puppet/hiera.yaml ovn_dbs_vip)
       Attributes: inactive_probe_interval=180000 manage_northd=yes master_ip=172.16.2.220 nb_master_port=6641 sb_master_port=6642
    [root@overcloud-controller-2 ~]# pcs resource show ovn-dbs-bundle |grep $(hiera -c /etc/puppet/hiera.yaml internal_api_virtual_ip)
       Attributes: inactive_probe_interval=180000 manage_northd=yes master_ip=172.16.2.220 nb_master_port=6641 sb_master_port=6642

    3) Also applied I620e37117c26b5b51bf9e1eda91daeb00fdf0f43
    "OVN DBS separate vip" (tht) and Ic62b0fbc0fee40638811a5cd77a5dc5a4d82acf5
    "OVN separate vip" (puppet-tripleo) with this review and observed that the OVN VIP is
    not the same as the internal_api one:
    [root@overcloud-controller-2 hieradata]# hiera -c /etc/puppet/hiera.yaml ovn_dbs_vip
    172.16.2.4
    [root@overcloud-controller-2 hieradata...
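
The grep checks in the test steps above can also be expressed as an exact comparison. The sketch below is a hypothetical helper (not part of TripleO) that pulls `master_ip` out of an `Attributes:` line, copied from the output above, and compares it with the value that hiera would report for `ovn_dbs_vip`. The function name and the hard-coded hiera value are assumptions made for illustration.

```python
import re


def master_ip_from_attributes(line):
    """Extract the master_ip value from a pcs 'Attributes:' line."""
    match = re.search(r"\bmaster_ip=(\S+)", line)
    return match.group(1) if match else None


# Attributes line as shown in test step 1 above.
attributes = (
    "Attributes: inactive_probe_interval=180000 manage_northd=yes "
    "master_ip=172.16.2.168 nb_master_port=6641 sb_master_port=6642"
)
# Stand-in for the output of `hiera -c /etc/puppet/hiera.yaml ovn_dbs_vip`.
hiera_ovn_dbs_vip = "172.16.2.168"

print(master_ip_from_attributes(attributes) == hiera_ovn_dbs_vip)
```

An exact match avoids the looseness of grep, where the dots in an IP address act as wildcards and a longer address could match as a substring.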

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/682294

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/679244
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e3b528af4fa664fbcac0598cf1fb3d50107b58a0
Submitter: Zuul
Branch: master

commit e3b528af4fa664fbcac0598cf1fb3d50107b58a0
Author: Michele Baldessari <email address hidden>
Date: Tue Sep 10 10:16:32 2019 +0200

    Revert Add OvnDbInternal to EndpointMap and use it for ovn_db_host

    We revert I0d9eb663405d1113ea84e3c12651a3f0dbdfc75d and we instead
    export ovn_dbs_vip on all nodes so it can be used in cells. The reason
    is that we want a separate VIP for OVN because a) it is needed for
    composable roles and b) we do not want to impose the extra promote/master
    constraints on the internal_api VIP, which ends up being used by OVN.

    In the same vein as I7ca94dff4acf0816708110b9fe6f78d19dcc7b4d
    (Move redis_vip to all_nodes.j2) we will have the ovn_dbs_vip moved
    to all nodes (via I1d80587752ffca6c3eb5281aa89ea3d7cf5535ce).

    Depends-On: I1d80587752ffca6c3eb5281aa89ea3d7cf5535ce

    Change-Id: I4e4bf0a91751fb4f9e4c7233242cdc5649c421f8
    Related-Bug: #1841811

OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-tripleoclient (master)

Change abandoned by Emilien Macchi (<email address hidden>) on branch: master
Review: https://review.opendev.org/682294
Reason: We are facing a gate issue: https://bugs.launchpad.net/tripleo/+bug/1844446

To clear the gate we need to abandon this patch, and I will restore it once the gate is ready again to land patches in TripleO. Please don't touch this patch, and ask Wes or Emilien on #tripleo if you have any questions. Thanks for your patience.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.opendev.org/682294
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=5c0e05aed6aa94750beba712ecf75617a7774ca6
Submitter: Zuul
Branch: master

commit 5c0e05aed6aa94750beba712ecf75617a7774ca6
Author: Michele Baldessari <email address hidden>
Date: Mon Sep 16 10:17:21 2019 +0200

    Add ovn_dbs_vip to export_data

    Via I4e4bf0a91751fb4f9e4c7233242cdc5649c421f8 we are creating a separate
    OVN VIP (i.e. one independent from the haproxy internal_api VIP), so, as
    with redis_vip, we want to add ovn_dbs_vip to the data exported for cell
    support.

    Depends-On: I4e4bf0a91751fb4f9e4c7233242cdc5649c421f8

    Change-Id: I40e7df90fb75916dc78934850620f46faa414e74
    Related-Bug: #1841811

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/682968
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=b008b77e4a6decaecc48cbafa7d14b02900a6982
Submitter: Zuul
Branch: master

commit b008b77e4a6decaecc48cbafa7d14b02900a6982
Author: Michele Baldessari <email address hidden>
Date: Wed Sep 18 21:15:51 2019 +0200

    Fix the non-HA case where OVN has its own VIP

    When OVN has its own VIP, the non-HA case is currently broken.
    Tempest will fail when creating networks as neutron components
    will try to reach the OVN VIP (managed by keepalived)
    that never gets created:
    2019-09-18 16:41:27.780 47975 ERROR ovsdbapp.backend.ovs_idl.idlutils [-] Unable to open stream to tcp:192.168.24.12:6642 to retrieve schema: No route to host: Exception: Could not retrieve schema from tcp:192.168.24.12:6642
    2019-09-18 16:41:27.780 47687 ERROR ovsdbapp.backend.ovs_idl.idlutils [-] Unable to open stream to tcp:192.168.24.12:6642 to retrieve schema: No route to host: Exception: Could not retrieve schema from tcp:192.168.24.12:6642

    With this change we add the correct hiera key for keepalived
    to create the dedicated OVN DBS VIP.
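
A quick way to confirm the symptom in the log lines above is to test whether the OVN VIP accepts TCP connections on the southbound port. The following is a minimal, generic sketch using only the Python standard library. The helper name `can_reach` is hypothetical, and this is not how neutron or ovsdbapp performs the check.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. "No route to host",
        # "Connection refused", or a timeout) when the endpoint is down.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_reach("192.168.24.12", 6642)` would be expected to return False while the VIP is missing, matching the "No route to host" errors in the log.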

    Related-Bug: #1841811
    Change-Id: I8f7f5534ab3d053ba07d4fd56975e02b3f83eedc

OpenStack Infra (hudson-openstack) wrote : Related fix merged to puppet-tripleo (master)

Reviewed: https://review.opendev.org/672673
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=6b05849ab9db9efab011bea6d17b374eddf0c0ed
Submitter: Zuul
Branch: master

commit 6b05849ab9db9efab011bea6d17b374eddf0c0ed
Author: Michele Baldessari <email address hidden>
Date: Thu Jul 25 11:21:59 2019 +0200

    Add support for separate VIP in ovn_dbs

    We make the manifest account for the possibility that ovn_dbs might have
    its own separate VIP. This way, when I620e37117c26b5b51bf9e1eda91daeb00fdf0f43
    lands and OVN DB has its own separate VIP, it will all just work.

    We also switch the explicit ordering constraints around pacemaker
    resources to collectors, so the case of non-existent resources (like the
    separate VIP for ovn_dbs) will automatically work.

    In the case of separate VIP and stack update we also add code to remove
    the additional constraints that were imposed on the internal API VIP due
    to being part of haproxy.

    Related-Bug: #1841811
    Depends-On: I8f7f5534ab3d053ba07d4fd56975e02b3f83eedc
    Change-Id: Ic62b0fbc0fee40638811a5cd77a5dc5a4d82acf5

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/669847
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=176b30649b18f14818480ba3b6a76cfcf9f3aa26
Submitter: Zuul
Branch: master

commit 176b30649b18f14818480ba3b6a76cfcf9f3aa26
Author: Michele Baldessari <email address hidden>
Date: Wed Jul 24 08:57:36 2019 +0200

    Give the OVN DBS service a separate Vip

    This change (with its dependent reviews) creates a separate VIP for the OVN DBS
    service. A more detailed explanation can be found in https://bugs.launchpad.net/tripleo/+bug/1841811.
    The short explanation is that the OVN DBS HA service puts some additional constraints on the VIP it
    uses, and that is problematic when that VIP is used by other services (e.g. a change in OVN DBS master
    will move the VIP and will also reset all mysql connections). It also prevents us from splitting OVN DBS
    from where haproxy runs.

    Tested as follows:
    A) Deployed a master environment with this review and all its dependencies and correctly obtained
    an OVN DBS service with its own VIP and the OVN services
    (controller/metadata) pointing to this separate VIP

    B) Deployed a master environment as is and then applied this review +
    dependencies and observed that a redeploy correctly created a new VIP,
    reconfigured the services to point to the new VIP and that the old
    obsolete constraints created around the per-network VIP were removed

    Closes-Bug: #1841811

    Depends-On: Ic62b0fbc0fee40638811a5cd77a5dc5a4d82acf5
    Change-Id: I620e37117c26b5b51bf9e1eda91daeb00fdf0f43

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.3.0

This issue was fixed in the openstack/tripleo-heat-templates 11.3.0 release.
