ovn-octavia-provider associates to wrong logical router when LogicalRouterPortEvent is fired and network has multiple routers

Bug #1966052 reported by Gabriel Barazer
This bug affects 2 people
Affects   Status   Importance   Assigned to   Milestone
neutron   New      Undecided    Unassigned

Bug Description

Hi all,

When a load balancer is created with a VIP in a network that has multiple routers, the LogicalRouterPortEvent sent by OVN NB associates the floating VIP with the external ports of all the routers, which in turn causes the FIP to be announced via ARP on all of these external ports. This disrupts traffic because floating VIP traffic can then be routed to the wrong router. This can be verified by inspecting the ARP replies on the (public) network where the routers' external interfaces are located: the wrong MAC address is sent.

This is caused by the LogicalRouterPortEvent event handler (event.py:25) which in turn calls lb_create_lrp_assoc_handler (https://opendev.org/openstack/ovn-octavia-provider/src/commit/d6adbcef86e32bc7befbd5890a2bc79256b7a8e2/ovn_octavia_provider/helper.py#L237) and then sends a request to https://opendev.org/openstack/ovn-octavia-provider/src/commit/d6adbcef86e32bc7befbd5890a2bc79256b7a8e2/ovn_octavia_provider/helper.py#L248

When the event is fired for a Logical Router Port not owned by the router owning the default gateway, lb_create_lrp_assoc adds all the load balancers having a VIP in the associated network to the router by calling _update_lb_to_lr_association which uses the lr_lb_add method and updates the load balancer external ids "lr_refs" key.
This behavior can be verified by looking at the load balancer external ids and at the "load_balancer" key of the router that owns the logical router port for which the event was fired.

During a logical router port event, the load balancer ids must be added only if the router is the router owning the default gateway for the network, using the same logic found in method _find_lr_of_ls.
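
A minimal sketch of the proposed guard, written against plain Python objects rather than the real OVN NB API. The `Router` class, `owns_default_gateway` and `on_lrp_added` are hypothetical names used only for illustration; in the provider, the gateway check would presumably reuse the logic already in _find_lr_of_ls.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    # Hypothetical model: IPs configured on this router's ports, keyed by network.
    port_ips: dict = field(default_factory=dict)
    load_balancers: set = field(default_factory=set)


def owns_default_gateway(router, network, gateway_ip):
    """Return True if one of the router's ports on `network` carries the gateway IP."""
    return gateway_ip in router.port_ips.get(network, ())


def on_lrp_added(router, network, gateway_ip, network_lbs):
    """Associate the network's LBs with the router only if it is the gateway.

    Returns the set of load balancers that were newly associated.
    """
    if not owns_default_gateway(router, network, gateway_ip):
        # Not the default gateway for this network: leave the router untouched.
        return set()
    added = set(network_lbs) - router.load_balancers
    router.load_balancers |= added
    return added
```

With two routers on the same network, only the one holding the gateway IP picks up the load balancer; the event for the other router's port becomes a no-op.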

tags: added: ovn-octavia-provider
Fernando Royo (froyoredhat) wrote :

I have been doing some tests about this bug and I have come to the following conclusions:

- As you said, when the LogicalRouterPortEvent (adding an interface on N2 to R1) is triggered, the following actions take place for each LB present on the new network (N2) that is not present on the router (R1):

 1 - The unique LBs on the network (N2) are added to the LB field of the router (R1) in the OVN NB DB
 2 - The unique LBs on the network (N2) are added to the LB field of the Logical_Switch entries related to the router (R1) in the OVN NB DB
 3 - The external_ids.lr_ref field of the LB being iterated is modified by adding the affected router (R1)
 4 - The unique LBs on the router (R1) are added to the LB field of the Logical_Switch related to the network (N2) in the OVN NB DB
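
The four steps above can be sketched as a small simulation over plain dictionaries. This is only an illustration of the described behaviour under assumptions, not the provider's code: the `state` layout and the `handle_lrp_added` function are hypothetical, and each router, switch and LB is reduced to a name.

```python
def handle_lrp_added(state, router, network):
    """Simulate attaching `network` to `router`, following steps 1-4."""
    router_lbs = state["router_lbs"][router]
    # LBs present on the new network but not yet on the router.
    new_lbs = state["switch_lbs"][network] - router_lbs
    # LBs the router already had before this event.
    old_router_lbs = set(router_lbs)

    # Step 1: add the network's LBs to the router.
    router_lbs |= new_lbs

    # Step 2: add them to the switches already attached to the router.
    for other in state["router_networks"][router]:
        state["switch_lbs"][other] |= new_lbs

    # Step 3: record the router in each LB's lr_ref.
    for lb in new_lbs:
        state["lr_refs"][lb].add(router)

    # Step 4: add the router's previous LBs to the new network's switch.
    state["switch_lbs"][network] |= old_router_lbs

    state["router_networks"][router].add(network)
```

Under the proposal discussed in this report, step 3 would additionally be gated on the router being the default gateway of the network.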

So, taking that into consideration and having read your comment several times, I understand your proposal to be as follows:

- Steps 1, 2 and 4 stay the same
- Step 3 changes: the lr_ref of the LB is modified only if the router (R1) is the default gateway of the network (N2)

Am I right?

Assuming the above change, here are some examples (I hope I manage to explain myself and don't mess up :D)

case a) Different networks (nx, snx and rx created and after that LBx created)

              LB=LB1          LB=LB2
              ┌─────┐         ┌─────┐
              │ n1  │         │ n2  │
              └──┬──┘         └──┬──┘
                 │               │
                 │               │
              ┌──┴──┐         ┌──┴──┐
    LB1 ─────►│ sn1 │         │ sn2 │◄──── LB2
              └──┬──┘         └──┬──┘
 ls_ref=n1       │               │   ls_ref=n2
 lr_ref=r1       │               │   lr_ref=r2
              ┌──┴──┐         ┌──┴──┐
              │ r1  │         │ r2  │
              └─────┘         └─────┘
               LB=LB1          LB=LB2

─────────────────────────────────────────────────────────────────────────────
When we add an interface from r1 to n2, this is the final result:

               LB=LB1,LB2         LB=LB2,LB1
               ┌─────┐            ┌─────┐
               │ n1  │            │ n2  │
               └──┬──┘            └──┬──┘
                  │                  │
                  │                  │
               ┌──┴──┐            ┌──┴──┐
     LB1 ─────►│ sn1 │       ┌────┤ sn2 │◄──── LB2
               └──┬──┘       │    └──┬──┘
  ls_ref=n1       │          │       │   ls_ref=n2
  lr_ref=r1       │          │       │   lr_ref=r2
               ┌──┴──┐       │    ┌──┴──┐
               │ r1  ├───────┘    │ r2  │
               └─────┘            └─────┘
                LB=LB1,LB2         LB=LB2

case b) Same network (nx, snx and rx created and after that LBx created)

                    LB=LB1,LB2
              ┌───────────────────┐
              │         n1        │
              └──┬─────────────┬──┘
                 │             │
                 │             │
              ┌──┴──┐       ┌──┴──┐
    LB1 ─────►│ sn1 │       │ sn2 │◄──── LB2
              └──┬──┘       └──┬──┘
 ls_ref=n1       │             │   ls_ref=n1
 lr_ref=r1       │             │   lr_ref=r2
              ┌──┴──┐       ┌──┴──┐
              │ r1  │       │ r2  │
              └─────┘       └─────┘
               LB=LB1        LB=LB2
 ────────────────────────...

Fernando Royo (froyoredhat) wrote :

As I guessed, the drawing was not very clear, hehe. Here is a better one, sorry!

https://paste.openstack.org/show/814193/

Luis Tomas Bolivar (ltomasbo) wrote :

Two questions here, Gabriel:
1) If the router is not associated with the provider network, I suppose we still need to add the load balancer to that router, right? That matters in case there are members in any network connected to that router.
2) Could it be that the problem was fixed by this (for the VIP to FIP association)? https://review.opendev.org/c/openstack/neutron/+/832594

Luis Tomas Bolivar (ltomasbo) wrote :

In fact, I don't fully get the problem, or better said, the deployment.

You have 1 provider network (prov1) with 2 routers connected to it (r1 and r2), then you have subnet/network 1 connected to r1 and to r2, and then a LB VIP on n1/s1, with which you later associate a FIP. Something like:

     provider
      /    \
    r1      r2
    |      /
    |    /
    |  /
   n1/s1
    |
   LB VIP

Both routers have the provider network as default gateway in that case. And if the members are in a subnet connected to r2, you also need to add the load balancer there:

     provider
      /    \
    r1      r2
    |      /  \
    |    /     \
    |  /        |
   n1/s1      n2/s2
    |           |
   LB VIP    LB member
