[RFE] Add l2pop support for floating IP resources

Bug #1803494 reported by ChenjieXu
Affects    Status       Importance   Assigned to   Milestone
neutron    Won't Fix    Wishlist     ChenjieXu

Bug Description

The Layer 2 population (l2pop) mechanism driver is an ML2 driver that improves the overlay implementations of the open source plugins (VXLAN with Linux bridge and GRE/VXLAN with OVS)[1]. L2pop avoids broadcasts for MAC learning and ARP resolution by prepopulating the bridge forwarding tables[2]. However, l2pop does not support floating IP resources. If floating IP resources could be prepopulated, the broadcast could be avoided when two VM instances residing on different networks communicate via their respective floating IP (FIP) addresses.

Problem Description
==========================================================
Figure-1 illustrates the floating IP scenario. The IP address “182.34.4.2” on Port3 is the address of the floating IP resource. When a client in the external network accesses VM1, the destination IP “182.34.4.2” is replaced by “10.0.0.3”, the IP address of Port1 (that is, of VM1). When VM1 sends packets back to the client, the source IP “10.0.0.3” is replaced by “182.34.4.2”.

                         (Attached in the comment)
                                Figure-1
                       Port1 10.0.0.3, Port2 10.0.0.1
                       Port3 182.34.4.2, Port4 188.34.4.1

To use a floating IP, a user must first allocate one from the floating IP pool or choose a pre-allocated one. Allocating a floating IP creates a new port and then updates it. Creating a port never triggers l2pop, while updating a port triggers l2pop only if the port’s new status is ACTIVE or DOWN. However, the status of a floating IP port is always N/A. Thus allocating a floating IP does not trigger l2pop.
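
The trigger rule described above can be sketched as a small predicate. This is an illustration only: the function name `triggers_l2pop` and the string constants are assumptions made for the example, not Neutron's actual code.

```python
# Statuses for which a port update is said (in this report) to trigger l2pop.
L2POP_TRIGGER_STATUSES = {"ACTIVE", "DOWN"}


def triggers_l2pop(operation: str, new_status: str) -> bool:
    """Return True if the port operation would trigger l2pop.

    Hypothetical helper: port creation never triggers l2pop, and a port
    update triggers it only when the new status is ACTIVE or DOWN.
    """
    if operation != "update":
        return False
    return new_status in L2POP_TRIGGER_STATUSES


# The floating IP port always has status N/A, so neither step triggers l2pop.
print(triggers_l2pop("create", "N/A"))     # False
print(triggers_l2pop("update", "N/A"))     # False
print(triggers_l2pop("update", "ACTIVE"))  # True
```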

Once a floating IP is available, the user can associate it with a VM. Associating the VM’s IP address with the floating IP only updates the floating IP object; it does not update the floating IP port. Thus associating the VM with the floating IP does not trigger l2pop either.

Based on the above analysis, the FDB entry for a floating IP is never prepopulated. It might seem that we could simply change the status of Port3 from N/A to ACTIVE/DOWN to prepopulate the (MAC, IP) pair of Port3. But that does not work, because the MAC address of Port3 is never used. In the non-DVR scenario, the ARP request for a floating IP is answered with the MAC address of the router gateway. In the DVR scenario, it is answered with the MAC address of the floating IP agent gateway.
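
As a rough illustration of which MAC address answers the ARP request in each scenario, here is a minimal sketch. The function `arp_reply_mac` and its parameters are hypothetical; the MAC values are example values borrowed from Use Case 1 and Use Case 2 in this description.

```python
def arp_reply_mac(distributed: bool,
                  router_gateway_mac: str,
                  fip_agent_gateway_mac: str) -> str:
    """Pick the MAC that answers an ARP request for a floating IP.

    Hypothetical helper: non-DVR uses the router gateway MAC, DVR uses the
    floating IP agent gateway MAC. The floating IP port's own MAC is never
    used, which is why it is not a parameter here.
    """
    return fip_agent_gateway_mac if distributed else router_gateway_mac


GATEWAY_2_MAC = "fa:16:3e:ab:cf:34"      # Router-2 gateway (Use Case 1)
FIP_2_AGENT_GW_MAC = "fa:16:3e:e7:86:db"  # FIP-2 agent gateway (Use Case 2)

print(arp_reply_mac(False, GATEWAY_2_MAC, FIP_2_AGENT_GW_MAC))  # fa:16:3e:ab:cf:34
print(arp_reply_mac(True, GATEWAY_2_MAC, FIP_2_AGENT_GW_MAC))   # fa:16:3e:e7:86:db
```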

In the following use cases, an ARP request is sent out to query the MAC address for a specific floating IP. Figure-2 illustrates an environment (DVR disabled) with two network nodes and two compute nodes.

                         (Attached in the comment)
                                  Figure-2
                      Port1 10.0.0.1, Port2 198.0.0.2
                      Port5 182.34.4.5, Port6 182.34.4.6

Use Case 1
1. Tenant-1 creates Network-1 and Tenant-2 creates Network-2.
2. Tenant-1 creates Subnet-1 belonging to Network-1 and Tenant-2 creates
   Subnet-2 belonging to Network-2.
3. Tenant-1 creates Router-1 and links the Router-1 to the external provider
   network External_Provider_Network. Tenant-2 creates Router-2 and links the
   Router-2 to the same external provider network External_Provider_Network.
4. Tenant-1 links the Router-1 to the Subnet-1 and Tenant-2 links Router-2 to
   the Subnet-2.
5. Tenant-1 creates VM-1 in Subnet-1 and Tenant-2 creates VM-2 in Subnet-2.
6. Tenant-1 creates FloatingIP-1 and Tenant-2 creates FloatingIP-2.
7. Tenant-1 associates VM-1 with FloatingIP-1 and Tenant-2 associates VM-2 with
   FloatingIP-2.
8. VM-1 communicates with VM-2 via floating IP. For example: VM-1 pings
   FloatingIP-2 for the first time.

VM-1 has the IP address “10.0.0.1”, which is the IP address of Port1, and VM-2 has the IP address “198.0.0.2”, which is the IP address of Port2. FloatingIP-1 has the IP address “182.34.4.5” and FloatingIP-2 has the IP address “182.34.4.6” (the related ports are not drawn). The MAC address of the Router-1 gateway (Gateway-1) is “fa:16:3e:1b:ee:2b” and the MAC address of the Router-2 gateway (Gateway-2) is “fa:16:3e:ab:cf:34”.

VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, FloatingIP-2 is not in the same subnet as Port1, so VM-1 sends the ICMP echo request directly to its default gateway, Gateway-1. Gateway-1 first sends out an ARP request for FloatingIP-2 to resolve its MAC address, and the request is answered with the MAC address “fa:16:3e:ab:cf:34” of Gateway-2. Once L2 MAC address resolution succeeds, L3 protocols route the request to the correct destination. If the FDB is prepopulated, the ARP request between the two router gateway interfaces can be avoided. For the non-DVR use case, (host_ip, router_gateway_mac_address, floating_ip_address) should be prepopulated.
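
To make the prepopulated tuple concrete, the sketch below builds one entry per floating IP for the non-DVR case and groups the entries by host. The `FipFdbEntry` record and `group_by_host` helper are hypothetical and do not reproduce the real l2pop message format; the host IPs are placeholders, since the report does not give them.

```python
from collections import defaultdict, namedtuple

# Hypothetical record mirroring the tuple named in the text:
# (host_ip, router_gateway_mac_address, floating_ip_address).
FipFdbEntry = namedtuple(
    "FipFdbEntry", ["host_ip", "mac_address", "floating_ip_address"]
)


def group_by_host(entries):
    """Group (MAC, floating IP) pairs under the host that serves them."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.host_ip].append(
            (entry.mac_address, entry.floating_ip_address)
        )
    return dict(grouped)


entries = [
    # Host IPs are assumed placeholders (documentation address range).
    FipFdbEntry("192.0.2.11", "fa:16:3e:1b:ee:2b", "182.34.4.5"),  # Gateway-1
    FipFdbEntry("192.0.2.12", "fa:16:3e:ab:cf:34", "182.34.4.6"),  # Gateway-2
]
print(group_by_host(entries))
```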

Figure-3 illustrates an environment with DVR enabled. The use case is listed below:

Use Case 2
1. Tenant-1 creates Network-1 and Tenant-2 creates Network-2.
2. Tenant-1 creates Subnet-1 belonging to Network-1 and Tenant-2 creates
   Subnet-2 belonging to Network-2.
3. Tenant-1 creates a distributed virtual router DVR-1 and Router-1 is the
   distributed router on compute-1. Tenant-2 creates a distributed virtual
   router DVR-2 and Router-2 is the distributed router on compute-2.
4. Tenant-1 links the DVR-1 to the external provider network
   External_Provider_Network. Tenant-2 links the DVR-2 to the same external
   provider network External_Provider_Network.
5. Tenant-1 links the DVR-1 to the Subnet-1 and Tenant-2 links DVR-2 to the
   Subnet-2.
6. Tenant-1 creates VM-1 in Subnet-1 and Tenant-2 creates VM-2 in Subnet-2.
7. Tenant-1 creates FloatingIP-1 and Tenant-2 creates FloatingIP-2.
8. Tenant-1 associates VM-1 with FloatingIP-1 and Tenant-2 associates VM-2 with
   FloatingIP-2.
9. VM-1 communicates with VM-2 via floating IP. For example: VM-1 pings
   FloatingIP-2 for the first time.

                         (Attached in the comment)
                                 Figure-3

VM-1 has the IP address “10.0.0.1”, which is the IP address of Port1, and VM-2 has the IP address “190.0.0.2”, which is the IP address of Port2. FloatingIP-1 has the IP address “182.34.4.5”, which is the IP address of Port5, and FloatingIP-2 has the IP address “182.34.4.6”, which is the IP address of Port6. When a floating IP is attached to a VM, the L3 agent creates a FIP namespace (if one does not already exist) for the external network that the FIP belongs to. After step 8, the FIP namespaces FIP-1 and FIP-2 have been created. Port9 is the floating IP agent gateway of FIP-1 and Port10 is the floating IP agent gateway of FIP-2. The MAC address of Port9 is “fa:16:3e:e9:87:24” and the MAC address of Port10 is “fa:16:3e:e7:86:db”.

VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, FloatingIP-2 is not in the same subnet as Port1, so VM-1 sends the ICMP echo request directly to its default gateway, the FIP-1 floating IP agent gateway. The FIP-1 floating IP agent gateway first sends out an ARP request for FloatingIP-2 to resolve its MAC address, and the request is answered with the MAC address “fa:16:3e:e7:86:db” of the FIP-2 floating IP agent gateway. Once L2 MAC address resolution succeeds, L3 protocols route the request to the correct destination. If the FDB is prepopulated, the ARP request between the two floating IP agent gateway interfaces can be avoided. For the DVR use case, (host_ip, floatingip_agent_gateway_mac_address, floating_ip_address) should be prepopulated.
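
The effect of prepopulation on ARP resolution can be shown with a toy lookup. This is a simplified model, not how the agents are implemented: `resolve_mac` is a hypothetical helper and the dictionary merely stands in for a prepopulated neighbor table.

```python
def resolve_mac(target_ip, neighbor_table):
    """Return (mac, broadcast_needed) for target_ip.

    neighbor_table maps IP -> MAC and stands in for prepopulated entries.
    When the IP is missing, a real gateway would broadcast an ARP request;
    here that is only signalled with mac=None and broadcast_needed=True.
    """
    mac = neighbor_table.get(target_ip)
    return mac, mac is None


FLOATING_IP_2 = "182.34.4.6"

# Without prepopulation the gateway has to broadcast an ARP request.
print(resolve_mac(FLOATING_IP_2, {}))  # (None, True)

# With the FIP-2 agent gateway MAC prepopulated, no broadcast is needed.
prepopulated = {FLOATING_IP_2: "fa:16:3e:e7:86:db"}
print(resolve_mac(FLOATING_IP_2, prepopulated))  # ('fa:16:3e:e7:86:db', False)
```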

Proposed Change
==================================================================
The idea is to advertise the FDB entries for a floating IP when the FIP status changes to "ACTIVE", and to withdraw them whenever the status is set to "DOWN" or the resource is deleted or disassociated.
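
The advertise/withdraw rule can be written down as a small decision function. This is only a sketch of the rule as stated; the name `fdb_action` and its return values are made up for the example.

```python
from typing import Optional


def fdb_action(status: str,
               deleted: bool = False,
               disassociated: bool = False) -> Optional[str]:
    """Decide what to do with a floating IP's FDB entries.

    Hypothetical helper: withdraw on DOWN, delete or disassociate;
    advertise on ACTIVE; otherwise do nothing.
    """
    if deleted or disassociated or status == "DOWN":
        return "withdraw"
    if status == "ACTIVE":
        return "advertise"
    return None


print(fdb_action("ACTIVE"))                      # advertise
print(fdb_action("DOWN"))                        # withdraw
print(fdb_action("ACTIVE", disassociated=True))  # withdraw
print(fdb_action("ERROR"))                       # None (any other status)
```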

A function _notify_fip_status will be added to send an event after a floating IP is updated. Function _delete_floatingip should be modified to use _notify_fip_status to report the floating IP status. Function _update_fip_assoc is used by both _create_floatingip and _update_floatingip, and both of these send an event containing the result of _update_fip_assoc after updating the floating IP. Thus we can simply insert the floating IP status into the result of _update_fip_assoc. Function disassociate_floatingips already sends an event after updating floating IPs. However, after floating IPs are disassociated, router_id and fixed_port_id are both None. Thus the floating IP status, last_known_router_id and last_fixed_port_id need to be inserted into the event.
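
The reason for carrying last_known_router_id and last_fixed_port_id can be seen in a small sketch of the event payload. The helper `build_fip_event` and the dictionary layout are assumptions for illustration; the actual payload built in l3_db.py may look different.

```python
from typing import Optional


def build_fip_event(fip: dict, previous: Optional[dict] = None) -> dict:
    """Build an illustrative event payload for a floating IP update.

    `previous` is the floating IP as it was before the update. After a
    disassociation, router_id and fixed_port_id are None in `fip`, so the
    earlier values are carried along under the last_* keys.
    """
    event = {
        "floating_ip_address": fip["floating_ip_address"],
        "router_id": fip.get("router_id"),
        "fixed_port_id": fip.get("fixed_port_id"),
        "status": fip.get("status"),
    }
    if previous is not None:
        event["last_known_router_id"] = previous.get("router_id")
        event["last_fixed_port_id"] = previous.get("fixed_port_id")
    return event


# Identifiers below are placeholders, not real UUIDs.
before = {"floating_ip_address": "182.34.4.5", "router_id": "router-1",
          "fixed_port_id": "port-1", "status": "ACTIVE"}
after = {"floating_ip_address": "182.34.4.5", "router_id": None,
         "fixed_port_id": None, "status": "DOWN"}

print(build_fip_event(after, previous=before))
```

Without the last_* keys, a subscriber receiving the disassociation event would have no way to tell which router or fixed port the withdrawn entry belonged to.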

A class L3RouterL2PopMixin should be added to process the event sent after a floating IP is updated. This class should subscribe to the event and register a callback with the l2pop mechanism driver to extend the l2pop FDBs (depends on RFE: https://bugs.launchpad.net/neutron/+bug/1793653). Thus a callback l3_fdb_extend_func should be added. A function handle_fip_status_update should be added to process the event and send out the l2pop FDBs. For non-DVR, (host_ip, router_gateway_mac_address, floating_ip_address) should be prepopulated. For DVR, (host_ip, floatingip_agent_gateway_mac_address, floating_ip_address) should be prepopulated. In addition, some helper functions used by the functions described above should be added.
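
A self-contained sketch of how such a mixin could be wired up is shown below. It is not the code from the linked reviews: `CallbackRegistry` is a tiny stand-in for Neutron's callback mechanism, the event name, payload keys and method signatures are assumptions, and FDB messages are recorded in a list instead of being sent over RPC.

```python
from collections import defaultdict


class CallbackRegistry:
    """Tiny in-process stand-in for an event registry (not neutron_lib)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)

    def publish(self, event, payload):
        for callback in self._subscribers[event]:
            callback(payload)


class L3RouterL2PopMixin:
    """Sketch: turn floating IP status events into FDB add/remove messages."""

    FIP_EVENT = "floatingip_status_update"  # hypothetical event name

    def __init__(self, registry, distributed=False):
        self.distributed = distributed
        self.sent = []     # records (action, entry) instead of doing RPC
        self._active = {}  # floating IP address -> advertised entry
        registry.subscribe(self.FIP_EVENT, self.handle_fip_status_update)

    def _fdb_entry(self, payload):
        # DVR uses the FIP agent gateway MAC, non-DVR the router gateway MAC.
        mac_key = ("floatingip_agent_gateway_mac_address" if self.distributed
                   else "router_gateway_mac_address")
        return (payload["host_ip"], payload[mac_key],
                payload["floating_ip_address"])

    def handle_fip_status_update(self, payload):
        entry = self._fdb_entry(payload)
        fip = payload["floating_ip_address"]
        if payload["status"] == "ACTIVE":
            self._active[fip] = entry
            self.sent.append(("add", entry))
        else:
            self._active.pop(fip, None)
            self.sent.append(("remove", entry))

    def l3_fdb_extend_func(self, fdb_entries):
        """Return fdb_entries extended with the advertised FIP entries."""
        return list(fdb_entries) + list(self._active.values())


registry = CallbackRegistry()
mixin = L3RouterL2PopMixin(registry, distributed=False)
payload = {
    "status": "ACTIVE",
    "host_ip": "192.0.2.12",  # assumed placeholder, not from the report
    "router_gateway_mac_address": "fa:16:3e:ab:cf:34",
    "floating_ip_address": "182.34.4.6",
}
registry.publish(L3RouterL2PopMixin.FIP_EVENT, payload)
print(mixin.sent)
print(mixin.l3_fdb_extend_func([]))
```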

All changes can be viewed through the links below:
https://review.openstack.org/#/c/611261/
https://review.openstack.org/#/c/611284/

Data Model Impact
============================================================
None

REST API Impact
============================================================
None

Command Line Client Impact
============================================================
None

Other Impact
============================================================
None

Other Deployer Impact
============================================================
None

Performance Impact
============================================================
Performance testing should be conducted to measure the overhead of adding more information to the FDB.

Implementation
Assignee(s)

Work Items
===========================================================
Add function _notify_fip_status to neutron/db/l3_db.py and modify some existing functions in l3_db.py.
Add class L3RouterL2PopMixin to neutron/services/l3_router/service_providers/l2pop.py
Add related tests.

Dependencies
============================================================
This RFE depends on the RFE “Enable other subprojects to extend l2pop FDB information” being merged in OpenStack. That RFE enables other subprojects to extend l2pop FDB information, which is the ability needed here to add floating IP related information to the l2pop FDB. The link to that RFE is below:
https://bugs.launchpad.net/neutron/+bug/1793653

Testing
============================================================
Unit tests are necessary.

Documentation Impact
============================================================
None.

References
============================================================
[1] https://github.com/openstack/neutron/tree/master/neutron/plugins/ml2/drivers/l2pop
[2] https://wiki.openstack.org/wiki/L2population_blueprint

ChenjieXu (midone)
Changed in neutron:
assignee: nobody → ChenjieXu (midone)
description: updated
description: updated
ChenjieXu (midone) wrote :
ChenjieXu (midone) wrote :
ChenjieXu (midone) wrote :
ChenjieXu (midone)
description: updated
Miguel Lavalle (minsel)
tags: added: rfe
Changed in neutron:
importance: Undecided → Wishlist
Miguel Lavalle (minsel) wrote :

Maybe I am misunderstanding, but I am having difficulty understanding the need for this RFE:

1) In both use cases described above, you state that "When VM-1 pings FloatingIP-2 for the first time, it needs to know the MAC address for FloatingIP-2. Thus ARP request is sent out". Since FloatingIP-2 is not in the same subnet as Port1, wouldn't the VM just send the ICMP echo request directly to its default gateway and let L3 level protocols route the request to the correct destination?

2) The summary of your proposal is "The idea is that advertising the FDBs for floating IP when the FIP status changes to "ACTIVE" and withdraw the FDBs for floating IP whenever the status is set to "DOWN" or the resource is deleted or disassociated." Aren't we mixing L2 and L3 concepts here unnecessarily? The FDB entries in L2pop are meant to optimize the communication at L2. Floating IPs should be handled by L3. Am I missing something?

Allain Legacy (alegacy) wrote :

Miguel,

1) The need to update the L2POP information with the Floating IP information is to eliminate broadcast packets between the two router gateway interfaces rather than between the VM and its local router interface. If VM1 sends a packet to VM2's FIP, that packet will be handled by the virtual router (R1) attached to the external network. If VM2's FIP is owned by a different virtual router (R2) on the same external network, then there will be an ARP packet sent from R1 to FIP2. If there is no L2POP information for FIP2, then an ARP packet will be broadcast.

2) FIP resources are applicable to both L2 and L3 because an external gateway will first send out an ARP request for a FIP to resolve its MAC address. L3 routing to the FIP is only possible once L2 MAC address resolution succeeds. Therefore, when a FIP is associated the L2POP information sent out to nodes must be updated. Similarly, when a FIP is disassociated or deleted then the L2POP information must also be updated.

ChenjieXu (midone) wrote :

Miguel,

Allain is the original author of this patch, so I asked him to comment on your questions. While discussing with him, I realized that I had mistakenly thought the ARP request sent out by VM-1 was the one being avoided. According to Allain's comment, however, it is the broadcast between the two router gateways that is avoided. Because of my misunderstanding, the use cases in the RFE are not correct, and I will update them as soon as possible. Sorry for my misunderstanding!

ChenjieXu (midone)
description: updated
ChenjieXu (midone) wrote :

Miguel,

The RFE has been updated. I only changed the steps and analysis in the two use cases. The changes are below:

Use Case 1
3. Tenant-1 creates Router-1 and links the Router-1 to the external provider network. Tenant-2 creates Router-2 and links the Router-2 to the external provider network.
======================================================>
3. Tenant-1 creates Router-1 and links the Router-1 to the external provider network External_Provider_Network. Tenant-2 creates Router-2 and links the Router-2 to the same external provider network External_Provider_Network.

VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, it needs to know the MAC address for FloatingIP-2. Thus ARP request is sent out. And the MAC address “fa:16:3e:ab:cf:34” of Gateway-2 will be answered. If the FDB is pre-populated, the ARP request can be avoided. For non-DVR use case, (host_ip, router_gateway_mac_address, floating_ip_address) should be prepopulated.
======================================================>
VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, since FloatingIP-2 is not in the same subnet as Port1, VM-1 will send the ICMP echo request directly to its default gateway which is Gateway-1. And Gateway-1 will first send out an ARP request for FloatingIP-2 to resolve its MAC address. And the MAC address “fa:16:3e:ab:cf:34” of Gateway-2 will be answered. After L2 MAC address resolution succeeds, L3 level protocols will route the request to the correct destination. If the FDB is pre-populated, the ARP request between the two router gateway interfaces can be avoided. For non-DVR use case, (host_ip, router_gateway_mac_address, floating_ip_address) should be prepopulated.

Use Case 2
4. Tenant-1 links the DVR-1 to the external provider network. Tenant-2 links the DVR-2 to the external provider network.
======================================================>
4. Tenant-1 links the DVR-1 to the external provider network External_Provider_Network. Tenant-2 links the DVR-2 to the same external provider network External_Provider_Network.

VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, it needs to know the MAC address for FloatingIP-2. Thus ARP request is sent out. And the MAC address “fa:16:3e:e7:86:db” of FIP-2 floating IP agent gateway will be answered. If the FDB is prepopulated, the ARP request can be avoided. For DVR use case, (host_ip, floatingip_agent_gateway_mac_address, floating_ip_address) should be prepopulated.
======================================================>
VM-1 and VM-2 reside on different networks and communicate via their respective floating IP addresses. When VM-1 pings FloatingIP-2 for the first time, since FloatingIP-2 is not in the same subnet as Port1, VM-1 will send the ICMP echo request directly to its default gateway which is FIP-1 floating IP agent gateway. And FIP-1 floating IP agent gateway will first send out an ARP ...


Miguel Lavalle (minsel) wrote :

Allain, Chenjie,

Thanks for your responses. I have a few comments and further clarifying questions:

1) Yes, I think we all agree that for L3 to work, several ARP resolutions need to happen across the several L2 broadcast domains involved in the end to end communication at L3 level. However, we also need to agree that a pair of an ARP request and its ARP response are local in a particular L2 broadcast domain. Right?

2) Are you saying that in your use cases the "external provider network" is a VXLAN / GRE/ Geneve tunnel network and this is the reason L2pop is involved? How do you route this "external provider network" to the outside?

Slawek Kaplonski (slaweq) wrote :

I have the impression here, as Miguel mentioned in his last comment, that this could only be useful if your provider network were a VXLAN/GRE/Geneve network, which isn't a common use case, right?
When the provider network is of VLAN or flat type, your ICMP packet from VM1 to the FIP of VM2 would go like:

VM1 --- L2 network --- Router 1 ----- External network ----- Router 2 ----- L2 network --- VM2

and you want to use L2 population "between" the routers, that is, in this "External network". If so, it will not work for VLAN/flat networks, because L2pop takes care of configuring which tunnel packets are sent to, in order to avoid sending them to all tunnels. Am I right or am I misunderstanding something here?

ChenjieXu (midone) wrote :

Miguel,

Thanks for your response! For your questions:

1) Yes, the pair of an ARP request and its ARP response are local in a particular L2 broadcast domain. To guarantee this, in Use Case 1, Router-1 and Router-2 should be linked to the same external provider network. In Use Case 2, DVR-1 and DVR-2 should be linked to the same external provider network.

2) Yes, in my use cases, the "external provider network" is a VXLAN network. As for the question "How do you route this "external provider network" to the outside": because the external provider network maps to the outside network, I tried to set up a VXLAN network environment (with OVS as the VTEP) and add a VXLAN port on br-ex to route traffic from OpenStack to the VXLAN network environment. But it didn't work. So I sent an email to Allain asking for advice on how to set up the environment.

ChenjieXu (midone) wrote :

Slawek,

Thanks for your response! I think you are right and this RFE should be useful only in the cases that external provider network is VXLAN/GRE/Geneve network.

Slawek Kaplonski (slaweq) wrote :

Hi,

Thanks for the confirmation. Do you know how common it is to use a VXLAN/GRE/Geneve network as the external network? I don't have such data, and I'm not sure it is worth introducing new complexity to the code for something that isn't used by many users.
But I also think we can discuss that at the drivers meeting, especially now that we have clarification :)

ChenjieXu (midone) wrote :

Slawek,

Thanks for your proposal! Sorry, I don't know how common it is to use a VXLAN/GRE/Geneve network as the external network. I tried to set up a VXLAN network environment (with OVS as the VTEP) and add a VXLAN port on br-ex to route traffic from OpenStack to the VXLAN network environment. But it didn't work. So I sent an email to Allain asking for advice on how to set up the environment.

ChenjieXu (midone) wrote :

Miguel/Slawek,

Allain confirmed that this RFE doesn't apply to Use Cases 1 and 2 because an external provider network can only be FLAT/VLAN. This RFE is only valid within the BGP-EVPN use case, which is currently not being pursued for upstreaming from the corresponding starlingx-staging projects (stx-neutron-dynamic-routing and stx-networking-bgpvpn). Therefore, until that feature is required and an attempt is made to have it accepted by the OpenStack community, it cannot be used as a justification for this RFE. This RFE is being abandoned for now and can be revived if the BGP-EVPN feature becomes a priority. Thank you so much for reviewing this RFE!

Miguel Lavalle (minsel) wrote :

Thanks for the update and the submission

tags: added: rfe-postponed
removed: rfe
Changed in neutron:
status: New → Won't Fix
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by ChenjieXu (<email address hidden>) on branch: master
Review: https://review.openstack.org/611261
Reason: The related RFE has been abandoned and can be revived if the BGP-EVPN feature becomes a priority.

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by ChenjieXu (<email address hidden>) on branch: master
Review: https://review.openstack.org/611284
Reason: The related RFE has been abandoned and can be revived if the BGP-EVPN feature becomes a priority.
