[RFE] Limit Geneve to within Neutron availability zones

Bug #1808594 reported by Dan Sneddon
Affects         Status     Importance  Assigned to    Milestone
networking-ovn  Confirmed  Undecided   Unassigned     -
neutron         New        Wishlist    usha veepuri   -

Bug Description

Creating multiple Neutron availability zones allows the operator to schedule DHCP and L3 agents within a single AZ. Neutron OVN will still try to form a Geneve mesh between all nodes in all availability zones, which creates inter-AZ dependencies and may not work when strict firewalls are placed between AZs.

Note that this RFE is a clone of https://bugs.launchpad.net/neutron/+bug/1808062 but applies to Neutron OVN instead of ML2/OVS.

This behavior should be configurable, so that L2 may be limited to a particular AZ, and no tunnels are formed between different AZs. This would prevent Neutron from trying to form tunnels that cannot function, and may enhance security when AZs are in different security zones.
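As a rough illustration, such a knob could be registered with oslo.config. The option name az_local_tunnels and its placement in the [ml2] group below are hypothetical, not an existing Neutron option; this is only a sketch of what the configuration surface might look like:

    # Hypothetical option; the name and group are illustrative only.
    from oslo_config import cfg

    az_opts = [
        cfg.BoolOpt('az_local_tunnels',
                    default=False,
                    help='When true, form overlay tunnels (Geneve/VXLAN) '
                         'only between nodes in the same availability '
                         'zone.'),
    ]

    cfg.CONF.register_opts(az_opts, group='ml2')

Defaulting the flag to False would preserve today's full-mesh behavior for existing deployments.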

The desired end-state configuration would have separate DHCP and L3 agents hosted in each AZ, along with tunnels formed only inside the AZ. This would allow, for instance, multiple edge sites within a single deployment, each performing local networking only. Any particular Neutron network would be limited to one AZ. A new flag would allow AZs to be truly autonomous and remove cross-AZ dependencies.

Note that NSX-T appears to have a concept called "Transport Zones" that provides the functionality being requested here. Compute nodes within a given transport zone will only be able to communicate with compute nodes within that same transport zone. This prevents network traffic from being sent between zones. More information here:

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.3/com.vmware.nsxt.install.doc/GUID-F47989B2-2B9D-4214-B3BA-5DDF66A1B0E6.html

NSX-T also supports Availability Zones, but it appears that those are separate from the Transport Zone functionality:

https://docs.vmware.com/en/VMware-Integrated-OpenStack/5.0/com.vmware.openstack.admin.doc/GUID-37F0E9DE-BD19-4AB0-964C-D1D12B06345C.html

It's possible that limiting tunneling traffic to a particular AZ may be outside the intended functions of Neutron AZs, but I think this is a valid use case.

tags: added: rfe
Changed in neutron:
assignee: nobody → usha veepuri (usha-veepuri)
Revision history for this message
Miguel Lavalle (minsel) wrote :

@usha veepuri,

Please note that this RFE has not been approved for implementation, and even if it is, it may require a spec. I strongly recommend holding off on any code implementation effort, unless you intend it as a PoC and are willing to discard it if the RFE doesn't get approved.

Revision history for this message
Miguel Lavalle (minsel) wrote :

@Dan,

Is this RFE scoped exclusively to OVN?

Revision history for this message
Dan Sneddon (dsneddon) wrote :

@minsel,

The original RFE was filed against ML2/OVS with VXLAN, and this is a clone of that RFE for Geneve on OVN. However, I would hope for similar behavior with any mechanism driver that used tunneling, whether Geneve, VXLAN, or STT for that matter.

If the ML2 plugin agent will be doing the filtering, then I think we would need separate RFEs for each project, and this RFE is only for OVN.

It appears to me from reading some ML2 plugin code that the list of tunnel peers is obtained via RPC. If it's possible to limit the list of tunnel peers that is sent to the ML2 plugin agent, or if we could fail to bind a port if a compute is in the wrong AZ, I think perhaps that could be done in a way that worked for multiple ML2 plugins. Someone already suggested doing filtering in the l2_pop driver, but l2_pop doesn't work in all deployment scenarios.

I can think of several ways to implement this, which can be discussed in a spec:

Method 1) A global Neutron flag limiting traffic to within AZs. When set, compute nodes would only form tunnels with other computes in the same AZ. If it were possible to limit the list of remote compute nodes sent via RPC (one queue per AZ?), perhaps this could be implemented in a way that worked for multiple Neutron drivers. This wouldn't prevent binding two ports on the same network in different AZs, but the computes would only be able to pass East-West traffic within their local AZ (and to the L3 and DHCP agents for the network).
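A minimal sketch of the Method 1 filtering, written as a pure function. The peer-list shape and the get_host_az() host-to-AZ lookup are assumptions for illustration, not the actual RPC payload:

    def filter_tunnel_peers(requesting_host, all_peers, get_host_az,
                            az_local_tunnels=False):
        """Return only the tunnel peers in the requester's AZ.

        all_peers is assumed to look like
        [{'host': 'compute-2', 'ip_address': '172.16.0.12'}, ...].
        """
        if not az_local_tunnels:
            return all_peers  # flag unset: keep the full-mesh behavior
        local_az = get_host_az(requesting_host)
        return [peer for peer in all_peers
                if get_host_az(peer['host']) == local_az]

    # Toy example: compute-1 and compute-2 share az1, compute-3 is in az2.
    host_az = {'compute-1': 'az1', 'compute-2': 'az1',
               'compute-3': 'az2'}.get
    peers = [{'host': 'compute-2', 'ip_address': '172.16.0.12'},
             {'host': 'compute-3', 'ip_address': '172.16.1.13'}]
    print(filter_tunnel_peers('compute-1', peers, host_az,
                              az_local_tunnels=True))
    # -> only the compute-2 entry remains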

Method 2) One-way association between network and availability zone. A network could be assigned to one particular AZ, and would only work within that AZ. Networks that were not associated with a particular AZ would function as normal and could exist in all AZs. This would work for most use cases, but would require networks to be assigned to AZs in the DB. Perhaps binding a port would fail if the compute were not in the specified AZ.
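The port-binding refusal in Method 2 reduces to a simple comparison. The 'availability_zone' network attribute and the host_az() lookup below are hypothetical; in a real implementation this check would presumably live in a mechanism driver's binding path:

    def can_bind(network, host, host_az):
        """Refuse to bind if the network is pinned to a different AZ.

        network is a dict with a hypothetical 'availability_zone' key;
        host_az maps a hostname to its AZ.
        """
        pinned_az = network.get('availability_zone')
        if pinned_az is None:
            return True  # unpinned networks behave as they do today
        return host_az(host) == pinned_az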

Method 3) Many-to-many association between availability zones and networks. A network could be assigned to more than one AZ, and a compute could only bind to that network if it were in one of the assigned AZs. This would require a network-to-AZ many-to-many association in the DB, and agents would need to be aware of this mapping.
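The DB side of Method 3 could be a simple association table. The table and column names in this SQLAlchemy sketch are hypothetical, not an existing Neutron schema:

    import sqlalchemy as sa
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class NetworkAZBinding(Base):
        """One row per (network, allowed AZ) pair.

        A network with several rows is usable in several AZs; a network
        with no rows stays unrestricted, preserving current behavior.
        """
        __tablename__ = 'network_az_bindings'

        network_id = sa.Column(
            sa.String(36),
            sa.ForeignKey('networks.id', ondelete='CASCADE'),
            primary_key=True)
        availability_zone = sa.Column(sa.String(255), primary_key=True)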

For reference, I think this is where that filtering would be relevant if it were done in the openvswitch-agent:

https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1843

And I think this is where the filtering would be relevant if it were done in the l2_pop driver:

https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/l2pop/rpc_manager/l2population_rpc.py#L310

Changed in networking-ovn:
status: New → Confirmed
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

This RFE is a clone of https://bugs.launchpad.net/neutron/+bug/1808062, which was created for VXLAN networks and has already been approved by the drivers team.
So I will mark this one as approved too, based on the approval of https://bugs.launchpad.net/neutron/+bug/1808062.

Changed in neutron:
importance: Undecided → Wishlist
tags: added: ovn rfe-approved
removed: rfe