Removing a subnet from DVR router also removes DVR MAC flows for other router on network

Bug #1838699 reported by Arjun Baindur
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Confirmed | High | Unassigned |

Bug Description

This bug builds on the issue seen in https://bugs.launchpad.net/neutron/+bug/1838697

In that issue, if you create a tenant network and some VMs, then attach the network to 2 DVR routers, the DVR MAC rules exist only for the first router.

With this issue, simply removing the subnet from the second router, or deleting the second router, ends up deleting all the DVR MAC flows for the first router. Both the table=1 and table=60 rules are deleted for ALL local endpoints on that network.

For example:

fa:16:3e:ce:f8:cd = MAC of a VM on this particular host
fa:16:3e:5c:44:da = MAC of router_interface_distributed port of 1st router
fa:16:3e:19:67:9e = MAC of router_interface_distributed port on 2nd router

When a simple network is attached to 2 routers:

[<email address hidden> arjun(admin)]# openstack port list --network 8cd0e19e-9041-4a62-9cc9-6bfb5b10f955 --long
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+
| ID | Name | MAC Address | Fixed IP Addresses | Status | Security Groups | Device Owner | Tags |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+
| 16e971ae-0ce9-4f4a-aaab-6ab3fc71bf93 | | fa:16:3e:79:66:c8 | ip_address='10.23.23.9', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 1ef66f53-7818-4281-b407-9be7d55b3b17 | | fa:16:3e:ce:f8:cd | ip_address='10.23.23.7', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 21553560-5491-4036-9d03-65d7bedb28dc | | fa:16:3e:0a:ff:1b | ip_address='10.23.23.2', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| 386d3d98-6c86-4748-9c2e-8b60fbe3f6cc | | fa:16:3e:c9:19:14 | ip_address='10.23.23.25', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 4e211475-91e0-4627-8342-837210219fbc | | fa:16:3e:19:67:9e | ip_address='10.23.23.199', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | ecd04202-0111-4e29-8e2f-39a203123c75 | network:router_interface_distributed | |
| 7be10a79-e581-4ba9-95c9-870e845dbea0 | | fa:16:3e:0b:9b:e3 | ip_address='10.23.23.28', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| be9d8d83-0c55-49aa-836e-bb4f483bde48 | | fa:16:3e:21:76:67 | ip_address='10.23.23.4', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| d266f85c-14b1-4c47-a357-44cd0fa4b557 | | fa:16:3e:c4:f0:ce | ip_address='10.23.23.3', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| de2fb0b6-9756-4418-8501-be202afbf006 | | fa:16:3e:e7:f6:6c | ip_address='10.23.23.14', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f00fa134-da4d-4663-8d94-52de0840f9d4 | | fa:16:3e:2e:3c:8a | ip_address='10.23.23.5', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f33d9ba4-cfdc-42f3-aff4-e5221f84ac03 | | fa:16:3e:c9:86:97 | ip_address='10.23.23.6', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f763ba3f-fae2-4608-8ef9-10ccc023eacc | | fa:16:3e:5c:44:da | ip_address='10.23.23.1', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:router_interface_distributed | |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+

[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:ce:f8:cd
 cookie=0xbdf055421ffc2398, duration=222.793s, table=1, n_packets=0, n_bytes=0, idle_age=1843, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=mod_dl_src:fa:16:3e:5c:44:da,resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=1838.019s, table=25, n_packets=103, n_bytes=10155, idle_age=1826, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=222.788s, table=60, n_packets=4, n_bytes=864, idle_age=1826, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=strip_vlan,output:2350
[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:19:67:9e
[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:5c:44:da
 cookie=0xbdf055421ffc2398, duration=256.770s, table=1, n_packets=0, n_bytes=0, idle_age=1877, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=mod_dl_src:fa:16:3e:5c:44:da,resubmit(,60)
[root@chef ~]#
[root@chef ~]#

Now I remove the subnet from the second router (the one with the .199 IP):

[<email address hidden> arjun(admin)]# openstack router remove subnet cbe3180b-745e-44f9-a2a2-15bb548cc281 f012101e-91ac-4b85-947e-0f9eca83d5e8

[<email address hidden> arjun(admin)]# openstack port list --network 8cd0e19e-9041-4a62-9cc9-6bfb5b10f955 --long
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+
| ID | Name | MAC Address | Fixed IP Addresses | Status | Security Groups | Device Owner | Tags |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+
| 16e971ae-0ce9-4f4a-aaab-6ab3fc71bf93 | | fa:16:3e:79:66:c8 | ip_address='10.23.23.9', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 1ef66f53-7818-4281-b407-9be7d55b3b17 | | fa:16:3e:ce:f8:cd | ip_address='10.23.23.7', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 21553560-5491-4036-9d03-65d7bedb28dc | | fa:16:3e:0a:ff:1b | ip_address='10.23.23.2', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| 386d3d98-6c86-4748-9c2e-8b60fbe3f6cc | | fa:16:3e:c9:19:14 | ip_address='10.23.23.25', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| 7be10a79-e581-4ba9-95c9-870e845dbea0 | | fa:16:3e:0b:9b:e3 | ip_address='10.23.23.28', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| be9d8d83-0c55-49aa-836e-bb4f483bde48 | | fa:16:3e:21:76:67 | ip_address='10.23.23.4', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| d266f85c-14b1-4c47-a357-44cd0fa4b557 | | fa:16:3e:c4:f0:ce | ip_address='10.23.23.3', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:dhcp | |
| de2fb0b6-9756-4418-8501-be202afbf006 | | fa:16:3e:e7:f6:6c | ip_address='10.23.23.14', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f00fa134-da4d-4663-8d94-52de0840f9d4 | | fa:16:3e:2e:3c:8a | ip_address='10.23.23.5', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f33d9ba4-cfdc-42f3-aff4-e5221f84ac03 | | fa:16:3e:c9:86:97 | ip_address='10.23.23.6', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | bd5274ad-2ff9-443a-9226-473cf129e915 | compute:None | |
| f763ba3f-fae2-4608-8ef9-10ccc023eacc | | fa:16:3e:5c:44:da | ip_address='10.23.23.1', subnet_id='f012101e-91ac-4b85-947e-0f9eca83d5e8' | ACTIVE | | network:router_interface_distributed | |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+--------------------------------------+--------------------------------------+------+

As you can see, all the DVR MAC flows (table=1 and table=60) are missing, even for the router that I didn't detach:

[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:ce:f8:cd
 cookie=0xbdf055421ffc2398, duration=7970.777s, table=25, n_packets=103, n_bytes=10155, idle_age=7958, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:ce:f8:cd
 cookie=0xbdf055421ffc2398, duration=8249.547s, table=25, n_packets=103, n_bytes=10155, idle_age=8237, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:5c:44:da
[root@chef ~]#

Searching by MAC of a local VM:

BEFORE:

[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:ce:f8:cd
 cookie=0xbdf055421ffc2398, duration=222.793s, table=1, n_packets=0, n_bytes=0, idle_age=1843, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=mod_dl_src:fa:16:3e:5c:44:da,resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=1838.019s, table=25, n_packets=103, n_bytes=10155, idle_age=1826, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=222.788s, table=60, n_packets=4, n_bytes=864, idle_age=1826, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=strip_vlan,output:2350

AFTER DELETING 2ND ROUTER:

[root@chef ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:ce:f8:cd
 cookie=0xbdf055421ffc2398, duration=8407.972s, table=25, n_packets=103, n_bytes=10155, idle_age=8396, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
[root@chef ~]#

This basically kills all east-west L3 traffic for this network, even via Router1, which we did not touch. Restarting the OVS agent fixes this issue.
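
The BEFORE/AFTER comparison above can also be checked mechanically. The following is a minimal sketch, not part of Neutron: the helper names `tables_with_mac` and `missing_dvr_tables` are made up for illustration, and the assumption that table=1 and table=60 are the tables to look for comes only from the flows shown in this report. It parses `ovs-ofctl dump-flows` text and lists the tables that still carry a flow mentioning a given MAC.

```python
import re

VM_MAC = "fa:16:3e:ce:f8:cd"

# Flow lines copied from the BEFORE and AFTER transcripts in this report.
BEFORE = """\
 cookie=0xbdf055421ffc2398, duration=222.793s, table=1, n_packets=0, n_bytes=0, idle_age=1843, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=mod_dl_src:fa:16:3e:5c:44:da,resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=1838.019s, table=25, n_packets=103, n_bytes=10155, idle_age=1826, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
 cookie=0xbdf055421ffc2398, duration=222.788s, table=60, n_packets=4, n_bytes=864, idle_age=1826, priority=4,dl_vlan=13,dl_dst=fa:16:3e:ce:f8:cd actions=strip_vlan,output:2350
"""

AFTER = """\
 cookie=0xbdf055421ffc2398, duration=8407.972s, table=25, n_packets=103, n_bytes=10155, idle_age=8396, priority=2,in_port=2350,dl_src=fa:16:3e:ce:f8:cd actions=resubmit(,60)
"""


def tables_with_mac(dump_text, mac):
    """Return the sorted table numbers of all flows that mention `mac`."""
    tables = set()
    for line in dump_text.splitlines():
        if mac.lower() not in line.lower():
            continue
        # Matches the "table=N" field only; "resubmit(,60)" has no "table=".
        match = re.search(r"\btable=(\d+)", line)
        if match:
            tables.add(int(match.group(1)))
    return sorted(tables)


def missing_dvr_tables(dump_text, mac, expected=(1, 60)):
    """Return the expected tables (assumed: 1 and 60) with no flow for `mac`."""
    present = set(tables_with_mac(dump_text, mac))
    return [table for table in expected if table not in present]


if __name__ == "__main__":
    print("before:", tables_with_mac(BEFORE, VM_MAC), missing_dvr_tables(BEFORE, VM_MAC))
    print("after: ", tables_with_mac(AFTER, VM_MAC), missing_dvr_tables(AFTER, VM_MAC))
```

On a live host the same functions could be fed the output of `ovs-ofctl dump-flows br-int` instead of the pasted strings.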

Changed in neutron:
status: New → Confirmed
tags: added: l3-dvr-backlog
Changed in neutron:
importance: Undecided → Medium
importance: Medium → High
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

I think we have a problem in the OVS DVR agent but I would like some confirmation on this.

Both "install_dvr_to_src_mac" and "delete_dvr_to_src_mac" create/delete flows that match on the VLAN tag and the dst MAC. The problem we have here is that if a network has more than one DVR router, the flows for those DVR routers have the same match conditions. If one router is deleted, the matching flows in br_int are deleted for all routers. This is the bug described here.

Am I correct?

Delete flow: https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py#L172

Regards.
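
To illustrate the hypothesis in the comment above, here is a toy model of a flow table keyed only by (table, VLAN tag, dst MAC). It is a sketch under stated assumptions, not the Neutron code: `ToyFlowTable` and `simulate_second_router_detach` are hypothetical names, the two method names are borrowed from the comment but their signatures are simplified, and the table numbers and actions are taken from the flows shown in the bug description. Because the match carries no router identity, a delete issued while detaching the second router removes the flows that were installed for the first router.

```python
VM_MAC = "fa:16:3e:ce:f8:cd"       # local VM from the report
ROUTER1_MAC = "fa:16:3e:5c:44:da"  # router_interface_distributed port, 1st router


class ToyFlowTable:
    """Toy stand-in for br-int: flows are keyed by their match fields only."""

    def __init__(self):
        self.flows = {}  # (table, vlan, dst_mac) -> actions string

    def install_dvr_to_src_mac(self, vlan, gateway_mac, dst_mac, dst_port):
        # Simplified, hypothetical signature; the real method may differ.
        self.flows[(1, vlan, dst_mac)] = f"mod_dl_src:{gateway_mac},resubmit(,60)"
        self.flows[(60, vlan, dst_mac)] = f"strip_vlan,output:{dst_port}"

    def delete_dvr_to_src_mac(self, vlan, dst_mac):
        # The match has no notion of which router owns the flow, so this
        # removes the flows regardless of who installed them.
        for table in (1, 60):
            self.flows.pop((table, vlan, dst_mac), None)


def simulate_second_router_detach():
    """Install flows for router 1, then delete on behalf of router 2."""
    br_int = ToyFlowTable()
    br_int.install_dvr_to_src_mac(
        vlan=13, gateway_mac=ROUTER1_MAC, dst_mac=VM_MAC, dst_port=2350
    )
    before = dict(br_int.flows)
    # Detaching the 2nd router triggers a delete with the same match fields.
    br_int.delete_dvr_to_src_mac(vlan=13, dst_mac=VM_MAC)
    return before, dict(br_int.flows)


if __name__ == "__main__":
    before, after = simulate_second_router_detach()
    print("before:", sorted(before))
    print("after: ", sorted(after))
```

In this model the only way to keep router 1's flows is to track, outside the match itself, which routers still need them. Whether the agent should do that is a design question for the actual fix and is not decided here.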
