flows lost with noop firewall driver at ovs-agent restart while the db is down
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Bence Romsics |
Bug Description
If ovs-agent is restarted while neutron-server is up but the neutron DB is down, and the noop firewall driver is in use, then the agent deletes the per-port flows and cannot recover them. Because the affected flows include the mod_vlan_vid flows, this means traffic loss until another agent restart (with the DB up) or a full successful resync happens.
For example:
[securitygroup]
firewall_driver = noop
openstack server delete vm0 --wait
openstack server create --flavor cirros256-pinned --image cirros-
sudo ovs-ofctl dump-flows br-int > ~/noop-db-stop.1
# execute these by hand and make sure that each command took effect before moving on to the next
sudo systemctl stop mysql
sudo systemctl restart devstack@q-agt
sudo ovs-ofctl dump-flows br-int > ~/noop-db-stop.2
# diff the flows (for the sake of simplicity this devstack environment has a single vm with a single port, started above)
a=1 ; b=2 ; base=noop-db-stop. ; colordiff -u <( cat ~/$base$a | egrep -v ^NXST_FLOW | sed -r -e 's/(cookie|
--- /dev/fd/63 2023-06-29 08:10:00.142623814 +0000
+++ /dev/fd/62 2023-06-29 08:10:00.142623814 +0000
@@ -1,19 +1,10 @@
table=0 priority=0 actions=
-table=0 priority=
-table=0 priority=
table=0 priority=200,reg3=0 actions=
table=0 priority=
table=0 priority=
-table=0 priority=
-table=0 priority=
table=0 priority=
-table=0 priority=
table=23 priority=0 actions=drop
table=24 priority=0 actions=drop
-table=24 priority=
-table=24 priority=
-table=24 priority=
-table=25 priority=
table=30 priority=0 actions=
table=31 priority=0 actions=
table=58 priority=0 actions=
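The comparison above can also be sketched in Python. This is only an illustration, not the command used in this report: the list of volatile fields (cookie, duration, n_packets, n_bytes, idle_age, hard_age) is an assumption, the function names `normalize` and `diff_flows` are hypothetical, and the sample flows in the usage note are made up.

```python
import difflib
import re

# Fields assumed to change between otherwise identical dumps (an assumption,
# not taken from the truncated command in this report).
VOLATILE = re.compile(
    r"\b(cookie|duration|n_packets|n_bytes|idle_age|hard_age)=[^, ]+,?\s*"
)


def normalize(dump):
    """Return the sorted flow lines of a dump with volatile fields removed."""
    flows = []
    for line in dump.splitlines():
        line = line.strip()
        # Skip blank lines and the reply header printed by ovs-ofctl.
        if not line or line.startswith(("NXST_FLOW", "OFPST_FLOW")):
            continue
        flows.append(VOLATILE.sub("", line))
    return sorted(flows)


def diff_flows(before, after):
    """Return unified diff lines between two normalized flow dumps."""
    return list(
        difflib.unified_diff(
            normalize(before), normalize(after), "before", "after", lineterm=""
        )
    )
```

Usage, assuming the two dumps were saved as in the steps above: read `~/noop-db-stop.1` and `~/noop-db-stop.2` into strings and print each line of `diff_flows(first, second)`. An empty result corresponds to the "[no diff]" cases in this report.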
The same loss of flows does not happen with the openvswitch firewall driver:
[securitygroup]
firewall_driver = openvswitch
openstack server delete vm0 --wait
openstack server create --flavor cirros256-pinned --image cirros-
sudo ovs-ofctl dump-flows br-int > ~/openvswitch-
sudo systemctl stop mysql
sudo systemctl restart devstack@q-agt
sudo ovs-ofctl dump-flows br-int > ~/openvswitch-
a=1 ; b=2 ; base=openvswitc
[no diff]
The flows are not lost either if neutron-server is down while ovs-agent restarts:
[securitygroup]
firewall_driver = noop
openstack server delete vm0 --wait
openstack server create --flavor cirros256-pinned --image cirros-
sudo ovs-ofctl dump-flows br-int > ~/noop-
sudo systemctl stop devstack@q-svc
sudo systemctl restart devstack@q-agt
sudo ovs-ofctl dump-flows br-int > ~/noop-
a=1 ; b=2 ; base=noop-
[no diff]
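Since the lost flows include the mod_vlan_vid flows, a narrower check than a full diff is to count those flows in each dump. The sketch below is a hypothetical helper, assuming the dumps are available as strings and that the tagging action appears as `mod_vlan_vid:` in the flow's action list.

```python
def count_vlan_tagging_flows(dump):
    """Count flow lines whose actions include a mod_vlan_vid action."""
    return sum(1 for line in dump.splitlines() if "mod_vlan_vid:" in line)


def lost_vlan_tagging_flows(before, after):
    """Return how many mod_vlan_vid flows disappeared between two dumps."""
    return max(0, count_vlan_tagging_flows(before) - count_vlan_tagging_flows(after))
```

A non-zero result from `lost_vlan_tagging_flows` for the dumps taken before and after the agent restart would match the traffic loss described in this report.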
devstack b10c0602
neutron 0c5d4b8728
I'll push a proposed fix soon.
Changed in neutron:
status: New → In Progress
importance: Undecided → High
The proposed fix: https://review.opendev.org/c/openstack/neutron/+/887257