[DVR] Recovery from openvswitch restart fails when veth are used for bridges interconnection
Bug #1877977 reported by Slawek Kaplonski
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Confirmed | Medium | Unassigned |
Bug Description
In case of DVR routers, when use_veth_
All works fine when patch ports are used for interconnection.
I managed to reproduce this and also noticed that the reproduction is non-deterministic: sometimes connectivity recovers after the OVS restart, other times it does not. Both outcomes are quite frequent, so either one is easy to catch.
For the record this is the exact reproduction:
# the default is to not use veth pairs, check that we don't have them at start
$ sudo ip l | egrep phy-br-ex
[nothing]
# change the config to use veth interconnections
$ vim /etc/neutron/plugins/ml2/ml2_conf.ini
[ovs]
use_veth_interconnection = True
$ sudo systemctl restart devstack@neutron-agent
# now we have veth interconnections
$ sudo ip l | egrep phy-br-ex
37: phy-br-ex@int-br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
38: int-br-ex@phy-br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
$ sudo ethtool -S phy-br-ex
NIC statistics:
peer_ifindex: 38
# boot vm with floating ip
$ openstack server create vm0 --flavor cirros256 --image cirros-0.4.0-x86_64-disk --nic net-id=private --wait
$ openstack floating ip create --port "$( openstack port list --device-id "$( openstack server show vm0 -f value -c id )" -f value -c id | head -1 )" public -f value -c floating_ip_address
172.24.4.211
# start ping and keep it running, while...
$ ping 172.24.4.211
# ... we restart ovs
$ sudo systemctl restart openvswitch-switch
In some cases ping recovers in a few seconds. In other cases it never recovers.
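Because the outcome differs from run to run, it can help to script the restart-and-check step instead of watching ping by hand. The following is only a rough sketch: the helper names `wait_recovered` and `repro_round` are hypothetical, and the 60-attempt budget is an arbitrary assumption, not a value measured in this report.

```shell
# Hypothetical helper: run a probe command once per second, up to N attempts.
# Returns 0 as soon as the probe succeeds, 1 if it never succeeds.
wait_recovered() {
    attempts=$1
    shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# One reproduction round: restart OVS, then report whether the floating IP
# answers again. The budget of 60 attempts is an assumption; adjust as needed.
repro_round() {
    fip=$1
    sudo systemctl restart openvswitch-switch
    if wait_recovered 60 ping -c 1 -W 1 "$fip"; then
        echo "recovered"
    else
        echo "NOT recovered"
    fi
}
```

Calling `repro_round 172.24.4.211` several times in a row should show both outcomes if the behaviour is as described above.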
flow diff for br-int (.0 is the working state before ovs restart, .1 is when ping did not recover):
# diff -u <( cat dump-flows.br-int.0 | cut -d ' ' -f4,8- | sort ) <( cat dump-flows.br-int.1 | cut -d ' ' -f4,8- | sort )
--- /dev/fd/63 2020-05-18 13:25:50.235895198 +0000
+++ /dev/fd/62 2020-05-18 13:25:50.239895241 +0000
@@ -4,8 +4,12 @@
 table=0, priority=10,icmp6,in_port=18,icmp_type=136 actions=resubmit(,24)
 table=0, priority=2,in_port=23 actions=drop
 table=0, priority=2,in_port=24 actions=drop
-table=0, priority=3,in_port=23,vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:2,resubmit(,60)
-table=0, priority=3,in_port=24,dl_vlan=100 actions=mod_vlan_vid:3,resubmit(,60)
+table=0, priority=2,in_port=41 actions=drop
+table=0, priority=2,in_port=42 actions=drop
+table=0, priority=2,in_port=43 actions=drop
+table=0, priority=2,in_port=ANY actions=drop
+table=0, priority=3,in_port=43,dl_vlan=100 actions=mod_vlan_vid:3,resubmit(,60)
+table=0, priority=3,in_port=ANY,vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:2,resubmit(,60)
 table=0, priority=5,in_port=23,dl_dst=fa:16:3f:ca:bf:17 actions=resubmit(,4)
 table=0, priority=5,in_port=24,dl_dst=fa:16:3f:ca:bf:17 actions=resubmit(,4)
 table=0, priority=5,in_port=3,dl_dst=fa:16:3f:ca:bf:17 actions=resubmit(,3)
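In the br-int diff, the non-recovering state contains flows matching in_port=ANY that the working state did not have. A quick way to check a saved dump for such flows might look like the sketch below. The function names are hypothetical, the field positions are copied from the cut command used in this comment, and treating in_port=ANY as a marker of the broken state is an assumption based on this single diff.

```shell
# Normalize a saved flow dump the same way the diff commands in this comment do:
# keep field 4 (the table) plus everything from field 8 (the priority) onward,
# then sort so that two dumps can be compared line by line.
normalize_flows() {
    cut -d ' ' -f4,8- "$1" | sort
}

# Print the number of flows in a saved dump whose match contains in_port=ANY.
# Note: grep -c prints 0 but exits non-zero when nothing matches.
count_any_port_flows() {
    grep -c 'in_port=ANY' "$1"
}
```

For example, `count_any_port_flows dump-flows.br-int.1` prints how many such flows are present in the dump taken after the restart.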
flow diff for br-ex:
# diff -u <( cat dump-flows.br-ex.0 | cut -d ' ' -f4,8- | sort ) <( cat dump-flows.br-ex.1 | cut -d ' ' -f4,8- | sort )
--- /dev/fd/63 2020-05-18 13:27:07.036710753 +0000
+++ /dev/fd/62 2020-05-18 13:27:07.036710753 +0000
@@ -1,8 +1,10 @@
 table=0, priority=0 actions=NORMAL
 table=0, priority=1 actions=resubmit(,3)
+table=0, priority=2,in_port=14 acti...