Comment 0 for bug 2051351

Bence Romsics (bence-romsics) wrote :

I believe this issue was already reported earlier:

That bug has a fix committed:

However, I believe the above change fixed only part of the issue: the firewall_driver=noop case.
The same problem remains unfixed with firewall_driver=openvswitch.

At first I re-opened bug #1884708, but then I realized that nobody would notice a status change on a bug that is several years old, so I opened this new bug report instead.


# config (neutron OVS agent)
[securitygroup]
firewall_driver = openvswitch
[agent]
explicitly_egress_direct = True
[ovs]
bridge_mappings = physnet0:br-physnet0,...

# a random IP on net0 we can ping
sudo ip link set up dev br-physnet0
sudo ip link add link br-physnet0 name br-physnet0.100 type vlan id 100
sudo ip link set up dev br-physnet0.100
sudo ip address add dev br-physnet0.100

# code
devstack 6b0f055b
neutron $ git log --oneline -n2
27601f8eea (HEAD, origin/bug/2048785, origin/HEAD) Set trunk parent port as access port in ovs to avoid loop
3ef02cc2fb (origin/master) Consume code from neutron-lib
openvswitch 2.17.8-0ubuntu0.22.04.1
linux 5.15.0-91-generic

# clean up first
openstack server delete vm0 --wait
openstack port delete port0
openstack network delete net1 net0

# build the environment
openstack network create net0 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 100
openstack subnet create --network net0 --subnet-range subnet0
openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address= port0
openstack server create --flavor cirros256 --image cirros-0.6.2-x86_64-disk --nic port-id=port0 --availability-zone :devstack0a --wait vm0

# mac addresses for reference
$ openstack port show port0 -f value -c mac_address
$ ifdata -ph br-physnet0

# generate traffic that will keep fdb entries fresh
sudo virsh console "$( openstack server show vm0 -f value -c OS-EXT-SRV-ATTR:instance_name )"

# clear all past junk
for br in br-physnet0 br-int ; do sudo ovs-appctl fdb/flush "$br" ; done

# br-int does not learn port0's mac despite the ongoing ping
for br in br-physnet0 br-int ; do echo ">>> $br <<<" ; sudo ovs-appctl fdb/show "$br" | grep -E -i "$( openstack port show port0 -f value -c mac_address )|$( ifdata -ph br-physnet0 )" ; done
>>> br-physnet0 <<<
    1 100 fa:16:3e:96:58:ab 0
LOCAL 100 82:e8:18:67:7e:40 0
>>> br-int <<<
    1 4 82:e8:18:67:7e:40 0

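The check above can also be scripted. The following is only a sketch: fdb_has_mac is a hypothetical helper name (not part of OVS or neutron), and it assumes the MAC address is the third whitespace-separated column of the fdb/show output, as in the listing above.

```shell
# Hypothetical helper: succeed if the given MAC appears in fdb/show output.
# $1:    MAC address to look for (compared case-insensitively)
# stdin: output of "ovs-appctl fdb/show <bridge>"
# Assumes the columns are: port, VLAN, MAC, age.
fdb_has_mac() {
    awk -v mac="$1" '
        tolower($3) == tolower(mac) { found = 1 }
        END { exit !found }
    '
}
```

Possible usage: sudo ovs-appctl fdb/show br-int | fdb_has_mac "$( openstack port show port0 -f value -c mac_address )" || echo "br-int has not learned port0's mac"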
# port and physnet bridge mac in all fdbs, egress == vnic -> physnet bridge
# in br-int we have a direct output action
$ sudo ovs-appctl ofproto/trace br-int in_port="$( sudo ovs-vsctl -- --columns=ofport find Interface name=$( echo "tap$( openstack port show port0 -f value -c id )" | cut -b1-14 ) | awk '{ print $3 }' )",dl_vlan=0,dl_dst=$( ifdata -ph br-physnet0 ),dl_src=$( openstack port show port0 -f value -c mac_address )
Flow: in_port=45,dl_vlan=0,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000

 0. priority 0, cookie 0x2b36d6b4a42fe7b5
58. priority 0, cookie 0x2b36d6b4a42fe7b5
60. in_port=45, priority 100, cookie 0x2b36d6b4a42fe7b5
73. reg5=0x2d, priority 80, cookie 0x2b36d6b4a42fe7b5
94. reg6=0x4,dl_src=fa:16:3e:96:58:ab,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00, priority 10, cookie 0x2b36d6b4a42fe7b5

 0. in_port=1,dl_vlan=4, priority 4, cookie 0x85bc1a5077d54d3f
     -> forwarding to learned port

Final flow: reg5=0x2d,reg6=0x4,in_port=45,dl_vlan=4,dl_vlan_pcp=0,dl_vlan1=0,dl_vlan_pcp1=0,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000
Megaflow: recirc_id=0,eth,in_port=45,dl_vlan=0,dl_vlan_pcp=0,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000
Datapath actions: pop_vlan,push_vlan(vid=100,pcp=0),1

# port and physnet bridge mac in all fdbs, ingress == physnet bridge -> vnic
# in br-int we have the normal action flooding, despite the ongoing ping
$ sudo ovs-appctl ofproto/trace br-physnet0 in_port=LOCAL,dl_vlan=100,dl_src=$( ifdata -ph br-physnet0 ),dl_dst=$( openstack port show port0 -f value -c mac_address )
Flow: in_port=LOCAL,dl_vlan=100,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=82:e8:18:67:7e:40,dl_dst=fa:16:3e:96:58:ab,dl_type=0x0000

 0. priority 0, cookie 0x85bc1a5077d54d3f
     -> forwarding to learned port

 0. in_port=1,dl_vlan=100, priority 3, cookie 0x2b36d6b4a42fe7b5
58. priority 0, cookie 0x2b36d6b4a42fe7b5
60. priority 3, cookie 0x2b36d6b4a42fe7b5
     -> no learned MAC for destination, flooding

 0. in_port=1, priority 1, cookie 0xc8cfff9c6bbea88d
 2. dl_dst=00:00:00:00:00:00/01:00:00:00:00:00, priority 0, cookie 0xc8cfff9c6bbea88d
20. priority 0, cookie 0xc8cfff9c6bbea88d
22. priority 0, cookie 0xc8cfff9c6bbea88d

Final flow: unchanged
Megaflow: recirc_id=0,eth,in_port=LOCAL,dl_vlan=100,dl_vlan_pcp=0,dl_src=82:e8:18:67:7e:40,dl_dst=fa:16:3e:96:58:ab,dl_type=0x0000
Datapath actions: pop_vlan,push_vlan(vid=4,pcp=0),8,13,pop_vlan,9,11
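To compare the two traces mechanically, one can count the output ports in the "Datapath actions" line: a single port means direct output, several ports mean flooding. The following is a minimal sketch, not an OVS tool. count_output_ports is a hypothetical helper name, and it assumes that output ports show up as bare numbers and that action arguments are never nested deeper than one pair of parentheses.

```shell
# Hypothetical helper: count the output ports in a "Datapath actions:" line.
# $1: the line as printed by "ovs-appctl ofproto/trace"
count_output_ports() {
    printf '%s\n' "$1" \
        | sed -e 's/^Datapath actions: *//' -e 's/([^)]*)//g' \
        | tr ',' '\n' \
        | grep -c -E '^[0-9]+$'
}
```

For the egress trace above ("pop_vlan,push_vlan(vid=100,pcp=0),1") this prints 1, while for the ingress trace ("pop_vlan,push_vlan(vid=4,pcp=0),8,13,pop_vlan,9,11") it prints 4.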

This bug has a long history:

round #1 - some unnecessary flooding in the egress direction
fix introducing explicitly_egress_direct:

round #2 - the fix above introduced some unnecessary ingress flooding
fix for firewall_driver=noop
also related:
may be related:

round #3 (today)