OVS plugin tunnel bridges never learn
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Fix Released | High | Aaron Rosen | |
Bug Description
The tunnel bridges never learn which ports the remote VM MACs are on, so they flood every frame onto every GRE port. As a result, every frame that a VM sends to a VM on another physical node is actually sent to every other physical node in the mesh.
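To make the difference between learning and flooding concrete, here is a minimal toy model of a learning bridge. It is only a sketch and not how OVS is implemented: the class name `ToyBridge`, the `learning_ports` parameter, and the placeholder MAC labels are hypothetical, and the choice of `gre-0` as the tunnel towards vm2's node is an assumption made for illustration.

```python
class ToyBridge:
    """Toy L2 bridge: forwards to a learned port, floods unknown destinations."""

    def __init__(self, ports, learning_ports):
        self.ports = list(ports)
        # Ports on which source MACs are learned (hypothetical knob used to
        # mimic a bridge that never learns from its tunnel ports).
        self.learning_ports = set(learning_ports)
        self.fdb = {}  # MAC -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        if in_port in self.learning_ports:
            self.fdb[src_mac] = in_port
        out = self.fdb.get(dst_mac)
        if out is None:
            # Unknown destination: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out == in_port:
            return []
        return [out]


PORTS = ["patch-int", "gre-0", "gre-1"]


def third_frame(learning_ports):
    """vm1 sends, vm2 replies over gre-0 (assumed), then vm1 sends again."""
    bridge = ToyBridge(PORTS, learning_ports)
    bridge.receive("patch-int", "vm1-mac", "vm2-mac")
    bridge.receive("gre-0", "vm2-mac", "vm1-mac")
    return bridge.receive("patch-int", "vm1-mac", "vm2-mac")


# Learning only on patch-int, as the MAC table in this report suggests:
print(third_frame({"patch-int"}))  # ['gre-0', 'gre-1']
# Learning on every port:
print(third_frame(PORTS))          # ['gre-0']
```

In the first case the reply from vm2 never populates the table, so every later frame for vm2 is still flooded to both tunnels, which matches the behaviour described here.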
See diagram https:/
Running the latest build with devstack on Ubuntu 12.04. There are 3 compute nodes. folsom1 is also the controller node, running nova-network. There are 2 running VMs: vm1(10.0.0.4) on folsom1(
Every packet from vm1->vm2 moves on the tunnel folsom1->folsom2 but is also seen on tunnel folsom1->folsom3. Similarly every packet from vm2->vm1 moves on the tunnel from folsom2->folsom1, but is also seen on folsom2->folsom3.
From an SSH session running on vm1 to vm2, pressing the Enter key causes this traffic on folsom3(
u1@folsom3:~$ sudo tcpdump -n -i eth1 proto GRE
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
16:34:34.827337 IP 172.241.0.41 > 172.241.0.43: GREv0, key=0x1, length 118: IP 10.0.0.4.49919 > 10.0.0.5.22: Flags [P.], seq 3279504606:
16:34:34.872076 IP 172.241.0.42 > 172.241.0.43: GREv0, key=0x1, length 74: IP 10.0.0.5.22 > 10.0.0.4.49919: Flags [.], ack 44, win 7776, options [nop,nop,TS val 1916631 ecr 1987313], length 0
16:34:34.932079 IP 172.241.0.42 > 172.241.0.43: GREv0, key=0x1, length 118: IP 10.0.0.5.22 > 10.0.0.4.49919: Flags [P.], seq 1:45, ack 44, win 7776, options [nop,nop,TS val 1916648 ecr 1987313], length 44
16:34:34.935827 IP 172.241.0.41 > 172.241.0.43: GREv0, key=0x1, length 74: IP 10.0.0.4.49919 > 10.0.0.5.22: Flags [.], ack 45, win 8372, options [nop,nop,TS val 1987334 ecr 1916648], length 0
16:34:34.965302 IP 172.241.0.42 > 172.241.0.43: GREv0, key=0x1, length 118: IP 10.0.0.5.22 > 10.0.0.4.49919: Flags [P.], seq 45:89, ack 44, win 7776, options [nop,nop,TS val 1916653 ecr 1987334], length 44
16:34:34.973799 IP 172.241.0.41 > 172.241.0.43: GREv0, key=0x1, length 74: IP 10.0.0.4.49919 > 10.0.0.5.22: Flags [.], ack 89, win 8372, options [nop,nop,TS val 1987341 ecr 1916653], length 0
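A capture like the one above can be summarised with a small script. The following is a rough sketch, assuming tcpdump's one-line GRE output in the form shown here; the regular expression, the function names, and the `local_vm_ips` parameter are all hypothetical helpers, and the two sample lines are copied from the capture above. It counts, per outer tunnel endpoint pair, the packets whose inner addresses do not belong to any VM on the capturing host.

```python
import re
from collections import Counter

# Matches the outer and inner IP headers of a GRE-encapsulated TCP/UDP packet
# as printed by "tcpdump -n" (format assumed from the capture in this report).
GRE_LINE = re.compile(
    r"IP (?P<outer_src>[\d.]+) > (?P<outer_dst>[\d.]+): "
    r"GREv0, key=(?P<key>0x[0-9a-f]+), length \d+: "
    r"IP (?P<inner_src>\d+\.\d+\.\d+\.\d+)\.\d+ > "
    r"(?P<inner_dst>\d+\.\d+\.\d+\.\d+)\.\d+:"
)


def parse_gre_line(line):
    """Return a dict of outer/inner addresses, or None if the line differs."""
    match = GRE_LINE.search(line)
    return match.groupdict() if match else None


def stray_tunnels(lines, local_vm_ips):
    """Count packets per (outer_src, outer_dst) not addressed to a local VM."""
    counts = Counter()
    for line in lines:
        pkt = parse_gre_line(line)
        if pkt is None:
            continue
        if pkt["inner_src"] in local_vm_ips or pkt["inner_dst"] in local_vm_ips:
            continue
        counts[(pkt["outer_src"], pkt["outer_dst"])] += 1
    return counts


SAMPLE = [
    "16:34:34.872076 IP 172.241.0.42 > 172.241.0.43: GREv0, key=0x1, "
    "length 74: IP 10.0.0.5.22 > 10.0.0.4.49919: Flags [.], ack 44, win 7776, "
    "options [nop,nop,TS val 1916631 ecr 1987313], length 0",
    "16:34:34.935827 IP 172.241.0.41 > 172.241.0.43: GREv0, key=0x1, "
    "length 74: IP 10.0.0.4.49919 > 10.0.0.5.22: Flags [.], ack 45, win 8372, "
    "options [nop,nop,TS val 1987334 ecr 1916648], length 0",
]

# Assumption: the capturing host has no VMs on this network.
for tunnel, count in stray_tunnels(SAMPLE, local_vm_ips=set()).items():
    print(tunnel, count)
```

Any non-zero count on a node that hosts neither endpoint of the conversation points to flooded traffic.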
u1@folsom1:~$ sudo ovs-vsctl show
b5df6d74-
Bridge br-tun
Port br-tun
Port patch-int
Port "gre-1"
Port "gre-0"
Bridge br-int
Port "gw-2d6158fb-55"
tag: 4
Port "tap4a4df867-57"
tag: 4
Port patch-tun
Port br-int
ovs_version: "1.4.0+build0"
u1@folsom1:~$ sudo ovs-ofctl show br-tun
OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:00009e9e10
n_tables:255, n_buffers:256
features: capabilities:0xc7, actions:0xfff
1(patch-int): addr:9e:
config: 0
state: 0
2(gre-0): addr:d6:
config: 0
state: 0
3(gre-1): addr:92:
config: 0
state: 0
LOCAL(br-tun): addr:9e:
config: PORT_DOWN
state: LINK_DOWN
OFPT_GET_
u1@folsom1:~$ sudo ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=2977.562s, table=0, n_packets=198, n_bytes=42502, priority=
cookie=0x0, duration=2977.703s, table=0, n_packets=199, n_bytes=40780, priority=
cookie=0x0, duration=6234.7s, table=0, n_packets=3, n_bytes=970, priority=1 actions=drop
The MAC table has just the MACs for vm1 and the gateway tap from the integration bridge, both on the patch-int port. It never gets an entry for vm2.
u1@folsom1:~$ sudo ovs-appctl fdb/show br-tun
port VLAN MAC Age
1 0 fa:16:3e:50:2c:8f 30
1 0 fa:16:3e:3d:ff:74 16
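The same check can be scripted. The sketch below parses text in the `ovs-appctl fdb/show` layout shown above and reports which of a given set of OpenFlow port numbers have no learned MACs. The function names are hypothetical, the column layout is assumed from this output alone, and the tunnel port numbers 2 and 3 are taken from the `ovs-ofctl show br-tun` listing for folsom1.

```python
def parse_fdb(text):
    """Parse 'port VLAN MAC Age' rows into (port, vlan, mac, age) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 4-column data row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, vlan, mac, age = fields
        entries.append((int(port), int(vlan), mac, int(age)))
    return entries


def ports_without_entries(entries, ports):
    """Return the ports from 'ports' on which no MAC has been learned."""
    learned = {port for port, _, _, _ in entries}
    return sorted(set(ports) - learned)


# Output copied from folsom1 in this report.
SAMPLE = """\
port VLAN MAC Age
1 0 fa:16:3e:50:2c:8f 30
1 0 fa:16:3e:3d:ff:74 16
"""

entries = parse_fdb(SAMPLE)
# Ports 2 (gre-0) and 3 (gre-1) are the tunnel ports on folsom1's br-tun.
print(ports_without_entries(entries, [2, 3]))  # [2, 3]
```

An empty result would mean at least one MAC had been learned on every tunnel port, which is what a working learning bridge would show once remote VMs have replied.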
-------
u1@folsom2:~$ sudo ovs-vsctl show
3eee2d04-
Bridge br-int
Port patch-tun
Port "tapafca0da5-30"
tag: 5
Port br-int
Bridge br-tun
Port br-tun
Port patch-int
Port "gre-0"
Port "gre-1"
ovs_version: "1.4.0+build0"
u1@folsom2:~$ sudo ovs-ofctl show br-tun
OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:00008286d3
n_tables:255, n_buffers:256
features: capabilities:0xc7, actions:0xfff
1(patch-int): addr:36:
config: 0
state: 0
2(gre-0): addr:1a:
config: 0
state: 0
3(gre-1): addr:62:
config: 0
state: 0
LOCAL(br-tun): addr:82:
config: PORT_DOWN
state: LINK_DOWN
OFPT_GET_
u1@folsom2:~$ sudo ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=3178.517s, table=0, n_packets=207, n_bytes=42252, priority=
cookie=0x0, duration=3178.923s, table=0, n_packets=206, n_bytes=45370, priority=
cookie=0x0, duration=6952.312s, table=0, n_packets=15, n_bytes=3106, priority=1 actions=drop
u1@folsom2:~$ sudo ovs-appctl fdb/show br-tun
port VLAN MAC Age
1 0 fa:16:3e:65:77:b5 26
-------
u1@folsom3:~$ sudo ovs-vsctl show
293333a5-
Bridge br-int
Port patch-tun
Port br-int
Bridge br-tun
Port "gre-1"
Port br-tun
Port patch-int
Port "gre-0"
ovs_version: "1.4.0+build0"
u1@folsom3:~$ sudo ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=4285.438s, table=0, n_packets=530, n_bytes=113476, priority=1 actions=drop
Changed in quantum: | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → dan wendlandt (danwent) |
Changed in quantum: | |
status: | Confirmed → In Progress |
Changed in quantum: | |
status: | Fix Committed → Fix Released |
Changed in quantum: | |
milestone: | folsom-rc1 → 2012.2 |
Thanks for the detailed report. I've confirmed this in our current setup. I think I understand why this started happening, but I'm not totally sure, as my initial attempt at fixing it did not work as expected either :)
Will keep looking.