Path MTU discovery fails for VMs with Floating IP behind DVR routers

Bug #1799124 reported by Trent Lloyd on 2018-10-22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Brian Haley

Bug Description

Tenant VMs using an overlay network with an MTU <1500 and less than the MTU of the external network are unable to negotiate MTU using Path MTU discovery.

In most cases, since the instance MTU is configured by DHCP, direct instance traffic is not affected. However, if the VM acts as a router for other traffic (e.g. as a bridge for Docker, LXD, Libvirt, etc.) whose interfaces have the MTU set to 1500 (which is the default in most cases), then that traffic relies on Path MTU discovery to discover the 1450 MTU.
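To make the size mismatch concrete, here is a minimal sketch of the arithmetic. It assumes IPv4 and TCP headers of 20 bytes each with no options; the `tcp_mss` helper is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# tcp_mss MTU: largest TCP payload per packet for the given MTU.
# Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
tcp_mss() {
    echo $(( $1 - 40 ))
}

inner_mtu=1500    # MTU assumed by the nested bridge or container (from the report)
overlay_mtu=1450  # MTU of the tenant overlay network (from the report)

echo "MSS advertised behind the VM: $(tcp_mss "$inner_mtu")"
echo "MSS that fits the overlay:    $(tcp_mss "$overlay_mtu")"
echo "Excess bytes per full packet: $(( inner_mtu - overlay_mtu ))"
```

A remote peer that honours the larger advertised MSS sends full-size segments that cannot cross the overlay, which is why the ICMP feedback from Path MTU discovery is needed.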

On normal routers and DVR routers where the VM does not have a floating IP (and thus is routed via a centralized node), this works as expected.

However, on DVR routers where the VM has a Floating IP (and thus traffic is routed directly from the compute node), this fails. When a packet larger than the overlay network's MTU comes from the external network towards the VM, the packet is dropped and the external host receives no ICMP "fragmentation required" (packet too big) response. This prevents Path MTU discovery from working to fix the connection. The result is that most TCP connections will stall if they attempt to send more than 1500 bytes, e.g. a simple HTTP download.
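One way to check for the missing ICMP error from the external host is a ping probe with the Don't Fragment bit set. This is only a sketch: `TARGET` and `PROBE_MTU` are hypothetical variables, `ping_payload` is a hypothetical helper, and the `-M do` option assumes the Linux iputils `ping`.

```shell
#!/bin/sh
# ping_payload MTU: ICMP echo payload size that yields an IPv4 packet of
# exactly MTU bytes (20-byte IPv4 header + 8-byte ICMP header).
ping_payload() {
    echo $(( $1 - 28 ))
}

TARGET=${TARGET:-}            # hypothetical: the VM's floating IP
PROBE_MTU=${PROBE_MTU:-1500}  # packet size to probe with (external MTU in the report)

if [ -n "$TARGET" ]; then
    # -M do sets Don't Fragment, so a router that cannot forward the
    # packet is expected to answer with an ICMP error instead.
    ping -M do -c 3 -s "$(ping_payload "$PROBE_MTU")" "$TARGET"
else
    echo "set TARGET to the VM's floating IP to send the probe"
fi
```

Where Path MTU discovery works, I would expect ping to report a "Frag needed" error carrying the smaller MTU; in the failing DVR case described above, the probes should simply time out. A capture filter such as `icmp[icmptype] == icmp-unreach and icmp[icmpcode] == 4` in tcpdump can confirm whether the error is ever emitted.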

My diagnosis is that the qrouter namespace on the compute host has no default route. It has a default route in the alternative routing table (16), which is used for traffic matching an "ip rule" that selects all traffic sent from the VM subnet, but there is no default route in the global default routing table.

I have not 100% confirmed this part, however, my understanding is that since there is no global default route the kernel is unable to select a source IP for the ICMP error. Additionally, even if it did somehow select a source IP, the appropriate default route appears to be via the RFP interface on the subnet back to the FIP namespace which would not match the rule for traffic from the VM subnet to use the alternate routing table anyway.

In testing, if I add a default route through the rfp interface then ICMP errors are sent and Path MTU discovery successfully works, allowing TCP connections to work.

root@maas-node02:~# ip netns exec qrouter-1752c73a-be9f-4326-97cc-99dbe0988b3c ip r
dev qr-ec03268e-fb proto kernel scope link src
dev rfp-1752c73a-b proto kernel scope link src

root@maas-node02:~# ip -n qrouter-1752c73a-be9f-4326-97cc-99dbe0988b3c route show table 16
default via dev rfp-1752c73a-b

root@maas-node02:~# ip -n qrouter-1752c73a-be9f-4326-97cc-99dbe0988b3c route add default via dev rfp-1752c73a-b
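The manual check above could be scripted roughly as follows. This is a sketch under assumptions: `has_default_route` is a hypothetical helper, `NS` is a hypothetical variable holding the qrouter namespace name, and table 16 is simply the table number seen on this host.

```shell
#!/bin/sh
# has_default_route: reads `ip route show` output on stdin and succeeds
# if any line describes a default route.
has_default_route() {
    awk '$1 == "default" { found = 1 } END { exit !found }'
}

NS=${NS:-}  # hypothetical: e.g. the qrouter-<uuid> namespace to inspect

if [ -n "$NS" ]; then
    for table in main 16; do
        if ip -n "$NS" route show table "$table" | has_default_route; then
            echo "table $table: default route present"
        else
            echo "table $table: no default route"
        fi
    done
else
    echo "set NS to the qrouter namespace name to run the check"
fi
```

On an affected compute node I would expect this to report a default route in table 16 only, matching the output shown above.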

It's not clear to me if there is an intentional reason not to install a default route here, particularly since such a route exists for non-DVR routers. I would appreciate input from anyone who knows if this was an intentional design decision or simply oversight.

= Steps to reproduce =

(1) Deploy a cloud with DVR and global-physnet-mtu=1500
(2) Create an overlay tenant network (MTU: 1450), VLAN/flat external network (MTU: 1500), router.
(3) Deploy an Ubuntu 16.04 container
(4) Verify that a large download works; "wget"
(5) Configure LXD to use a private subnet and NAT; "dpkg-reconfigure -pmedium lxd" - you can basically just hit yes and accept the defaults
(6) Create an LXD container, "lxc launch ubuntu:16.04 test", then test a download
(7) lxc exec test "wget"

A simpler alternative to using LXD/Docker is to force the MTU of the VM back to 1500: "ip link set eth0 mtu 1500". This same scenario will fail with DVR and work without DVR.

Brian Haley (brian-haley) wrote :

I will try to confirm this failure without using containers, as they are probably not required.

Also, I don't think this was intended behavior, possibly just an oversight.

Changed in neutron:
importance: Undecided → High
assignee: nobody → Brian Haley (brian-haley)