Linux Bridge MTU bug when the VXLAN tunneling is used

Bug #1242534 reported by Édouard Thuleau
This bug affects 5 people
Affects            Status        Importance  Assigned to  Milestone
openstack-manuals  Fix Released  Medium      Unassigned

Bug Description

I made some tests with the ML2 plugin and the Linux Bridge agent with VXLAN tunneling.

By default, the physical interface used for VXLAN tunneling has an MTU of 1500 octets. When the LB agent creates a VXLAN interface, its MTU is automatically set 50 octets below that of the physical interface (so 1450 octets) [1]. Therefore, the bridge used to plug in the VM tap devices, the veth devices from the network namespaces (l3 or dhcp), and the VXLAN interface has an MTU of 1450 octets (a Linux bridge takes the minimum MTU of all its underlying ports [2]).

So the bridge can only forward packets of at most 1450 octets to the VXLAN interface [3].
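As a rough illustration of the arithmetic above, here is a minimal Python sketch. The 50-octet figure is broken down assuming VXLAN over IPv4 with no outer VLAN tag, and the port names are made up for the example; nothing here is taken from the agent code.

```python
# Assumed header sizes for VXLAN over IPv4 without an outer VLAN tag.
OUTER_IPV4_HEADER = 20
OUTER_UDP_HEADER = 8
VXLAN_HEADER = 8
INNER_ETHERNET_HEADER = 14
VXLAN_OVERHEAD = (OUTER_IPV4_HEADER + OUTER_UDP_HEADER
                  + VXLAN_HEADER + INNER_ETHERNET_HEADER)


def vxlan_mtu(physical_mtu):
    """MTU given to a VXLAN interface created on top of a physical one."""
    return physical_mtu - VXLAN_OVERHEAD


def bridge_mtu(port_mtus):
    """A Linux bridge takes the minimum MTU of its underlying ports."""
    return min(port_mtus)


# Hypothetical port names, for illustration only.
ports = {
    "vxlan-1001": vxlan_mtu(1500),
    "tap-vm": 1500,
    "veth-router": 1500,
}
effective_mtu = bridge_mtu(ports.values())
print(VXLAN_OVERHEAD, ports["vxlan-1001"], effective_mtu)
```

With a 1500-octet physical interface, the VXLAN port alone pulls the whole bridge down to 1450 octets.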

But the veth interfaces that link the network namespaces to the bridges are spawned by the l3 and dhcp agents (and perhaps other agents) with an MTU of 1500 octets. So packets arriving from them that exceed 1450 octets are dropped if they need to be forwarded to the VXLAN interface.
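The drop behaviour can be modelled roughly as follows. This is a simplified sketch, not the kernel logic itself: it ignores GSO and any other special cases, and the packet lengths are arbitrary sample values.

```python
VXLAN_OVERHEAD = 50  # octets, as stated in the bug description


def is_forwarded(packet_length, egress_mtu):
    """Simplified model: a packet longer than the egress MTU is dropped."""
    return packet_length <= egress_mtu


vxlan_interface_mtu = 1500 - VXLAN_OVERHEAD

# Sample lengths of packets leaving a veth whose MTU is still 1500.
packet_lengths = [100, 1450, 1451, 1500]
results = {
    length: is_forwarded(length, vxlan_interface_mtu)
    for length in packet_lengths
}
for length, forwarded in results.items():
    print(length, "forwarded" if forwarded else "dropped")
```

Small packets pass while full-size ones are lost, which would be consistent with a connection that opens normally but stalls once large payloads are sent.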

A simple workaround is to increase the MTU of the physical interface by at least 50 octets so that the MTUs match across interfaces. But by default (without MTU customization), the LB/VXLAN mode behaves strangely: for example, curl fails from a server behind a router, and a command with verbose output fails over SSH through a floating IP, even though the SSH connection itself works.

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/vxlan.c#n2437
[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_if.c#n402
[3] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_forward.c#n74

Changed in neutron:
status: New → Opinion
Revision history for this message
Kyle Mestery (mestery) wrote :

The concept of a system-wide MTU setting in a configuration file is intriguing here, per your suggestion on the mailing list. Have you evaluated this option and tested it out by any chance?

Revision history for this message
Édouard Thuleau (ethuleau) wrote :

I haven't tested this concept yet.
But it seems to be the simplest one.

I looked into defining the MTU dynamically but, as I said on the mailing list, only the LB agent is aware of the MTU size.
So if we wanted to use this information, the LB agent would have to set the MTU in the network namespaces of the other agents. That doesn't seem to be the right solution.

I can write a patch for the concept of a system-wide MTU setting in a configuration file.

I saw that a flag named 'veth_mtu' already exists in the OVS agent configuration: https://github.com/openstack/neutron/blob/master/neutron/plugins/openvswitch/common/config.py#L74
Do you think we can reuse it for this problem?

Revision history for this message
Édouard Thuleau (ethuleau) wrote :

In fact, a flag already exists to set the MTU of interfaces: 'network_device_mtu'.
This flag is defined in neutron/agent/linux/interface.py but it's not documented in the config files.

So this flag is what we need to solve this bug. We just need to add a warning to the documentation.
I will reassign this bug to 'openstack-manuals'.

no longer affects: neutron
Revision history for this message
Anne Gentle (annegentle) wrote :

I see that network_device_mtu is documented at http://docs.openstack.org/havana/config-reference/content/section_networking-options-reference.html with a default of None. Where should the warning go?

Changed in openstack-manuals:
status: New → Incomplete
Revision history for this message
Édouard Thuleau (ethuleau) wrote :

It's not about the flag 'network_device_mtu'.

It's about the case where we use VXLAN overlay technology with LinuxBridge agents without any MTU customization.
In that case, flows passing through a virtual router behave strangely (curl fails from a server behind a router, and a command with verbose output fails over SSH through a floating IP, even though the SSH connection works...).

This could be corrected by customizing the physical device MTU (by increasing it by 50 octets) or by decreasing the MTU of the interfaces created by Neutron agents with the flag 'network_device_mtu'.
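To make the two options concrete, here is a small sketch of the numbers involved. The function names are hypothetical and the 1500-octet defaults are only the values discussed in this report.

```python
VXLAN_OVERHEAD = 50  # octets, as stated in the bug description


def raised_physical_mtu(agent_interface_mtu=1500):
    """Option 1: raise the physical MTU so the VXLAN interface matches the veths."""
    return agent_interface_mtu + VXLAN_OVERHEAD


def lowered_agent_mtu(physical_mtu=1500):
    """Option 2: value to give 'network_device_mtu' so the veths match the VXLAN interface."""
    return physical_mtu - VXLAN_OVERHEAD


option_1 = raised_physical_mtu()
option_2 = lowered_agent_mtu()
print("physical interface MTU:", option_1)
print("network_device_mtu:", option_2)
```

Either way the goal is the same: every port on the bridge ends up with the same MTU, so nothing is dropped on the way to the VXLAN interface.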

Revision history for this message
Mathieu Rohon (mathieu-rohon) wrote : RE: [Bug 1242534] Re: Linux Bridge MTU bug when the VXLAN tunneling is used

Hi,
it's a bit tricky to know exactly where to mention the need for this flag. It's a flag that has to be set in neutron.conf when the ML2 plugin uses the LinuxBridge agent and VXLAN technology. Maybe here:
http://docs.openstack.org/havana/config-reference/content/networking-plugin-ml2_vxlan.html
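For reference, a sketch of what the setting might look like once written to neutron.conf, rendered here with Python's configparser. Placing the option in the [DEFAULT] section and using the value 1450 are assumptions for a 1500-octet physical MTU; the thread only says the flag goes in neutron.conf.

```python
import configparser
import io

# Assumption: the option lives in [DEFAULT]; 1450 suits a 1500-octet physical MTU.
config = configparser.ConfigParser()
config["DEFAULT"] = {"network_device_mtu": "1450"}

# Render the fragment as it would appear in the file.
buffer = io.StringIO()
config.write(buffer)
rendered = buffer.getvalue()
print(rendered)

# Read the fragment back to confirm the value parses as an integer.
parsed = configparser.ConfigParser()
parsed.read_string(rendered)
mtu = parsed.getint("DEFAULT", "network_device_mtu")
```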

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On behalf of Édouard Thuleau
Sent: Wednesday, 30 October 2013 17:33
To: ROHON Mathieu IMT/OLPS
Subject: [Bug 1242534] Re: Linux Bridge MTU bug when the VXLAN tunneling is used

It's not about the flag 'network_device_mtu'.

It's about the case where we use VXLAN overlay technology with LinuxBridge agents without any MTU customization.
In that case, flows passing through a virtual router behave strangely (curl fails from a server behind a router, and a command with verbose output fails over SSH through a floating IP, even though the SSH connection works...).

This could be corrected by customizing the physical device MTU (by
increasing it by 50 octets) or by decreasing the MTU of interfaces
created by Neutron agents with flag 'network_device_mtu'.


Tom Fifield (fifieldt)
Changed in openstack-manuals:
status: Incomplete → Triaged
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-manuals (master)

Fix proposed to branch: master
Review: https://review.openstack.org/63312

Changed in openstack-manuals:
assignee: nobody → Tom Fifield (fifieldt)
status: Triaged → In Progress
Changed in openstack-manuals:
assignee: Tom Fifield (fifieldt) → Diane Fleming (diane-fleming)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-manuals (master)

Reviewed: https://review.openstack.org/63312
Committed: https://git.openstack.org/cgit/openstack/openstack-manuals/commit/?id=5d59b6d7fbe828d38627c89ee353c4ad305625e7
Submitter: Jenkins
Branch: master

commit 5d59b6d7fbe828d38627c89ee353c4ad305625e7
Author: Tom Fifield <email address hidden>
Date: Fri Dec 20 12:56:48 2013 +0800

    Add note about Linux Bridge MTU Bug w/ VXLAN tun

    There's a horrible hard-to-diagnose bug when using vxlan tunelling.

    This patch adds a note including the simple workaround.

    Change-Id: I20a8ec46a91c7ffe1533338de5968d1b83badb6f
    Closes-Bug: 1242534

Changed in openstack-manuals:
status: In Progress → Fix Released
Revision history for this message
Ian Wells (ijw-ubuntu) wrote :

Note that (pedantically) network_device_mtu wants to be 1 encap less than the *minimum* MTU in your cluster, because that is the *maximum* MTU of any packet you can guarantee to be able to forward. That's not necessarily the same as 1 encap less than the local MTU; it may be smaller.
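A quick sketch of that rule, with made-up host names and MTUs, and assuming the 50-octet VXLAN encap from the bug description:

```python
VXLAN_OVERHEAD = 50  # octets, one VXLAN encap as described in this report


def safe_network_device_mtu(host_mtus, encap_overhead=VXLAN_OVERHEAD):
    """One encap less than the smallest physical MTU anywhere in the cluster."""
    mtus = list(host_mtus)
    if not mtus:
        raise ValueError("at least one host MTU is required")
    return min(mtus) - encap_overhead


# Hypothetical cluster: the local host has jumbo frames, another host does not.
cluster = {"compute-1": 9000, "compute-2": 1500, "network-1": 1600}
local_only = cluster["compute-1"] - VXLAN_OVERHEAD
cluster_wide = safe_network_device_mtu(cluster.values())
print(local_only, cluster_wide)
```

Here the local calculation gives 8950 but the only value that is safe everywhere is 1450, because compute-2 caps what can be forwarded.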

Encaps are weird.

Revision history for this message
Ian Wells (ijw-ubuntu) wrote :

Also, you refer to a bug in your documentation, but there's not actually a bug as such. The network MTU is not defined (which could tenuously be called a bug; it should also be in the DHCP response automatically; but really, if you want it defined, it's a feature enhancement), and it's not what you might reasonably expect it to be with a conventional setup; that is, for a 1500 infrastructure MTU you don't get a 1550 MTU on virtual networks. It's surprising, but on the technicality that we make no promises, it's not actually wrong.

Changed in openstack-manuals:
assignee: Diane Fleming (diane-fleming) → xiaosheng meng (xiaosheng-meng)
assignee: xiaosheng meng (xiaosheng-meng) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-manuals 15.0.0

This issue was fixed in the openstack/openstack-manuals 15.0.0 release.
