br-int has an unpredictable MTU

Bug #1889454 reported by Mohammed Naser
This bug affects 3 people
Affects: neutron
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

We have an environment where users can plug their VMs into both tenant and provider networks on the hypervisor. This environment does not have jumbo frames. The MTU for VMs plugged directly into provider networks is 1500 (physical network), however it is 1450 for tenant networks (VXLAN).

https://github.com/openstack/neutron/blob/2ac52607c266e593700be0784ebadc77789070ff/neutron/agent/common/ovs_lib.py#L299-L319

The code which creates the br-int bridge does not set an MTU, which means that depending on what gets plugged in first, you can end up with 1500 MTU interfaces connected to br-int, producing messages like this in the system logs:

br-int: dropped over-mtu packet: 1500 > 1458

I'm not sure what the best solution inside Neutron would be. Should we perhaps set br-int to the MTU of the largest physical network attachable on the agent? I'm happy to pick up the work.
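For reference, a quick sketch of how the mismatch shows up on an affected hypervisor (the syslog path is an assumption; the messages may land in dmesg/kern.log instead):

 # what is plugged into br-int, and the MTU its LOCAL port ended up with
 ovs-vsctl list-ports br-int
 ip link show dev br-int | grep -o 'mtu [0-9]*'
 # over-mtu drops logged by Open vSwitch (log path varies by distro)
 grep 'dropped over-mtu packet' /var/log/syslog | tail -n 5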

Revision history for this message
Mohammed Naser (mnaser) wrote :

Interesting, so after further research, it seems that Open vSwitch will set the MTU to the lowest MTU of all the ports attached, which isn't correct in this case.

ovs-vsctl set int br-int mtu_request=1500

Setting this will set the MTU of the bridge appropriately; see http://docs.openvswitch.org/en/latest/faq/issues/ under "Q: How can I configure the bridge internal interface MTU? Why does Open vSwitch keep changing internal ports MTU?"

Should we have Neutron update that setting instead?
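For completeness, a minimal sketch of applying and verifying the workaround (1500 matches this environment's largest physical-network MTU; adjust the value for yours):

 ovs-vsctl set Interface br-int mtu_request=1500
 ovs-vsctl get Interface br-int mtu_request    # should print 1500
 ip link show dev br-int                       # LOCAL port should now report mtu 1500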

Revision history for this message
Darragh O'Reilly (darragh-oreilly) wrote :

When you create a bridge in OVS, it also creates a port+interface with the same name. The br-int interface is not used by Neutron. It is a trunk port and receives BUM traffic on br-int - you can run tcpdump -eni br-int to see that. The "br-int: dropped over-mtu packet" messages can be ignored.
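A minimal sketch of that check (-c 20 only limits the capture; the flags are otherwise as described above):

 # -e prints link-level headers, -n disables name resolution, -i picks the interface
 tcpdump -eni br-int -c 20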

Revision history for this message
Mohammed Naser (mnaser) wrote :

Should the 'dropped' statistics on br-int interfaces be ignored then?

# ip -s link show dev br-int
14: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c6:b4:78:bd:0a:4f brd ff:ff:ff:ff:ff:ff
    RX: bytes packets errors dropped overrun mcast
    0 0 0 633528456 0 0
    TX: bytes packets errors dropped carrier collsns
    0 0 3607 0 0 0

I see this steadily increasing...
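For reference, one way to confirm the counter keeps climbing (the 5-second interval is arbitrary):

 watch -n 5 'ip -s link show dev br-int'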

Revision history for this message
Darragh O'Reilly (darragh-oreilly) wrote :

Yes. But it's high, so you probably have flooding due to some learning not happening. There are open bugs about flooding on br-int.
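If you want to dig into the flooding, a sketch of inspecting MAC learning on br-int; this assumes the bridge still forwards via the OVS NORMAL action, as in the default OVS agent setup:

 ovs-appctl fdb/show br-int              # learned MAC/VLAN entries; unknown MACs get flooded
 ovs-ofctl dump-flows br-int | grep -c NORMAL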

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

As Darragh commented, when a bridge is created, a port with type=internal is created. Since we are not sending or receiving traffic on this port, why do we need to care about those log messages?

Are your VMs dropping traffic because the MTU is too low? Do you have problems setting the Neutron MTU config? Remember there are several parameters to define: global_physnet_mtu, physical_network_mtus, segment_mtu, path_mtu.

https://docs.openstack.org/newton/networking-guide/config-mtu.html
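As an illustration only, a minimal sketch of those options for a 1500-byte physical network with VXLAN tenant networks (the physnet name, file paths, and use of crudini are assumptions for your deployment):

 crudini --set /etc/neutron/neutron.conf DEFAULT global_physnet_mtu 1500
 crudini --set /etc/neutron/plugins/ml2/ml2_conf.ini ml2 physical_network_mtus physnet1:1500
 crudini --set /etc/neutron/plugins/ml2/ml2_conf.ini ml2 path_mtu 1500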

Regards.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I second Rodolfo's question. Is this MTU issue just a log message, a "cosmetic" issue, or do you experience any real traffic drops because of it?
I will mark this LP as incomplete for now. Feel free to switch it back to New when you can provide additional data showing that the issue is really impacting traffic from/to instances.

Changed in neutron:
status: New → Incomplete
Revision history for this message
Trent Lloyd (lathiat) wrote :

This has caused a lot of nuisance reports from users and engineers alike when working on OpenStack systems. I looked into this and found the following:

- This error refers specifically to the MTU being too large when attempting to transmit a packet out of openvswitch onto the linux interface 'br-int', which is a 'LOCAL' port plugged into the br-int switch, as opposed to transmitting a packet into the br-int "switch" or out some other real port.

- This happens generally for both broadcast/multicast packets and when packets are "flooded" because the location of the MAC is unknown (which is a normal occurrence due to l2 switching table expiry). However, the packets are not useful: since the interface is down, they don't go anywhere. Hence the error is harmless and does not represent a real problem. Note that such errors with a different interface name, such as a tap device, physical device, etc., are likely to be real errors and should not be ignored. They can only be ignored for these unused "LOCAL" ports with the same name as the bridge, i.e. br-int.

- The MTU of this br-int 'local' port follows the minimum MTU of all ports plugged into the switch. You can temporarily set it with "ip link set", however the next time a port is added or removed from the switch it will be reset.

- It is expected to have interfaces with different MTUs on br-int when you have networks with different MTUs, and this shows up particularly in the case of DVR, where you have native (1500 byte) MTU external networks and GRE/VXLAN overlay networks with a reduced MTU on the same br-int. But it can also happen without DVR, with tenant networks of different MTUs or when you connect provider networks directly to VMs.

- This "local" br-int port on the switch is not used in OpenStack, and is down, despite that, the current openvswitch kernel code attempts to transmit the packet out this down interface resulting in the MTU error. I have looked at whether it would be a sensible operation to skip this at the openvswitch level or not but have not come to a determination on that.

- You can request openvswitch to set this MTU specifically to some other value, using the following ovs-vsctl command to set the mtu_request parameter. This is a harmless workaround that prevents the error from being logged, and I would recommend people use it. I also think that neutron should make this request itself. Note that 9000 suits most environments, but it needs to be set to the same value as (or at least a higher value than) global_physnet_mtu. This setting is stored in the OVS database and will persist across reboots/restarts.

 ovs-vsctl set Interface br-int mtu_request=9000
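A small sketch for auditing that across the usual agent bridges (the bridge names br-int/br-tun/br-ex are assumptions; adjust for your deployment):

 for br in br-int br-tun br-ex; do
     printf '%s: ' "$br"
     ovs-vsctl get Interface "$br" mtu_request
 done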

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
Revision history for this message
Khadhraoui karam (karam12) wrote :

Hello,

I'm currently working on this bug.

In fact, it may cause a serious problem when receiving packets larger than the MTU: the OVS bridge will drop them. The log message you are seeing indicates a real problem; it means you are receiving packets larger than your bridge MTU and OVS is dropping them.

You can test that on IXIA or with iperf.
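As a quicker check, a sketch of probing the effective path MTU from a VM with don't-fragment pings (<destination> is a placeholder; 1472 = 1500 minus 28 bytes of IP+ICMP headers, 1422 is the 1450-byte overlay equivalent):

 ping -M do -s 1472 -c 3 <destination>   # should pass over a 1500-byte path
 ping -M do -s 1422 -c 3 <destination>   # should pass over a 1450-byte VXLAN path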

This problem isn't reproduced with Linux bridge because, as you have already said, it has no local port.
The virtual port of type internal on the OVS bridge is the main cause of this problem.

So what I'm currently working on is dropping packets larger than the MTU only for ports whose type is not internal.
That is the normal behaviour, because the bridge should behave as a switch in this case.

I will keep you up-to-date.
