Support of fragmentation - ovn_emit_need_to_frag

Bug #1947391 reported by Nobuto Murata
This bug affects 2 people

Affects | Status | Importance | Assigned to | Milestone
NULL Project | Invalid | Undecided | Unassigned |
OpenStack Neutron API OVN Plugin Charm | In Progress | Undecided | Yamen Hatahet |

Bug Description

As described in the following links, OVN now seems to have fragmentation support.

https://bugs.launchpad.net/networking-ovn/+bug/1838405

It's disabled by default. It would be nice if the charms could make an intelligent decision to enable it when the conditions are met, or at least expose a config knob.

https://opendev.org/openstack/networking-ovn/commit/d7e950002d7bcf1bbebab5cd289b59d65366492e
> it requires a specific kernel
> version for it to work (upstream >= 5.2).
...
> At earlier versions of this patch we thought about having networking-ovn
> to discover if this option is supported or not (by issuing that
> ovs-appctl command above) but, since networking-ovn doesn't run in the
> network nodes we thought that an upper level tool (such as puppet-ovn or
> TripleO) could do a better job at checking for it and configuring things
> accordingly.

Nobuto Murata (nobuto)
affects: charm-ovn-chassis → charm-neutron-api-plugin-ovn
Andre Ruiz (andre-ruiz) wrote :

This option must go in "ovn" section at the neutron-api ml2_conf.ini file. But it seems reasonable that it is "injected" from the neutron-api-plugin-ovn charm.

Andre Ruiz (andre-ruiz) wrote :

Ok, I'm testing this. I have a VM on an internal network with MTU 9000 (on a geneve overlay network which can carry packets of up to 9058 bytes). I also have a provider network with MTU 1500. I have a router plugged into both networks, using the latter as the external (and FIP) network.

When I give both networks the same MTU (both 1500 or both 9000) everything works great: I can route traffic, get FIPs, etc. When I give them different MTUs as described above, packets larger than 1500 bytes are dropped at the router (in both directions).

I expected the router to either 1) fragment or 2) send need-to-frag messages. I'm happy with (2) because all source hosts will have PMTUD enabled and that will suffice. It seems that enabling this option (ovn_emit_need_to_frag) in the Neutron OVN plugin configuration will do that.
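For reference, the two behaviors expected above can be modeled with a small sketch. This is a simplified model of a generic IPv4 router as described in RFC 1191, not OVN's actual implementation; the names `Action` and `router_action` are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    FRAGMENT = "fragment"
    ICMP_FRAG_NEEDED = "icmp destination unreachable, fragmentation needed"


def router_action(packet_len: int, egress_mtu: int, df_set: bool) -> Action:
    """Decide what a generic IPv4 router does with a packet (simplified)."""
    if packet_len <= egress_mtu:
        # The packet fits on the outgoing link, so it is forwarded as-is.
        return Action.FORWARD
    if df_set:
        # Too big and "Don't Fragment" is set: tell the sender to use a smaller size.
        return Action.ICMP_FRAG_NEEDED
    # Too big but fragmentation is allowed.
    return Action.FRAGMENT
```

With PMTUD enabled the sender sets DF, so a 9000-byte packet leaving through a 1500-byte MTU link would fall into the second branch.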

I forked the charm neutron-api-plugin-ovn and hardcoded the option in the options array to be passed to neutron-api over the relation. The neutron-api is receiving this option and adding it correctly to the ml2_conf.ini file, in the [ovn] section as expected. Here is a snippet:

root@juju-235408-4-lxd-16:/etc/neutron/plugins/ml2# cat ml2_conf.ini
[...]

[ovn]
ovn_nb_connection = ssl:100.126.1.25:6641,ssl:100.126.1.96:6641,ssl:100.126.0.241:6641
ovn_nb_private_key = /etc/neutron/plugins/ml2/key_host
ovn_nb_certificate = /etc/neutron/plugins/ml2/cert_host
ovn_nb_ca_cert = /etc/neutron/plugins/ml2/neutron-api-plugin-ovn.crt
ovn_sb_connection = ssl:100.126.1.25:16642,ssl:100.126.1.96:16642,ssl:100.126.0.241:16642
ovn_sb_private_key = /etc/neutron/plugins/ml2/key_host
ovn_sb_certificate = /etc/neutron/plugins/ml2/cert_host
ovn_sb_ca_cert = /etc/neutron/plugins/ml2/neutron-api-plugin-ovn.crt
ovn_l3_scheduler = leastloaded
ovn_metadata_enabled = True
enable_distributed_floating_ip = False
dns_servers = 172.21.12.32,172.21.12.62
dhcp_default_lease_time = 43200
ovn_dhcp4_global_options =
ovn_dhcp6_global_options =
vhost_sock_dir = /run/libvirt-vhost-user
ovn_emit_need_to_frag = true

(last line is the added option)

Is this sufficient, or should it also be added in other places (like ovn-chassis)?

What I'm seeing is that this option did not change the behavior at all: packets are still dropped at the router and I get no ICMP need-to-frag back from it.

Note: I redeployed the whole cloud to make sure the option was in place, using a forked charm (instead of adding it to an existing cloud).
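A quick way to confirm that the option is really present in the rendered file is to parse it. The sketch below uses Python's standard `configparser` on an abbreviated copy of the snippet above; the helper name `need_to_frag_enabled` is hypothetical, and it only checks the file contents, not what neutron-server has actually loaded.

```python
import configparser

# Abbreviated sample based on the [ovn] section shown above.
SAMPLE = """\
[ovn]
ovn_l3_scheduler = leastloaded
ovn_metadata_enabled = True
ovn_dhcp4_global_options =
ovn_emit_need_to_frag = true
"""


def need_to_frag_enabled(ini_text: str) -> bool:
    """Return True if [ovn] ovn_emit_need_to_frag is set to a true value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # The option is disabled by default, so a missing key is treated as False.
    return parser.getboolean("ovn", "ovn_emit_need_to_frag", fallback=False)
```

On a real unit the text would come from reading /etc/neutron/plugins/ml2/ml2_conf.ini.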

Andre Ruiz (andre-ruiz) wrote :

Adding neutron to the bug because of last comment. I can open separate issues if anyone thinks it's more appropriate.

Andre Ruiz (andre-ruiz) wrote :

Just for clarification, in my tests I am:

- deploying openstack wallaby on ubuntu focal
- using OVN 20.12 (the feature was introduced in 20.03)
- using kernel 5.4 (focal GA; the feature needs at least 5.2)
- have checked that "Check pkt length action" is "yes"
- using the latest charms as of today for ovn-central / ovn-chassis / neutron / neutron-api / neutron-api-plugin-ovn (although I forked the latest version to hardcode the option)

I have been doing many different tests, but the most obvious one is to send a big packet from the VM to the outside, see it dropped at the router, and see no ICMP come back.
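The datapath check in the list above could be automated roughly as follows. This is a sketch that assumes the feature list is plain "name: value" text, as printed by a command such as `ovs-appctl dpif/show-dp-features br-int`; the sample text and the function name are illustrative, not captured output.

```python
# Hypothetical sample of the feature listing; real output has more lines.
SAMPLE_FEATURES = """\
Masked set action: Yes
Check pkt length action: Yes
"""


def check_pkt_len_supported(features_text: str) -> bool:
    """Return True if the datapath reports support for the check_pkt_len action."""
    for line in features_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "check pkt length action":
            return value.strip().lower() == "yes"
    # Feature not listed at all: treat as unsupported.
    return False
```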

Andre Ruiz (andre-ruiz) wrote :

https://pastebin.canonical.com/p/r4BVn2KRg6/

Some evidence that gateway_mtu is being set.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

The config option "ovn_emit_need_to_frag" is already in Neutron [1]. Neutron does not configure OVN to enable this feature but is aware of it. What else is needed in this project?

Regards.

[1]https://github.com/openstack/neutron/blob/4952baaf6fe090ed302cd266a8231c8c377b21d7/neutron/conf/plugins/ml2/drivers/ovn/ovn_conf.py#L191-L199

Andre Ruiz (andre-ruiz) wrote :

@rodolfo

Can you clarify what you mean by "Neutron does not configure OVN"?

The "neutron" component of this bug is because I am adding the option to neutron config and still not getting the expected behavior.

The other component (the charm one) is about propagating this to charm options, which will be done later (not important yet).

Frode Nordahl (fnordahl) wrote :

@Andre as you show in comment #5 the gateway_mtu is set on the LRP, this is the part that Neutron does.

So do you still think there is a bug in Neutron here?

Andre Ruiz (andre-ruiz) wrote :

I'm really not sure, and I was hoping that I would get some suggestions on how to better test it. If that is all that neutron does, then ok, neutron does not seem to have a problem.

I'm removing neutron (marking invalid) from the bug, sorry for the noise.

Changed in neutron:
status: New → Invalid
Frode Nordahl (fnordahl) wrote :

I wonder if the charm should just enable this option by default; sending these messages is expected behavior for a router.

Does anyone see scenarios where one would not want the router to send ICMP need to frag messages?

Changed in charm-neutron-api-plugin-ovn:
status: New → Confirmed
no longer affects: charm-layer-ovn
Nobuto Murata (nobuto) wrote :

When filing this feature request, I thought newer releases of the OS or cloud-archive were required to support ovn_emit_need_to_frag. But it looks like focal-ussuri already satisfies the conditions defined upstream, so +1 to enabling it whenever supported.

Nobuto Murata (nobuto)
tags: added: good-first-bug
affects: neutron → null-and-void
Trent Lloyd (lathiat) wrote :

The only reason not to do this is if the kernel is <5.2. It should be enabled by default in all other cases. This only matters on Bionic, which may have such a kernel. Focal-Ussuri onward should always support this, as 5.4 is the default kernel there.

So +1 to enabling it by default on Ussuri, except that it would need some logic to detect bionic-ussuri installs and not enable it there unless the kernel is new enough.
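The kernel check suggested above could look roughly like this. It is a minimal sketch, not the charm's code: the function names are made up, and only the 5.2 threshold comes from the upstream commit message quoted in the bug description.

```python
import re
from typing import Tuple

# Minimum upstream kernel version named in the networking-ovn commit message.
MIN_KERNEL = (5, 2)


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.4.0-89-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports_need_to_frag(release: str) -> bool:
    """Return True if the kernel is at least 5.2."""
    return kernel_version(release) >= MIN_KERNEL
```

On a running host the release string could be taken from `platform.release()`. Note that the check would have to run where the datapath lives (the chassis or gateway nodes), which may not be where neutron-api runs.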

Trent Lloyd (lathiat) wrote :

While testing this on Ussuri, enabling the option and restarting neutron-server was not sufficient to get the 'gateway_mtu' option added in 'ovn-nbctl list Logical_Router_port' for already deployed networks.

I also had to set the MTU on the network again to trigger an update:
openstack network set ext_net --mtu 1500
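To verify afterwards that the update took effect, the options column of the router port can be inspected for gateway_mtu. The sketch below parses one such line; the sample string is an assumption about how the options map is printed, not output captured from this deployment, and the function name is hypothetical.

```python
import re
from typing import Optional

# Hypothetical example of an options line for a logical router port.
SAMPLE_OPTIONS = 'options             : {gateway_mtu="1500"}'


def gateway_mtu(options_field: str) -> Optional[int]:
    """Return the gateway_mtu value from an options map, or None if absent."""
    # Quotes around the value are treated as optional.
    match = re.search(r'gateway_mtu="?(\d+)"?', options_field)
    return int(match.group(1)) if match else None
```

A result of None for a gateway port would suggest the network still needs the MTU update described above.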


Yamen Hatahet (yhatahet)
Changed in charm-neutron-api-plugin-ovn:
assignee: nobody → Yamen Hatahet (yhatahet)
Changed in charm-neutron-api-plugin-ovn:
status: Confirmed → In Progress
Nobuto Murata (nobuto) wrote :
DUFOUR Olivier (odufourc) wrote :

Using a testing environment and the patch made by Yamen, I was able to test and validate the behavior of OVN when ovn_emit_need_to_frag is enabled.

Some additional details: with the current implementation, OVN will only emit a "Need to frag" packet in the internal-to-external direction.
The motivation was the case where jumbo frames are enabled: the internal network could have a higher MTU, around 8900, while some external networks could be stuck at the more classical MTU of 1500.
It will not work between internal networks with different MTUs, nor from external to internal networks.

With the ongoing patch, it will most likely work on any new installation. But when upgrading the charm, it will most likely be necessary to manually restart neutron-server.service on all neutron-api units for the changes to take effect.

The details about the testing environment :
* Internal network: 192.168.123.0/24 with MTU 1442
* External network: 192.168.21.0/24 with MTU 900
* self-inst1 is hosted solely on the internal network but with a floating IP to the external network
* a router is configured with the external network as default gateway

Below are the tests after applying the charm with the fix from Yamen. (It wouldn't work at all otherwise on current versions of the charm)
#
# Part 1 ping under the MTU limit
#
ubuntu@self-inst1:~$ ping 192.168.21.2 -c 2 -M do -s 800
PING 192.168.21.2 (192.168.21.2) 800(828) bytes of data.
808 bytes from 192.168.21.2: icmp_seq=1 ttl=63 time=3.26 ms
808 bytes from 192.168.21.2: icmp_seq=2 ttl=63 time=1.45 ms

#
# Part 2 ping over the MTU limit
#
ubuntu@self-inst1:~$ ping 192.168.21.2 -c 2 -M do -s 1400
PING 192.168.21.2 (192.168.21.2) 1400(1428) bytes of data.
From 192.168.123.1 icmp_seq=1 Frag needed and DF set (mtu = 900)
ping: local error: message too long, mtu=900
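The sizes in these two pings follow from the header overheads, which the sketch below spells out. It assumes an IPv4 header without options and an untagged Ethernet frame; the function names are illustrative only.

```python
ICMP_HEADER = 8       # ICMP echo header, bytes
IPV4_HEADER = 20      # IPv4 header without options, bytes
ETHERNET_HEADER = 14  # untagged Ethernet header, bytes

EXTERNAL_MTU = 900    # MTU of the external network in this test


def ip_packet_len(ping_payload: int) -> int:
    """IP packet size produced by 'ping -s <ping_payload>'."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def frame_len(ping_payload: int) -> int:
    """On-wire frame length as reported by tcpdump -e."""
    return ip_packet_len(ping_payload) + ETHERNET_HEADER


def fits(ping_payload: int, mtu: int = EXTERNAL_MTU) -> bool:
    """True if the resulting IP packet fits within the given MTU."""
    return ip_packet_len(ping_payload) <= mtu
```

A payload of 800 gives an 828-byte IP packet, which fits in the 900-byte external MTU; a payload of 1400 gives 1428 bytes, which does not, so the router answers with "Frag needed". The 842-byte frame length in the capture from the same test is 828 plus the Ethernet header.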

ubuntu@self-inst1:~$ sudo tcpdump -lnei ens2 icmp
sudo: unable to resolve host self-inst1: Temporary failure in name resolution
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens2, link-type EN10MB (Ethernet), snapshot length 262144 bytes
#
# Part 1 under the MTU
#
07:00:08.653716 fa:16:3e:11:ea:c9 > fa:16:3e:a1:5e:ee, ethertype IPv4 (0x0800), length 842: 192.168.123.168 > 192.168.21.2: ICMP echo request, id 1, seq 1, length 808
07:00:08.656941 fa:16:3e:a1:5e:ee > fa:16:3e:11:ea:c9, ethertype IPv4 (0x0800), length 842: 192.168.21.2 > 192.168.123.168: ICMP echo reply, id 1, seq 1, length 808
07:00:09.655592 fa:16:3e:11:ea:c9 > fa:16:3e:a1:5e:ee, ethertype IPv4 (0x0800), length 842: 192.168.123.168 > 192.168.21.2: ICMP echo request, id 1, seq 2, length 808
07:00:09.657005 fa:16:3e:a1:5e:ee > fa:16:3e:11:ea:c9, ethertype IPv4 (0x0800), length 842: 192.168.21.2 > 192.168.123.168: ICMP echo reply, id 1, seq 2, length 808
#
# Part 2 requesting a packet over the MTU
#
07:10:27.676260 fa:16:3e:a1:5e:ee > fa:16:3e:11:ea...
