[OVN][HWOL] traffic problems when sriov and non-sriov ports are bound on the same hypervisor

Bug #2020168 reported by Michal Nasiadka
This bug affects 3 people
Affects: neutron
Status: New
Importance: Medium
Assigned to: Rodolfo Alonso

Bug Description

Environment:
OpenStack Yoga
Mellanox ConnectX-6 cards
OpenvSwitch 2.17
OVN 22.09
ML2/OVN driver

I have two instances on one hypervisor, both on the same VLAN-type (tagged) network:
VM1 has a Mellanox ASAP2 SR-IOV port with binding profile "switchdev" (10.1.112.89)
VM2 has a normal (non-SR-IOV) port (10.1.112.15)

In that VLAN we have an external router (10.1.112.254)

When both VMs are up and I ping the external router from VM1, I get replies to the first one or two packets and then nothing (the same happens with TCP traffic).

What is interesting - if I send the ICMP packets from VM1 to the gateway:
1. I can see ICMP echo request and reply packets on external OVS port (bond0):
# tcpdump -nei bond0 vlan 112 and icmp
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:23:14.722184 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.89 > 10.1.112.254: ICMP echo request, id 2, seq 1, length 64
07:23:14.722395 1c:34:da:b0:97:68 > fa:16:3e:34:dd:93, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.254 > 10.1.112.89: ICMP echo reply, id 2, seq 1, length 64
07:23:15.723068 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.89 > 10.1.112.254: ICMP echo request, id 2, seq 2, length 64
(and then it stops)

2. I can see the ICMP echo requests on VM2 port (but no replies).
# tcpdump -nei tap53e35d44-27 icmp
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap53e35d44-27, link-type EN10MB (Ethernet), capture size 262144 bytes
07:18:10.991163 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 1, length 64
07:18:11.992577 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 2, length 64
07:18:12.993063 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 3, length 64
07:18:14.018573 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 4, length 64
07:18:15.043013 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 5, length 64
07:18:16.066584 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 6, length 64
07:18:17.090599 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 7, length 64
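
To compare captures like the two above, a small script can pair echo requests with echo replies by ICMP id and seq and list the requests that never got an answer. This is only a minimal sketch: it assumes the `tcpdump -ne` line format pasted in this report, and the function name is hypothetical.

```python
import re

# Matches the ICMP summary of a `tcpdump -ne` line as pasted in this report.
ICMP_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(lines):
    """Return sorted (id, seq) pairs of echo requests with no matching reply."""
    requests = set()
    replies = set()
    for line in lines:
        m = ICMP_RE.search(line)
        if m is None:
            continue  # banner lines, non-ICMP traffic, truncated lines
        key = (int(m["id"]), int(m["seq"]))
        if m["kind"] == "request":
            requests.add((m["src"], m["dst"], *key))
        else:
            # A reply answers the request that has the addresses swapped.
            replies.add((m["dst"], m["src"], *key))
    return sorted((i, s) for (_, _, i, s) in requests - replies)
```

Run against the bond0 capture, it should report only seq 2 of id 2 as unanswered; against the tap53e35d44-27 capture, every request is listed because no replies were seen there.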

Is this a Neutron bug or rather an OVN/OpenvSwitch bug?

tags: added: ovn sriov-pci-pt
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Michal:

If I'm not wrong, what you are using is ML2/OVN with HW offload, right? I'm asking in order to make this distinction clear. For the sake of clarity, it is better if we remove the SRIOV tag from the title and add HWOL (just to avoid confusing it with ML2/SRIOV, which could also be used alongside ML2/OVN).

I have some questions:
* Are you using FIPs?
* Did you try pinging another IP on the external network?
* In your deployment, do you have [1]?
* Related to the last point, what is the router configuration? Network attached, type, etc.
* Do you have HA? How many controllers do you have? I'm assuming the GW is in one of these controllers.
* Did you do a full trace of the ICMP packets? I mean, tracking the packet from the VM, through the compute node interface, the switch, the controller HW interface, the controller GW port, etc.

Regards.

[1]https://review.opendev.org/q/I25e5ee2cf8daee52221a640faa7ac09679742707

Michal Nasiadka (mnasiadka) wrote :

Yes, ML2/OVN with HW offload.

I have some questions:
* Are you using FIPs?

No

* Did you try pinging another IP on the external network?

Yes, other IPs work (the destination IP is a VRR on Cumulus Linux switch)

* In your deployment, do you have [1]?

Yes, I do. I ran the OVN sync util after updating to a version containing that change, before raising the bug.

* Related to the last point, what is the router configuration? Network attached, type, etc.

The router is outside of OpenStack/Neutron (VRR on Cumulus Linux switch)

* Do you have HA? How many controllers do you have? I'm assuming the GW is in one of these controllers.

3 controllers, but the GW is not on them

* Did you do a full trace of the ICMP packets? I mean, tracking the packet from the VM, through the compute node interface, the switch, the controller HW interface, the controller GW port, etc.

Yes, I did tracing on:
1) ens1f0_6 (HW offload instance port)

(first one/two replies, then nothing)
listening on ens1f0_6, link-type EN10MB (Ethernet), capture size 262144 bytes
11:09:02.305764 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 1, length 64
11:09:02.307805 1c:34:da:b0:97:68 > fa:16:3e:34:dd:93, ethertype IPv4 (0x0800), length 98: 10.1.112.254 > 10.1.112.89: ICMP echo reply, id 14, seq 1, length 64
11:09:03.307190 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 2, length 64
11:09:03.307912 1c:34:da:b0:97:68 > fa:16:3e:34:dd:93, ethertype IPv4 (0x0800), length 98: 10.1.112.254 > 10.1.112.89: ICMP echo reply, id 14, seq 2, length 64
11:09:04.308288 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 3, length 64
11:09:05.313646 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 4, length 64
11:09:06.337654 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 5, length 64
11:09:07.361663 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 6, length 64
11:09:08.385652 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 7, length 64
11:09:09.409641 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 8, length 64
11:09:10.433645 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 9, length 64
11:09:10.786348 1c:34:da:b0:97:68 > fa:16:3e:34:dd:93, ethertype IPv4 (0x0800), length 62: 10.1.112.252 > 10.1.112.89: ICMP echo reply, id 1540, seq 1, length 28
11:09:11.457645 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 14, seq 10, length 64

2) tap53e35d44-27
(request packets arriving)
11:08:2...
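
One detail visible in the capture from the HW offload port: the requests are addressed to 00:00:5e:00:01:12, while the replies arrive from 1c:34:da:b0:97:68. The 00:00:5e:00:01:xx prefix is the range reserved for VRRP IPv4 virtual router MACs (RFC 5798), which is consistent with the VRR gateway mentioned above. Whether this asymmetry is related to the drops is not established in this report. The sketch below only shows one way to flag it in a capture, again assuming the `tcpdump -ne` line format; the helper names are hypothetical.

```python
import re

# Ethernet header plus ICMP summary of a `tcpdump -ne` line.
LINE_RE = re.compile(
    r"(?P<smac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) > "
    r"(?P<dmac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}),.*?"
    r"(?P<sip>\d+\.\d+\.\d+\.\d+) > (?P<dip>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply)"
)

VRRP_IPV4_PREFIX = "00:00:5e:00:01:"  # RFC 5798 virtual router MAC range


def is_vrrp_virtual_mac(mac):
    """True if the MAC falls in the VRRP IPv4 virtual router range."""
    return mac.lower().startswith(VRRP_IPV4_PREFIX)


def asymmetric_gateways(lines):
    """Map IP -> (MAC requests were sent to, MAC replies came from)
    for every pinged IP where the two MACs differ."""
    sent_to = {}
    replied_from = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # banner or non-ICMP line
        if m["kind"] == "request":
            sent_to[m["dip"]] = m["dmac"]
        else:
            replied_from[m["sip"]] = m["smac"]
    return {
        ip: (sent_to[ip], replied_from[ip])
        for ip in sent_to.keys() & replied_from.keys()
        if sent_to[ip] != replied_from[ip]
    }
```

For the capture above this flags 10.1.112.254; the stray reply from 10.1.112.252 is ignored because no request to that address appears in the capture.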


summary: - [OVN][SRIOV] traffic problems when sriov and non-sriov ports are bound
- on the same hypervisor
+ [OVN][HWOL] traffic problems when sriov and non-sriov ports are bound on
+ the same hypervisor
Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Medium
Michal Nasiadka (mnasiadka) wrote :

According to imaximets in #openvswitch, this is more likely a kernel conntrack (CT) bug or a VF bug, so I'll try updating the NIC firmware, the kernel and the NVIDIA OFED drivers, and then retest.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Ok, I'll wait for your update.

In this case, reading c#2, the Neutron network configuration seems trivial: the VM port is sending the traffic to the VLAN network; this traffic is egressing the host via the HWOL NIC. As you mentioned, we are not considering any Neutron GW, router or any other network device.
