OVN provider network type vlan packets cannot go outside the bond on Intel E810-XXV card
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Confirmed | Undecided | Unassigned | | |
Bug Description
Ubuntu 20.04.5 LTS
ubuntu@
Linux compute-09 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@
ubuntu@
31:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
31:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
ca:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
ca:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
The test instance, which has provider network floating IP 10.99.0.213, cannot reach the provider network gateway (10.99.0.254)
openstack server create --key-name ubuntu-keypair --image auto-sync/
ubuntu@
ubuntu@
PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
^C
--- 10.99.0.254 ping statistics ---
419 packets transmitted, 0 received, 100% packet loss, time 428035ms
I located the compute node through which the outbound traffic leaves,
and there I see ARP requests with no response
compute-09:~$ sudo tcpdump -vteni bond1 '(vlan 300)'
tcpdump: listening on bond1, link-type EN10MB (Ethernet), capture size 262144 bytes
fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
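The capture above shows only ARP requests. A minimal sketch for tallying requests against replies in tcpdump's text output follows; the function name `arp_summary` is hypothetical, and the match strings assume tcpdump's usual "Request who-has" / "Reply ... is-at" wording.

```shell
# Count ARP requests and replies in tcpdump text output read from stdin.
# Assumes tcpdump's usual wording: "Request who-has ..." and "Reply ... is-at ...".
arp_summary() {
    awk '
        /Request who-has/ { req++ }
        /Reply .* is-at/  { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

One possible use is `sudo tcpdump -l -vteni bond1 -c 20 '(vlan 300)' | arp_summary`, where `-c 20` makes tcpdump exit after 20 packets so the totals get printed. A result with replies=0 matches the symptom described here.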
To reproduce, you can ping .254 indefinitely
The TX error count grows on bond1 and on ens2f0 (the bond member that happens to carry the traffic)
ubuntu@
tx_errors: 12
tx_errors.nic: 0
rx_
rx_
ubuntu@
ens2f0: flags=6211<
ether b4:83:51:00:83:d1 txqueuelen 1000 (Ethernet)
RX packets 53784 bytes 22064970 (22.0 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 52163 bytes 18393142 (18.3 MB)
TX errors 12 dropped 0 overruns 0 carrier 0 collisions 0
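To confirm that the counter is actually growing while the ping runs, the kernel's per-interface statistics in sysfs can be sampled twice. This is only a sketch: the function names are hypothetical, and the optional sysfs-root argument exists purely so the helper can be pointed at a test directory.

```shell
# Print the tx_errors counter for an interface.
# $1 = interface name; $2 = sysfs root (optional, defaults to /sys/class/net).
tx_errors() {
    cat "${2:-/sys/class/net}/$1/statistics/tx_errors"
}

# Print how much tx_errors grew over an interval.
# $1 = interface, $2 = seconds to wait, $3 = sysfs root (optional).
tx_errors_delta() {
    before=$(tx_errors "$1" "$3")
    sleep "$2"
    after=$(tx_errors "$1" "$3")
    echo $((after - before))
}
```

For example, `tx_errors_delta ens2f0 10` prints the number of new TX errors on ens2f0 over ten seconds; a non-zero value while the instance pings the gateway would be consistent with the counters shown above.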
If I create a VLAN interface directly on bond1, I can ping the gateway with no problem,
which gives us
WORKAROUND 1: set the provider network type to flat and carry the traffic over VLAN interfaces on the computes, using those interfaces as the physnet device
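The exact commands used for the VLAN interface test are not given in this report. The sketch below prints one plausible iproute2 sequence for review rather than running it; the function name, the `bond1.300` interface name, and the test address are assumptions, and the address must be an unused one from the provider network.

```shell
# Print (do not run) the commands for a temporary VLAN interface test.
# $1 = parent device, $2 = VLAN ID, $3 = test address in CIDR form, $4 = gateway.
vlan_test_commands() {
    parent=$1; vid=$2; cidr=$3; gateway=$4
    vif="$parent.$vid"
    cat <<EOF
ip link add link $parent name $vif type vlan id $vid
ip addr add $cidr dev $vif
ip link set dev $vif up
ping -c 3 $gateway
ip link del dev $vif
EOF
}
```

Calling it with `bond1`, `300`, a free address, and `10.99.0.254` prints five commands; after checking them, the output can be piped to `sudo sh`. The last command removes the test interface again.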
Another thing I tried was to install the HWE kernel
ubuntu@
Linux compute-09 5.15.0-60-generic #66~20.04.1-Ubuntu SMP Wed Jan 25 09:41:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Fortunately, traffic was still leaving through compute-09 after the reboot, and the new kernel fixed the issue,
so we have WORKAROUND 2: run the HWE kernel
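Since the behaviour differs between the 5.4 and 5.15 kernels, it can help to check which one a compute node is running before testing. A minimal sketch, assuming GNU `sort -V` is available (it is on Ubuntu); the function name `kernel_at_least` is hypothetical. On Focal the HWE kernel is normally installed through the `linux-generic-hwe-20.04` metapackage, though this report does not say how it was installed here.

```shell
# Succeed if the kernel release in $1 is at least the version in $2.
# $1 = release string such as "5.15.0-60-generic", $2 = minimum such as "5.15.0".
kernel_at_least() {
    have=${1%%-*}   # drop the ABI/flavour suffix: 5.15.0-60-generic -> 5.15.0
    want=$2
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

On a node, `kernel_at_least "$(uname -r)" 5.15.0 && echo "HWE-level kernel"` distinguishes the two kernels mentioned in this report.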
ubuntu@
PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
64 bytes from 10.99.0.254: icmp_seq=1 ttl=63 time=2.15 ms
64 bytes from 10.99.0.254: icmp_seq=2 ttl=63 time=0.896 ms
64 bytes from 10.99.0.254: icmp_seq=3 ttl=63 time=1.12 ms
^C
ubuntu@infra-1:~$ ping 10.99.0.213
PING 10.99.0.213 (10.99.0.213) 56(84) bytes of data.
64 bytes from 10.99.0.213: icmp_seq=1 ttl=62 time=5.12 ms
64 bytes from 10.99.0.213: icmp_seq=2 ttl=62 time=2.17 ms
64 bytes from 10.99.0.213: icmp_seq=3 ttl=62 time=0.948 ms
64 bytes from 10.99.0.213: icmp_seq=4 ttl=62 time=1.00 ms
64 bytes from 10.99.0.213: icmp_seq=5 ttl=62 time=0.891 ms
64 bytes from 10.99.0.213: icmp_seq=6 ttl=62 time=1.05 ms
Now I can ping both ways
However, I am afraid we may hit the same issue as on Jammy with these cards at boot, since it occurs randomly on the kernel with the same version number, 5.15.0-60
Here's the bug I am referring to:
https:/
---
ProblemType: Bug
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Feb 27 13:33 seq
crw-rw---- 1 root audio 116, 33 Feb 27 13:33 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CasperMD5CheckR
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 1604:10c0 Tascam
Bus 001 Device 003: ID 1604:10c0 Tascam
Bus 001 Device 002: ID 1604:10c0 Tascam
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Lsusb-t:
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=
|__ Port 14: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 4: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
MachineType: Dell Inc. PowerEdge R650
NonfreeKernelMo
Package: linux (not installed)
PciMultimedia:
ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=
ProcVersionSign
RelatedPackageV
linux-
linux-
linux-firmware 1.187.36
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.4.0-139-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 09/14/2022
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.8.2
dmi.board.name: 0PJ7YJ
dmi.board.vendor: Dell Inc.
dmi.board.version: A01
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.
dmi.product.family: PowerEdge
dmi.product.name: PowerEdge R650
dmi.product.sku: SKU=0912;
dmi.sys.vendor: Dell Inc.
description: updated
Changed in linux (Ubuntu):
status: Incomplete → New
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 2008781
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.