dual port SRIOV NIC with 64 VFs per PF is not configured with switchdev eswitch mode
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
netplan.io (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
I am trying to deploy Charmed OpenStack (Yoga) on the Jammy series with OVN hardware offload.
# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_
DISTRIB_
DISTRIB_
# uname -a
Linux node3 5.15.0-41-generic #44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/openstack-
OPENSTACK_
As part of the charms bundle the following config is used:
ovn-chassis:
  charm: ch:ovn-chassis
  # Please update the `bridge-
  # hardware used in your deployment. See the referenced documentation at the
  # top of this file.
  options:
    ovn-
    bridge-
    enable-
    sriov-numvfs: "ens1f0:64 ens1f1:64"
  channel: 22.03/stable
  bindings:
    "": *internal-space
    data: *overlay-space
This is translated to the following netplan file on the deployed node:
# cat /etc/netplan/
#######
# [ WARNING ]
# Configuration file maintained by Juju. Local changes may be overwritten.
# Config managed by ovn-chassis charm
#######
network:
  version: 2
  ethernets:
    ens1f0:
      virtual-
      embedded-
      delay-
    ens1f1:
      virtual-
      embedded-
      delay-
After reboot of the deployed servers, the SRIOV VFs are enabled on the NVIDIA NIC, however the embedded-
# lspci | grep Virtual | wc -l
129
# devlink dev eswitch show pci/0000:08:00.0
pci/0000:08:00.0: mode legacy inline-mode none encap-mode basic
NOTE: When using 50 VFs or below, the switchdev configuration is successful.
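For anyone reproducing this across several nodes, the eswitch mode reported by `devlink dev eswitch show` can be checked with a small script. The following is only a sketch: the helper names `parse_eswitch_show` and `is_switchdev` are made up for illustration, and it assumes the output keeps the flat "key value" layout shown above.

```python
def parse_eswitch_show(line: str) -> dict:
    """Parse one line of `devlink dev eswitch show` output into a dict.

    Assumes the layout seen in this report:
    "<device>: mode <m> inline-mode <i> encap-mode <e>"
    """
    # Split at the first ": " so the colons inside the PCI address are kept.
    device, _, rest = line.strip().partition(": ")
    tokens = rest.split()
    # The remainder is a flat sequence of key/value pairs.
    fields = dict(zip(tokens[0::2], tokens[1::2]))
    fields["device"] = device
    return fields


def is_switchdev(line: str) -> bool:
    """Return True only if the reported eswitch mode is switchdev."""
    return parse_eswitch_show(line).get("mode") == "switchdev"


# Sample taken from the devlink output in this report.
sample = "pci/0000:08:00.0: mode legacy inline-mode none encap-mode basic"
print(parse_eswitch_show(sample)["mode"])
print(is_switchdev(sample))
```

With the output from this report, the check reports that the device is still in legacy mode.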
Syslog (with udev debug):
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Parsed configuration file /run/systemd/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:19 node4 systemd-udevd[712]: Reading rules file: /run/udev/
Jul 14 14:24:55 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) 'Error: mlx5_core: Failed setting eswitch to offloads.'
Jul 14 14:24:55 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) 'kernel answers: Invalid argument'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) 'Traceback (most recent call last):'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/sbin/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' netplan.main()'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' self.run_command()'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' self.func()'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' self.run_command()'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' self.func()'
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' NetplanApply.
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' apply_sriov_
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' pcidev.
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' subprocess.
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/lib/
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) ' raise CalledProcessEr
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: '/usr/sbin/netplan apply --sriov-only'(err) 'subprocess.
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: Process '/usr/sbin/netplan apply --sriov-only' failed with exit code 1.
Jul 14 14:24:56 node4 systemd-udevd[753]: ens1f1np1: Command "/usr/sbin/netplan apply --sriov-only" returned 1 (error), ignoring.
Jul 14 14:24:56 node4 systemd-udevd[763]: ens1f1: Config file /run/systemd/
Jul 14 14:24:56 node4 systemd-
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) 'Error: mlx5_core: Failed setting eswitch to offloads.'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) 'kernel answers: Invalid argument'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) 'Traceback (most recent call last):'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/sbin/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' netplan.main()'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' self.run_command()'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' self.func()'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' self.run_command()'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' self.func()'
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' NetplanApply.
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' apply_sriov_
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' pcidev.
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/share/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' subprocess.
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' File "/usr/lib/
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) ' raise CalledProcessEr
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: '/usr/sbin/netplan apply --sriov-only'(err) 'subprocess.
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: Process '/usr/sbin/netplan apply --sriov-only' failed with exit code 1.
Jul 14 14:25:00 node4 systemd-udevd[754]: ens1f0np0: Command "/usr/sbin/netplan apply --sriov-only" returned 1 (error), ignoring.
Jul 14 14:25:00 node4 systemd-udevd[763]: ens1f0: Config file /run/systemd/
Jul 14 14:25:00 node4 systemd-
Jul 14 14:25:22 node4 netplan[3268]: 0000:08:00.0: bound 64 VFs
Jul 14 14:25:22 node4 netplan[3268]: 0000:08:00.1: bound 0 VFs
Jul 14 14:25:22 node4 systemd[1]: netplan-
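The last netplan lines in the log show 64 VFs bound on one PF and 0 on the other. A rough way to spot this kind of mismatch in a longer syslog is sketched below. The regular expression is an assumption based only on the two "bound N VFs" lines in this report, and the function names are hypothetical.

```python
import re

# Matches lines such as:
#   "... netplan[3268]: 0000:08:00.0: bound 64 VFs"
# The pattern is inferred from the log in this report, not from netplan's source.
BOUND_RE = re.compile(
    r"netplan\[\d+\]: (?P<pci>[0-9a-fA-F:.]+): bound (?P<count>\d+) VFs"
)


def bound_counts(log_lines):
    """Map each PF PCI address to the number of VFs netplan reported as bound."""
    counts = {}
    for line in log_lines:
        match = BOUND_RE.search(line)
        if match:
            counts[match.group("pci")] = int(match.group("count"))
    return counts


def under_bound(log_lines, expected):
    """Return the PFs that bound fewer VFs than `expected`."""
    return {pci: n for pci, n in bound_counts(log_lines).items() if n < expected}


# Lines copied from the syslog in this report.
log = [
    "Jul 14 14:25:22 node4 netplan[3268]: 0000:08:00.0: bound 64 VFs",
    "Jul 14 14:25:22 node4 netplan[3268]: 0000:08:00.1: bound 0 VFs",
]
print(under_bound(log, expected=64))
```

Here `expected=64` comes from the `sriov-numvfs` value in the bundle; only the second PF is flagged.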
I suspect that not all of the VFs had finished unbinding before the devlink switchdev command was attempted.
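To illustrate the suspicion (this is not netplan's actual code), one could poll sysfs until no VF of a PF has a driver bound, and only then attempt the mode change. The sketch relies on the standard sysfs layout, where a PF directory contains `virtfnN` links to its VFs and a bound device has a `driver` entry. The names `vfs_still_bound` and `wait_for_unbind`, the timeout values, and the `sysfs_root` parameter are all hypothetical.

```python
import os
import time


def vfs_still_bound(pf_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the virtfn entries of a PF that still have a driver bound."""
    pf_dir = os.path.join(sysfs_root, pf_addr)
    still_bound = []
    for name in sorted(os.listdir(pf_dir)):
        # Each virtfnN entry points at a VF; a "driver" entry means it is bound.
        if name.startswith("virtfn") and os.path.exists(
            os.path.join(pf_dir, name, "driver")
        ):
            still_bound.append(name)
    return still_bound


def wait_for_unbind(pf_addr, timeout=30.0, interval=0.5,
                    sysfs_root="/sys/bus/pci/devices"):
    """Poll until no VF of the PF is bound; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if not vfs_still_bound(pf_addr, sysfs_root):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage on an affected node (PCI address taken from this report):
#   if wait_for_unbind("0000:08:00.0"):
#       ... then run: devlink dev eswitch set pci/0000:08:00.0 mode switchdev
```

If the mode change succeeds after such a wait with 64 VFs per PF, that would support the theory that the switch is racing with the unbind.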