[SRU] can't move mellanox interface to switchdev when SR-IOV disable

Bug #2020409 reported by Moshe Levi
This bug affects 2 people
Affects              Status         Importance  Assigned to    Milestone
Netplan              Fix Released   Medium      Unassigned
netplan.io (Ubuntu)  Fix Released   Undecided   Unassigned
  Jammy              Fix Committed  Undecided   Lukas Märdian
  Noble              Fix Released   Undecided   Unassigned

Bug Description

[ Impact ]

Due to limitations in how Netplan handles SR-IOV devices it wasn't possible
to change the embedded switch mode without having to create Virtual Functions.

Setting the e-switch mode should be allowed independently of
the existence of Virtual Functions.
This problem prevents the use of Scalable Functions without SR-IOV.
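
The behaviour change described above can be illustrated with a minimal sketch. This is not Netplan's actual code: both function names are hypothetical, and the "before" condition is only inferred from this report's statement that the e-switch mode could not be set without Virtual Functions.

```python
from typing import Optional


def applies_eswitch_mode_before(mode: Optional[str], vf_count: int) -> bool:
    """Hypothetical pre-fix rule: a mode is only applied when VFs exist."""
    return mode is not None and vf_count > 0


def applies_eswitch_mode_after(mode: Optional[str], vf_count: int) -> bool:
    """Hypothetical post-fix rule: a configured mode is applied regardless of VFs."""
    return mode is not None


# A PF configured with embedded-switch-mode: switchdev and no VFs
# (the Scalable Functions use case from this report).
print(applies_eswitch_mode_before("switchdev", 0))
print(applies_eswitch_mode_after("switchdev", 0))
```

Under these assumptions, only the second rule lets a PF without VFs reach switchdev mode, while a PF with VFs behaves the same under both rules.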

This fix is available on Ubuntu 24.04.

[ Test Plan ]

To reproduce the problem addressed by this SRU one needs to
have access to specialized hardware (SR-IOV-capable NICs).

The fix was tested on real hardware when it was implemented
(see https://github.com/canonical/netplan/pull/454 for details) but still needs to be
tested on Ubuntu 22.04.

We will work with Canonical's OpenStack team to do the fix verification.

 * detailed instructions how to reproduce the bug

A configuration like the below can be used to test if the e-switch mode
can be set to "switchdev" without Virtual Functions:

network:
  version: 2
  ethernets:
    enp3s0f0np0:
      match:
        macaddress: 98:03:9b:c3:ef:ba
      mtu: 9000
      set-name: enp3s0f0np0
      embedded-switch-mode: switchdev
    enp3s0f1np1:
      match:
        macaddress: 98:03:9b:c3:ef:bb
      mtu: 9000
      set-name: enp3s0f1np1
      embedded-switch-mode: switchdev

After applying the configuration, the e-switch mode can be checked with
the devlink tool. For example:

root@node-laveran:~# devlink dev eswitch show pci/0000:03:00.0
pci/0000:03:00.0: mode switchdev inline-mode none encap-mode basic
root@node-laveran:~# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode switchdev inline-mode none encap-mode basic
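
For scripted verification, the devlink output above could be checked with a small parser. This is a sketch that assumes the output always has the single-line "device: key value key value ..." shape shown in this report; the function name is hypothetical and other devlink versions may print additional fields.

```python
def parse_eswitch_show(line: str) -> dict:
    """Parse one line of `devlink dev eswitch show` output into a dict."""
    # The device handle itself contains colons, so split on the first ": ".
    device, _, rest = line.strip().partition(": ")
    tokens = rest.split()
    # The remaining tokens are key/value pairs: mode switchdev inline-mode none ...
    attrs = dict(zip(tokens[0::2], tokens[1::2]))
    attrs["device"] = device
    return attrs


sample = "pci/0000:03:00.0: mode switchdev inline-mode none encap-mode basic"
info = parse_eswitch_show(sample)
print(info["device"], info["mode"])
```

A test script could then assert that `info["mode"]` equals "switchdev" for every physical function listed in the Netplan configuration.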

[ Where problems could occur ]

These changes should affect only SR-IOV related scenarios.
Undetected problems could cause Netplan to fail to configure the device
and Virtual Functions wouldn't be created anymore.

[ Other Info ]

Related work:

https://bugs.launchpad.net/netplan/+bug/2020409
https://github.com/canonical/netplan/pull/454

A PPA for Ubuntu 22.04 can be found here: https://launchpad.net/~danilogondolfo/+archive/ubuntu/netplan-sru

---- Original bug description ----

I am looking at the netplan implementation of switchdev [1]. The current code assumes that we can move to switchdev only if SR-IOV is enabled.
This assumption is incorrect, as we can move to switchdev even if SR-IOV is disabled.

Two use cases come to mind:

1. VF LAG with Subfunctions (you don't need SR-IOV to enable Subfunctions)
2. VF LAG creation. It is better to first move the PF (physical function) to switchdev mode before creating the SR-IOV VFs. In this case you don't need to unbind and rebind the VFs, which saves time at boot.

Who would be the best person on the Canonical side to help us fix this issue?

[1] - https://github.com/canonical/netplan/blob/3279c57e8b1745be0d19119b4ad1a061c327593e/netplan/cli/sriov.py#L373-L459

Revision history for this message
Moshe Levi (moshele) wrote :
Revision history for this message
Lukas Märdian (slyon) wrote :
Changed in netplan:
importance: Undecided → Medium
status: New → Triaged
tags: added: foundations-todo
Revision history for this message
Lukas Märdian (slyon) wrote :
tags: removed: foundations-todo
Changed in netplan:
status: Triaged → Fix Committed
Lukas Märdian (slyon)
Changed in netplan:
status: Fix Committed → Fix Released
tags: added: sru-next
Changed in netplan:
status: Fix Released → Fix Committed
Revision history for this message
Lukas Märdian (slyon) wrote :

released as 1.0.1

Changed in netplan:
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 1.0.1-1ubuntu1

---------------
netplan.io (1.0.1-1ubuntu1) oracular; urgency=medium

  * Merge from Debian unstable. Remaining changes:
    - d/p/0003-Revert-wait-online-disabled-wait-online-for-stable-1.patch:
      Fix wait-online via s-n-wait-online.service.d/10-netplan.
    - d/libnetplan1.symbols: Update for new (private) symbol

netplan.io (1.0.1-1) unstable; urgency=medium

  * New upstream release: 1.0.1:
    - sriov: accept setting the eswitch mode without VFs (LP: #2020409)
    - cli/sriov: refactoring
    - tests: use proper 0o600 file permissions in more places
    - doc: Adding missing 'watchfiles' dependency for Sphinx
    - doc: Minor fixes in lang. and mark-up in YAML reference
    - doc: Tutorial reorg & lang. + formatting improvements
    - networkd: add wait-online enumeration utils
    - generate: enable systemd-networkd-wait-online for non-optional interfaces
    - CLI:utils: Do not ask for daemon-reload password interactively
    - CLI:generate: call daemon-reload after (re-)generating services
    - wait-online: Do not block on loopback interface
    - generate: Do not touch wait-online, if we don't have any networkd NetDefs
    - wait-online: wait for existing interfaces only and downgrade operational
      state for interfaces without IP configuration
    - wait-online: account for DHCPv4/v6 addresses
    - wait-online: do not require virtual devices to be created already
    - wait-online: recognize that bridge/bond members will never gain
      link-local addresses
    - networkd:apply: Drop handling of legacy wpa@ instance units
    - wait-online: disabled wait-online for stable 1.0
    - test:integration: Try to improve test flakyness
    - autopkgtest: More fixes for flaky 'ethernets' test
    - Increase some test timeouts to account for slow (riscv64) buildds
    SECURITY UPDATE:
    - libnetplan: use more restrictive file permissions
      (Closes: #1072789, LP: #2065738, LP: #1987842)
    - CVE-2022-4968
    - libnetplan: escape control characters
    - backends: escape file paths
    - backends: escape semicolons in service units (LP: #2066258)
    Bug fixes:
    - cli: Fix logging setup when python-rich is not present
    - CI: fix DebCI case for no-change rebuilds
    - CI: adopt autopkgtest for 1.0-1 on 22.04
    - doc: Update README, move CODE_OF_CONDUCT
    - doc: fix en_GB spelling
    - CI: adopt snapd.patch for autopkgtest SRU (LP: #2051939)
    - parse-nm: add a workaround for the DoT DNS option (LP: #2055148)
    - CI: Install netplan-ci PPA
    - parse: don't remove datalist items during iteration
    - ATTN: parse/bonds: handle same primary in multiple bonds
    - parse/bonds: don't fail on primary reassignment
    - cli/sriov: set eswitch regardless of pcidev.vfs
    - doc: Fix wrong bonds.parameters.mode syntax in example
    - parse: fix redefinition of gateway(4|6)
    - doc:tutorial: fix whitespace formatting
    - util: fix potential NULL pointer assert
    - python: elements of __all__ must be strings
    - tests: fix diff test with iproute2 6.8
    - cli/generate: skip daemon_reload with --mapping
    - test: cleanup after wait_online tes...


Changed in netplan.io (Ubuntu):
status: New → Fix Released
Revision history for this message
Felipe Alencastro (falencastro) wrote :

Hi, will there be a backport for Jammy?

description: updated
summary: - can't move mellanox interface to switchdev when SR-IOV disable
+ [SRU] can't move mellanox interface to switchdev when SR-IOV disable
Changed in netplan.io (Ubuntu Noble):
status: New → Fix Released
Changed in netplan.io (Ubuntu Jammy):
status: New → In Progress
Revision history for this message
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Moshe, or anyone else affected,

Accepted netplan.io into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/netplan.io/0.107.1-3ubuntu0.22.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in netplan.io (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Lukas Märdian (slyon) wrote :

Dear openstack team (or anyone with the relevant hardware), can you please help to test this, using the following commands?

cat <<EOF >/etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
# Enable Ubuntu proposed archive
deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed restricted main multiverse universe
EOF

cat <<EOF >/etc/apt/preferences.d/proposed-updates
# Configure apt to allow selective installs of packages from proposed
Package: *
Pin: release a=$(lsb_release -cs)-proposed
Pin-Priority: 400
EOF

apt update
apt install -t jammy-proposed netplan.io

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (netplan.io/0.107.1-3ubuntu0.22.04.2)

All autopkgtests for the newly accepted netplan.io (0.107.1-3ubuntu0.22.04.2) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

initramfs-tools/0.140ubuntu13.5 (arm64, armhf, ppc64el)
initramfs-tools/unknown (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#netplan.io

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Lukas Märdian (slyon) wrote (last edit ):

I tested netplan.io 0.107.1-3ubuntu0.22.04.2 from jammy-proposed, all looking good!
The intermittent failures reported in comment #9 are resolved.
The original version upgrade SRU was re-verified in bug 2058031.

## Test1: PFs without VFs (the use case for scalable functions)

First of all, the eswitch/switchdev functionality is not available on Jammy's GA 5.15 kernel:
ubuntu@toadsworth:~$ sudo devlink dev eswitch show pci/0000:86:00.0
kernel answers: Operation not supported

So I upgraded to the HWE kernel and installed Netplan from proposed:
ubuntu@toadsworth:~$ sudo apt-get install --install-recommends linux-generic-hwe-22.04
ubuntu@toadsworth:~$ reboot
ubuntu@toadsworth:~$ uname -a
Linux toadsworth 6.8.0-51-generic #52~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Dec 9 15:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@toadsworth:~$ sudo apt-get install -t jammy-proposed netplan.io
ubuntu@toadsworth:~$ apt list *netplan*
Listing... Done
libnetplan-dev/jammy-proposed 0.107.1-3ubuntu0.22.04.2 amd64
libnetplan0/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]
netplan-generator/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]
netplan.io/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]
python3-netplan/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]

Next, I identified the Mellanox ConnectX-4 NIC and checked its original eswitch mode (legacy):
ubuntu@toadsworth:~$ sudo lshw -c network -businfo
Bus info          Device     Class    Description
============================================================
pci@0000:06:00.0  ens1f0np0  network  XtremeScale SFC9250 10/25/40/50/100G Ethernet Controller
pci@0000:06:00.1  ens1f1np1  network  XtremeScale SFC9250 10/25/40/50/100G Ethernet Controller
pci@0000:09:00.0  eno1       network  Ethernet Connection X722 for 1GbE
pci@0000:09:00.1  eno2       network  Ethernet Connection X722 for 1GbE
pci@0000:86:00.0  ens3np0    network  MT27710 Family [ConnectX-4 Lx]
ubuntu@toadsworth:~$ sudo devlink dev eswitch show pci/0000:86:00.0
pci/0000:86:00.0: mode legacy inline-mode none encap-mode basic

I changed the Netplan configuration according to the test plan above, applied the config and confirmed the eswitch mode was switched to "switchdev":
ubuntu@toadsworth:~$ sudo netplan set network.ethernets.ens3np0.embedded-switch-mode=switchdev
ubuntu@toadsworth:~$ sudo netplan get network.ethernets.ens3np0
match:
  macaddress: "0c:42:a1:e2:8f:58"
optional: true
set-name: "ens3np0"
mtu: 1500
embedded-switch-mode: "switchdev"

ubuntu@toadsworth:~$ sudo netplan apply
ubuntu@toadsworth:~$ sudo devlink dev eswitch show pci/0000:86:00.0
pci/0000:86:00.0: mode switchdev inline-mode link encap-mode basic

## Test2: change the eswitch mode but with VFs
- Same setup on HWE kernel and Netplan 0.107, starting off on "legacy" eswitch mode and 0 VFs

ubuntu@toadsworth:~$ uname -a
Linux toadsworth 6.8.0-51-generic #52~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Dec 9 15:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@toadsworth:~$ apt list *netplan*
Listing... Done
libnetplan-dev/...


tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Lukas Märdian (slyon)
tags: removed: sru-next
Revision history for this message
Marcus Boden (marcusboden) wrote :

We've installed the proposed version 0.107.1-3ubuntu0.22.04.2 of netplan following comment #8, to work around our Mellanox devices sometimes not being brought up in switchdev mode after a reboot:
root@ps6-rb1-n5:~# devlink dev eswitch show pci/0000:41:00.0
pci/0000:41:00.0: mode legacy inline-mode none encap-mode basic
root@ps6-rb1-n5:~# devlink dev eswitch show pci/0000:41:00.1
pci/0000:41:00.1: mode legacy inline-mode none encap-mode basic

And setting it manually doesn't work:
root@ps6-rb1-n5:~# devlink dev eswitch set pci/0000:41:00.0 mode switchdev
Error: mlx5_core: Can't change mode, E-Switch is busy.
kernel answers: Device or resource busy

With 0.106.1-7ubuntu0.22.04.4, we sometimes had this issue after a reboot. We would just reboot again and it would work.

With 0.107.1-3ubuntu0.22.04.2, it happens every time (well, 3 times out of 3 tries). This resulted in us having to downgrade to 0.106 again to get the server back up.

Revision history for this message
Lukas Märdian (slyon) wrote :

I'm setting this to "block-proposed-jammy", while investigating comment #11.

Marcus, can you provide some more context on your test case? What does the relevant Netplan config look like? What kernel are you running this on?

If you cannot change the eswitch mode manually (using "devlink"), this seems to be a problem on a lower layer (kernel/driver?) and might be unrelated to Netplan.

tags: added: block-proposed-jammy
Revision history for this message
Marcus Boden (marcusboden) wrote (last edit ):

Hi, this happened on 6.8.0-51 HWE kernel. We're using Mellanox MT2892 cards. The relevant config:

network:
  version: 2
  ethernets:
    ens3f0:
      virtual-function-count: 32
      embedded-switch-mode: switchdev
      delay-virtual-functions-rebind: true

    ens3f1:
      virtual-function-count: 32
      embedded-switch-mode: switchdev
      delay-virtual-functions-rebind: true

As I mentioned, with 0.106.1-7ubuntu0.22.04.4 we *sometimes* had the issue of the cards not coming up in switchdev mode. With 0.107.1-3ubuntu0.22.04.2, this happened on all three reboots that we tried it with (and we didn't try more as this caused a significant problem already).

This makes me think it's an issue with the new netplan version, not the kernel or driver.

Lukas Märdian (slyon)
tags: added: server-todo
Lukas Märdian (slyon)
Changed in netplan.io (Ubuntu Jammy):
assignee: nobody → Lukas Märdian (slyon)
Revision history for this message
Lukas Märdian (slyon) wrote (last edit ):

I was not yet able to reproduce the issue (but I found a server with a dual NIC Mellanox ConnectX-6 MT2892). Can you share some more information, please?
* sudo lshw -c network -businfo
* sudo dmesg | grep E-Switch
* Full Netplan configuration from /etc/netplan/*.yaml
  * Are you trying to set up a bond/link-aggregation (LAG) on top of those Mellanox NICs? (similar to bug #1988018)
* cat /sys/class/net/ens3f0/device/sriov_numvfs
  * and: cat /sys/class/net/ens3f1/device/sriov_numvfs

The following would be most interesting on a system with Netplan 0.107.1 (i.e. in the failed state)
* systemctl status netplan-sriov-rebind.service
  * and: journalctl -u netplan-sriov-rebind.service
* systemctl status netplan-sriov-apply.service
  * and: journalctl -u netplan-sriov-apply.service
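
The two sriov_numvfs checks requested above can also be scripted. The following is a minimal sketch: the helper name and the `sysfs_root` parameter are hypothetical (the parameter exists only so the function can be tried against a fake directory tree), while the sysfs path layout is the one used in the commands above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_numvfs(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[int]:
    """Return the current VF count for iface, or None if it cannot be read."""
    path = Path(sysfs_root) / iface / "device" / "sriov_numvfs"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Interface missing, not SR-IOV capable, or unexpected file content.
        return None


# Demonstration against a fake sysfs tree, so no special hardware is needed.
with tempfile.TemporaryDirectory() as fake_root:
    dev_dir = Path(fake_root) / "ens3f0" / "device"
    dev_dir.mkdir(parents=True)
    (dev_dir / "sriov_numvfs").write_text("32\n")
    for name in ("ens3f0", "ens3f1"):
        print(name, read_numvfs(name, fake_root))
```

On a real system the function would be called with the default `sysfs_root`, for example `read_numvfs("ens3f0")`.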

Revision history for this message
Nicolas Bock (nicolasbock) wrote :

dmesg | grep E-Switch

[ 30.127780] kernel: mlx5_core 0000:41:00.0: E-Switch: Total vports 34, per vport: max uc(128) max mc(2048)
[ 30.836883] kernel: mlx5_core 0000:41:00.1: E-Switch: Total vports 34, per vport: max uc(128) max mc(2048)
[ 37.186020] kernel: mlx5_core 0000:41:00.1: E-Switch: Enable: mode(LEGACY), nvfs(32), active vports(33)
[ 50.640609] kernel: mlx5_core 0000:41:00.0: E-Switch: Enable: mode(LEGACY), nvfs(32), active vports(33)
[ 108.052977] kernel: mlx5_core 0000:41:00.1: E-Switch: Disable: mode(LEGACY), nvfs(32), active vports(33)
[ 109.688247] kernel: mlx5_core 0000:41:00.1: E-Switch: Supported tc chains and prios offload
[ 112.747359] kernel: mlx5_core 0000:41:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(32), active vports(33)
[ 136.733037] kernel: mlx5_core 0000:41:00.0: E-Switch: Disable: mode(LEGACY), nvfs(32), active vports(33)
[ 138.403235] kernel: mlx5_core 0000:41:00.0: E-Switch: Supported tc chains and prios offload
[ 141.295846] kernel: mlx5_core 0000:41:00.0: E-Switch: Enable: mode(OFFLOADS), nvfs(32), active vports(33)
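
To make the sequence in this log easier to follow, the E-Switch enable/disable events can be extracted per PCI device. This is a sketch that assumes the mlx5_core message format exactly as it appears in the lines above; the function names are hypothetical. In these messages the driver appears to report switchdev mode as OFFLOADS.

```python
import re

# Matches e.g. "mlx5_core 0000:41:00.1: E-Switch: Enable: mode(LEGACY), nvfs(32), ..."
ESWITCH_RE = re.compile(
    r"mlx5_core (?P<pci>[0-9a-f:.]+): E-Switch: (?P<action>Enable|Disable): "
    r"mode\((?P<mode>\w+)\), nvfs\((?P<nvfs>\d+)\)"
)


def eswitch_events(lines):
    """Return (pci, action, mode, nvfs) tuples in log order."""
    events = []
    for line in lines:
        m = ESWITCH_RE.search(line)
        if m:
            events.append((m["pci"], m["action"], m["mode"], int(m["nvfs"])))
    return events


def final_modes(events):
    """Map each PCI device to the mode of its last 'Enable' event."""
    modes = {}
    for pci, action, mode, _nvfs in events:
        if action == "Enable":
            modes[pci] = mode
    return modes


log = [
    "[ 37.186020] kernel: mlx5_core 0000:41:00.1: E-Switch: Enable: mode(LEGACY), nvfs(32), active vports(33)",
    "[ 108.052977] kernel: mlx5_core 0000:41:00.1: E-Switch: Disable: mode(LEGACY), nvfs(32), active vports(33)",
    "[ 109.688247] kernel: mlx5_core 0000:41:00.1: E-Switch: Supported tc chains and prios offload",
    "[ 112.747359] kernel: mlx5_core 0000:41:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(32), active vports(33)",
]
print(final_modes(eswitch_events(log)))
```

Applied to the full log above, both 0000:41:00.0 and 0000:41:00.1 are first enabled in LEGACY mode with 32 VFs and are later re-enabled in OFFLOADS mode.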

Revision history for this message
Nicolas Bock (nicolasbock) wrote :

lshw

           *-network:0
                description: Ethernet interface
                product: MT2892 Family [ConnectX-6 Dx]
                vendor: Mellanox Technologies
                physical id: 0
                bus info: pci@0000:41:00.0
                logical name: ens3f0
                version: 00
                serial: 10:70:fd:0f:84:f6
                capacity: 40Gbit/s
                width: 64 bits
                clock: 33MHz
                capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical 1000bt-fd 10000bt-fd 25000bt-fd 40000bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=6.2.0-39-generic duplex=full firmware=22.39.1002 (MT_0000000359) latency=0 link=yes multicast=yes slave=yes
                resources: irq:1504 memory:d4000000-d5ffffff memory:d1000000-d10fffff memory:d8000000-d9ffffff

Revision history for this message
Nicolas Bock (nicolasbock) wrote :
Revision history for this message
Nicolas Bock (nicolasbock) wrote :
Revision history for this message
Marcus Boden (marcusboden) wrote :

Hi, here's the rest of the requested output:
https://pastebin.canonical.com/p/ZBVkc8rstg/

Revision history for this message
Lukas Märdian (slyon) wrote :

Thank you for providing the additional data!

Looking at your Netplan config, this looks very much like bug #1988018, as you're using "bond0" to build up a hardware-accelerated LAG. So I've tried to reproduce that other SRU bug, using a Mellanox ConnectX-6 equipped server and I seem to be able to reproduce the failure there!

See bug #1988018 (comment #14++), can we please continue this discussion in that other bug report, because I think that's more relevant than this bug about "switchdev" mode without VFs. – You clearly create virtual-functions and set up a hardware accelerated bond0.

Interestingly, the logs you provided seem to indicate success... are those really from a failed system state?

"""
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:ens3f1 - waiting for the LAG state to be 'active'
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:ens3f1 - VF LAG state is 'active'
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:0000:41:00.1: bound 0 VFs
[...]
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:ens3f0 - waiting for the LAG state to be 'active'
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:ens3f0 - VF LAG state is 'active'
Feb 21 11:54:40 ps6-ra3-n3 netplan[7601]: DEBUG:0000:41:00.0: bound 0 VFs
"""

Well. "bound 0 VFs" doesn't sound quite right, but "VF LAG state is 'active'" sounds better than my reproducer (see bug #1988018)

tags: removed: block-proposed-jammy