Comment 8 for bug 1925452

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2021-04-26 08:55 EDT-------
SRU Justification:
==================

[Impact]

* In addition to 9c9be85f6b59 "net/mlx5e: Add missing capability check for uplink follow" (handled in LP#1921104), another fix, 1a73704c82ed "Fix HW spec violation configuring uplink" (handled in this bug, LP#1925452), is needed to fix issues that were introduced with commit 7d0314b11cdd "net/mlx5e: Modify uplink state on interface up/down".

* Commit 1a73704c82ed "Fix HW spec violation configuring uplink" fixes a regression for mlx5 adapters required to operate in switchdev mode.

* This fix makes sure that the uplink port is modified to follow only if the uplink_follow capability is set, as required by the HW specification.

* Failure to do so causes traffic to the uplink representor net device to cease after switching to switchdev mode.

[Fix]

* upstream fix (upstream with v5.12-rc7)
1a73704c82ed4ee95532ac04645d02075bd1ce3d 1a73704c82ed "Fix HW spec violation configuring uplink"

* can be cleanly cherry picked from hirsute master-next and groovy master-next.

* a backport for groovy:
https://launchpadlibrarian.net/534888680/groovy-0001-net-mlx5-Fix-HW-spec-violation-configuring-uplink.patch

* a backport for focal:
https://launchpadlibrarian.net/534847308/focal-0001-net-mlx5-Fix-HW-spec-violation-configuring-uplink.patch

[Test Case]

* Two servers installed with Ubuntu Server 20.04 or 20.10 are needed.
* Each server needs to have a Mellanox ConnectX4/5 adapter, attached to the same switch
* Adapters must be running adapter firmware level 16.29.1006 or earlier.
* enable SRIOV and switchdev mode on one adapter:
echo 0 > /sys/bus/pci/devices/0100\:00\:00.0/sriov_drivers_autoprobe
echo 0 > /sys/bus/pci/devices/0100\:00\:00.1/sriov_drivers_autoprobe
echo 64 > /sys/bus/pci/devices/0100\:00\:00.0/sriov_numvfs
echo 64 > /sys/bus/pci/devices/0100\:00\:00.1/sriov_numvfs
devlink dev eswitch set pci/0100:00:00.0 mode switchdev
devlink dev eswitch set pci/0100:00:00.1 mode switchdev
* Assign an IP address to the physical function device of the adapters on both systems
* Without the fix, IP communication will fail.
* With the fix, IP communication can be established.

[Regression Potential]

* There is always at least some potential for regression. In this case the new code can go wrong (or might become worse than before) in case the new if statement is wrong.

* It checks the condition "MLX5_CAP_GEN(mdev, uplink_follow)"; if MLX5_CAP_GEN is evaluated erroneously or mdev is other than expected, the mlx5_modify_vport_admin_state call might go wrong, too.

* But since only the if clause was added, the changes are minimal and therefore easy to trace.
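
For reviewers, the shape of the guarded call can be modelled with a small user-space sketch. This is not the kernel code and not the upstream patch: struct mock_dev, mock_modify_uplink_admin_state() and mock_uplink_enable() are hypothetical stand-ins invented for illustration. The capability flag models the result of MLX5_CAP_GEN(mdev, uplink_follow), and the counter models calls to mlx5_modify_vport_admin_state.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device; NOT the kernel's mlx5 device struct. */
struct mock_dev {
    bool cap_uplink_follow; /* models MLX5_CAP_GEN(mdev, uplink_follow) */
    int  modify_calls;      /* counts attempts to modify the uplink state */
};

/* Hypothetical stand-in for mlx5_modify_vport_admin_state() on the uplink. */
static int mock_modify_uplink_admin_state(struct mock_dev *dev)
{
    dev->modify_calls++;
    return 0;
}

/*
 * Guarded variant: the uplink is only modified when the capability is
 * reported. Without the capability the function is a no-op.
 */
static int mock_uplink_enable(struct mock_dev *dev)
{
    if (!dev->cap_uplink_follow)
        return 0;
    return mock_modify_uplink_admin_state(dev);
}
```

Calling mock_uplink_enable() on a device with cap_uplink_follow set to false leaves modify_calls at 0, which is the behaviour the added if clause is meant to guarantee. With the flag set to true, the call goes through exactly once.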