[Ubuntu 20.04] Striding RQ as Default for ConnectX-4
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Release Notes for Ubuntu | Fix Released | Undecided | Frank Heimes |
Ubuntu on IBM z Systems | Fix Released | Medium | Unassigned |
linux (Ubuntu) | Won't Fix | Medium | Skipper Bug Screeners |
Bug Description
Hello,
Within our Network Performance runs in the RoCE Express 2(.1) area, we noticed a performance regression with streaming workloads which could be mitigated by using an ethtool setting.
The commit that switched the default from "Striding RQ" to "Legacy RQ" for ConnectX-4 devices (RoCE Express 2(.1)) is shown here:
commit 5ffd81943d7a574
Author: Tariq Toukan <email address hidden>
Date: Tue Feb 20 15:17:54 2018 +0200
net/mlx5e: RX, Always prefer Linear SKB configuration
Prefer the linear SKB configuration of Legacy RQ over the
non-linear one of Striding RQ.
This implies that ConnectX-4 LX now uses legacy RQ by default,
as it does not support the linear configuration of Striding RQ.
Signed-off-by: Tariq Toukan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
diff --git a/drivers/
index 2c634e50d051.
--- a/drivers/
+++ b/drivers/
@@ -4405,9 +4405,16 @@ void mlx5e_build_
/* RQ */
- if (mlx5e_
- MLX5E_SET_
- !slow_pci_
+ /* Prefer Striding RQ, unless any of the following holds:
+ * - Striding RQ configuration is not possible/supported.
+ * - Slow PCI heuristic.
+ * - Legacy RQ would use linear SKB while Striding RQ would use non-linear.
+ */
+ if (!slow_
+ mlx5e_striding_
+ (mlx5e_
+ !mlx5e_
+ MLX5E_SET_
We modified the upstream kernel so that we could run measurements and compare Legacy RQ against Striding RQ. An example follows:
Kernel used: 5.4.0-rc7
The measurements were run on a dedicated machine (z14) using uperf with streaming profiles (MTU size 1500).
Example throughput drop:
(traffic via a shared card, i.e. client and server using VFs from the same ConnectX-4)
| | Legacy RQ | Striding RQ |
|---|---|---|
| str-writex30k (1 connection) | 24.62Gb/s | 33.47Gb/s |
Additionally, two tests with a transactional workload, using the proposed ethtool switch:
| | Legacy RQ | Striding RQ |
|---|---|---|
| rr1c-200x30k---1 | 4.12Gb/s | 5.66Gb/s |
| rr1c-200x30k--10 | 15.10Gb/s | 20.77Gb/s |
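To put the measurements above in relative terms, the gain of Striding RQ over Legacy RQ can be computed per test. This is only a small sketch; the helper name `rq_gain_percent` is made up for illustration and is not part of uperf or ethtool.

```shell
# Relative throughput gain of Striding RQ over Legacy RQ, in percent.
# Arguments: legacy throughput, striding throughput (same unit, e.g. Gb/s).
# rq_gain_percent is a hypothetical helper name used only for this sketch.
rq_gain_percent() {
    awk -v legacy="$1" -v striding="$2" \
        'BEGIN { printf "%.1f\n", (striding - legacy) / legacy * 100 }'
}

rq_gain_percent 24.62 33.47   # str-writex30k (1 connection)
rq_gain_percent 4.12 5.66     # rr1c-200x30k---1
rq_gain_percent 15.10 20.77   # rr1c-200x30k--10
```

For the three rows reported here, this works out to a gain of roughly 36 to 38 percent with Striding RQ.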
As concluded in the communication with Mellanox, a simple ethtool command can switch between the queuing methods, which lets us avoid kernel code changes:
ethtool --set-priv-flags DEVNAME rx_striding_rq on
(To list the available settings you may use: ethtool --show-priv-flags DEVNAME)
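The two commands above could be combined into a check-then-set helper, sketched below. The function names `priv_flag_state` and `ensure_striding_rq` are hypothetical, and the parser assumes the usual `name : state` line layout printed by `ethtool --show-priv-flags`; this has not been verified against every ethtool version.

```shell
# Print the state ("on" or "off") of one private flag, reading the output of
# `ethtool --show-priv-flags DEVNAME` from stdin.
# Assumption: each flag is printed as "name : state" on its own line.
priv_flag_state() {
    awk -F':' -v flag="$1" '
        { name = $1; gsub(/[ \t]/, "", name) }
        name == flag { state = $2; gsub(/[ \t]/, "", state); print state; exit }
    '
}

# Enable Striding RQ on a device unless it is already on (needs root).
# Only the two ethtool invocations quoted in this report are used.
ensure_striding_rq() {
    dev="$1"
    state=$(ethtool --show-priv-flags "$dev" | priv_flag_state rx_striding_rq)
    if [ "$state" != "on" ]; then
        ethtool --set-priv-flags "$dev" rx_striding_rq on
    fi
}
```

It would be called as `ensure_striding_rq DEVNAME`, with DEVNAME being the interface of the ConnectX-4 function under test.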
tags: | added: architecture-s39064 bugnameltc-184497 severity-medium targetmilestone-inin2004 |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → linux (Ubuntu) |
summary: |
- [Ubuntu 20.04] Striding RQ als Default für ConnectX-4 in Distros
+ [Ubuntu 20.04] Striding RQ as Default for ConnectX-4 |
Changed in ubuntu-z-systems: | |
status: | Incomplete → Triaged |
Changed in ubuntu-release-notes: | |
assignee: | nobody → Frank Heimes (fheimes) |
status: | New → Confirmed |
Changed in ubuntu-z-systems: | |
status: | Triaged → Confirmed |
Changed in linux (Ubuntu): | |
status: | Incomplete → Won't Fix |
Not sure if I understand this ticket correctly - which 5.4-rc7 was used?
Was it taken directly from upstream?
And why is an RC kernel still in use? 5.4 was released quite some time ago.
Well, if only commit "net/mlx5e: RX, Always prefer Linear SKB configuration" is needed, then we are done, since it has been in the Ubuntu focal kernel for quite some time:
focal-clean$ git log --oneline --grep "net/mlx5e: RX, Always prefer Linear SKB configuration"
5ffd81943d7a net/mlx5e: RX, Always prefer Linear SKB configuration
focal-clean$ git tag --contains 5ffd81943d7a
Ubuntu-5.4-5.4.0-10.13
Ubuntu-5.4-5.4.0-11.14
Ubuntu-5.4-5.4.0-12.15
Ubuntu-5.4-5.4.0-13.16
Ubuntu-5.4-5.4.0-14.17
Ubuntu-5.4.0-15.18
Ubuntu-5.4.0-16.19
Ubuntu-5.4.0-17.20
Ubuntu-5.4.0-17.21
Ubuntu-5.4.0-18.22
Ubuntu-5.4.0-8.11
Ubuntu-5.4.0-9.12
Looks to me that you can simply move to the current Ubuntu focal kernel (ideally the one from proposed) and proceed with testing from there ...