[bionic] ConnectX5 Large message size throughput degradation in TCP
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Incomplete | Undecided | Unassigned | | |
Bug Description
We see a degradation of ~20% on ConnectX-5/4 in the following case:
TCP, 1 QP, 1 stream, unidirectional, single port.
Message sizes 1M and up show this degradation.
After changing the default TX moderation mode to off, we see up to 40% packet rate and up to 23% bandwidth degradations.
There is an upstream commit that fixes this issue. I will backport it and send it to the <email address hidden>
commit 48bfc39791b8b4a
Author: Tal Gilboa <email address hidden>
Date: Fri Mar 30 15:50:08 2018 -0700
net/mlx5e: Set EQE based as default TX interrupt moderation mode
The default TX moderation mode was mistakenly set to CQE based. The
intention was to add a control ability in order to improve some specific
use-cases. In general, we prefer to use EQE based moderation as it gives
much better numbers for the common cases.
CQE based causes a degradation in the common case since it resets the
moderation timer on CQE generation. This causes an issue when TSO is
well utilized (large TSO sessions). The timer is set to 16us so traffic
of ~64KB TSO sessions per second would mean timer reset (CQE per TSO
session -> long time between CQEs). In this case we quickly reach the
tcp_
By setting EQE based moderation we make sure timer would expire after
16us regardless of the packet rate.
This fixes up to 40% packet rate and up to 23% bandwidth degradations.
Fixes: 0088cbbc4b66 ("net/mlx5e: Enable CQE based moderation on TX CQ")
Signed-off-by: Tal Gilboa <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
diff --git a/drivers/
index c71f4f10283b.
--- a/drivers/
+++ b/drivers/
@@ -4137,7 +4137,7 @@ void mlx5e_build_
{
- u8 cq_period_mode = 0;
+ u8 rx_cq_period_mode;
@@ -4173,12 +4173,12 @@ void mlx5e_build_
/* CQ moderation params */
- cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_
+ rx_cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_
- mlx5e_set_
- mlx5e_set_
+ mlx5e_set_
+ mlx5e_set_
/* TX inline */
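To illustrate the timer behaviour described in the commit message above, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the function name `event_time_us`, the sample completion times, and the rule that the timer starts at the first pending completion are all hypothetical, and the model ignores the completion-count threshold and everything else the hardware does. It shows only the one property the commit message relies on: a timer that is reset on every CQE can be pushed out well past 16us, while a timer that is not reset expires after 16us regardless of the completion rate.

```python
def event_time_us(cqe_times_us, period_us, reset_on_cqe):
    """Toy model: return when the moderation timer expires.

    cqe_times_us: sorted completion (CQE) times in microseconds.
    reset_on_cqe: True models CQE based mode (timer restarts on each CQE),
                  False models EQE based mode (timer is not restarted).
    Assumption: the timer starts at the first completion in the list.
    """
    deadline = cqe_times_us[0] + period_us
    for t in cqe_times_us[1:]:
        if t >= deadline:
            break  # the timer already expired before this completion
        if reset_on_cqe:
            deadline = t + period_us  # CQE based: push the deadline out
    return deadline


# Hypothetical traffic: one completion every 10us, 16us moderation period.
cqes = [0, 10, 20, 30, 40]
print("EQE based:", event_time_us(cqes, 16, reset_on_cqe=False))
print("CQE based:", event_time_us(cqes, 16, reset_on_cqe=True))
```

In this model the non-resetting timer fires 16us after the first completion, whereas the resetting timer fires only 16us after the last one, because each completion arrives before the previous deadline.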
I tested this patch on bionic and it works properly.
Before applying the patch:
# ethtool --show-priv-flags enp6s0f0
Private flags for enp6s0f0:
rx_cqe_moder : on
tx_cqe_moder : on
rx_cqe_compress: off
After applying the patch:
# ethtool --show-priv-flags enp6s0f0
Private flags for enp6s0f0:
rx_cqe_moder : on
tx_cqe_moder : off
rx_cqe_compress: off
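The before/after check above can also be automated. The following is a minimal sketch, not part of the patch: it parses the text printed by `ethtool --show-priv-flags` and reports whether `tx_cqe_moder` is off. The helper names `parse_priv_flags` and `tx_uses_eqe_moderation` are hypothetical, and the parser assumes the `name : on|off` layout shown in the output above.

```python
from typing import Dict


def parse_priv_flags(output: str) -> Dict[str, bool]:
    """Parse `ethtool --show-priv-flags` text into {flag_name: is_on}."""
    flags = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "Private flags for <iface>:" header and any other
        # line that does not end in an on/off value.
        if sep and value in ("on", "off"):
            flags[name.strip()] = (value == "on")
    return flags


def tx_uses_eqe_moderation(flags: Dict[str, bool]) -> bool:
    """True when tx_cqe_moder is present and off (the patched default)."""
    return flags.get("tx_cqe_moder") is False


# Output copied from the "after applying the patch" run above.
AFTER_PATCH = """Private flags for enp6s0f0:
rx_cqe_moder : on
tx_cqe_moder : off
rx_cqe_compress: off
"""

flags = parse_priv_flags(AFTER_PATCH)
print(flags)
print("TX uses EQE based moderation:", tx_uses_eqe_moderation(flags))
```

On a live system the same text could be obtained by running `ethtool --show-priv-flags enp6s0f0` and passing its standard output to `parse_priv_flags`.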