Low RX performance for 40G Solarflare NICs

Bug #1964512 reported by Heitor Alves de Siqueira
This bug affects 1 person
Affects          Status          Importance   Assigned to                Milestone
linux (Ubuntu)   Fix Committed   High         Heitor Alves de Siqueira
  Focal          Fix Released    High         Heitor Alves de Siqueira
  Impish         Fix Released    High         Heitor Alves de Siqueira
  Jammy          Fix Committed   High         Heitor Alves de Siqueira

Bug Description

[Impact]
* Some 40G Solarflare NICs show low RX performance in some cases because
  the RX recycle ring is too small
* The RX recycle ring size is fixed at either 4096 for IOMMU or 16 for NOIOMMU
* These small fixed sizes can cause a high number of calls to alloc_pages,
  tanking performance at higher link speeds
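
The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the sfc driver's logic: the function name, the batch size of 64 pages per poll, and the two-round return delay are all hypothetical. Only the ring sizes 16 and 4096 come from this report. Pages come back from the stack in batches, the ring keeps as many as it has room for, and every refill that the ring cannot satisfy counts as a fresh page allocation.

```python
def count_page_allocations(ring_size, batch, rounds, delay=2):
    """Toy model: fresh page allocations needed for a given recycle ring size.

    Every round the driver refills `batch` RX pages. Pages handed to the
    stack come back `delay` rounds later, all at once. Returned pages are
    parked in the recycle ring if it has room; the overflow is freed.
    All parameters are illustrative assumptions.
    """
    ring = 0          # pages currently parked in the recycle ring
    allocations = 0   # pages that had to be freshly allocated

    for r in range(rounds):
        # A batch of pages used `delay` rounds ago is returned now.
        if r >= delay:
            kept = min(batch, ring_size - ring)
            ring += kept          # anything beyond `kept` is freed

        # Refill: reuse recycled pages first, allocate the remainder.
        reused = min(ring, batch)
        ring -= reused
        allocations += batch - reused

    return allocations


for size in (16, 4096):
    print(size, count_page_allocations(ring_size=size, batch=64, rounds=100))
```

In this model a ring smaller than the batch forces new allocations on every round, while a ring at least as large as the batch only allocates during warm-up.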

[Test Plan]
* Users report that iperf3 is sufficient to showcase the bad RX performance
* For some setups, RX performance was around 15Gbps while TX stayed
  consistently above 30Gbps
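
One way to compare the two directions is to run iperf3 once normally and once with `--reverse`, both with `--json`, and read the receiver-side rate from each result. The sketch below assumes TCP runs, whose JSON output carries `end.sum_received.bits_per_second`. The helper names and the 0.6 threshold are hypothetical, and the two inputs are hand-written stubs, not captured measurements.

```python
import json


def receiver_gbps(iperf3_json_text):
    """Receiver-side throughput in Gbit/s from one `iperf3 --json` TCP run."""
    end = json.loads(iperf3_json_text)["end"]
    return end["sum_received"]["bits_per_second"] / 1e9


def rx_lags_tx(tx_gbps, rx_gbps, ratio=0.6):
    """True if RX is below `ratio` of TX (the threshold is an arbitrary assumption)."""
    return rx_gbps < ratio * tx_gbps


# Hand-written stubs holding only the field read above, not real captures.
forward_run = json.dumps({"end": {"sum_received": {"bits_per_second": 30e9}}})
reverse_run = json.dumps({"end": {"sum_received": {"bits_per_second": 15e9}}})

tx = receiver_gbps(forward_run)   # host under test sends
rx = receiver_gbps(reverse_run)   # host under test receives (--reverse)
print(f"TX {tx:.1f} Gbit/s, RX {rx:.1f} Gbit/s, degraded: {rx_lags_tx(tx, rx)}")
```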

[Fix]
* This patch sets the RX recycle ring size according to an adapter's
  maximum link speed
* Fix was introduced by commit:
  000fe940e51f "sfc: The size of the RX recycle ring should be more flexible"
* --!-- Commit is from net-next --!--
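
The idea of sizing the ring by maximum link speed can be sketched as follows. This is an illustration only, not the code from the commit: the function name, the base size, and the multipliers are assumed values, and the driver itself is written in C. Consult the commit for the actual constants and the way it identifies an adapter's speed.

```python
# Illustrative placeholders, not taken from this report.
BASE_RING_SIZE_10G = 256


def rx_recycle_ring_size(max_link_speed_gbps, base=BASE_RING_SIZE_10G):
    """Pick a recycle ring size that grows with the adapter's top link speed."""
    if max_link_speed_gbps <= 10:
        return base
    if max_link_speed_gbps <= 40:
        return base * 4       # assumed multiplier for 40G-class adapters
    return base * 10          # assumed multiplier for faster adapters


for speed in (10, 40, 100):
    print(speed, rx_recycle_ring_size(speed))
```

Scaling from a single base value keeps slower adapters from reserving more pages than they can use, which is relevant to the memory-usage concern noted under regression potential.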

[Regression Potential]
* Regressions would show primarily as performance issues, as we're
  effectively changing ring sizes for all RX traffic
* It's possible to see increased calls to alloc_pages if ring sizes
  aren't being set correctly
* We should look out for excessive memory usage in the sfc driver due to
  the increased ring sizes

Changed in linux (Ubuntu Impish):
importance: Undecided → High
Changed in linux (Ubuntu Focal):
importance: Undecided → High
Changed in linux (Ubuntu Impish):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Focal):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Impish):
status: New → In Progress
Changed in linux (Ubuntu Jammy):
status: Confirmed → In Progress
Changed in linux (Ubuntu Focal):
status: New → In Progress
Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.13.0-38.43 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
tags: added: verification-needed-focal
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.4.0-106.120 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-5.4/5.4.0-107.121~18.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Heitor Alves de Siqueira (halves) wrote :

Validated for Ubuntu Focal with the current kernel from focal-proposed.
Tested on a 10G NIC that's affected by this patch (device ID 0x0903). iperf3 shows that we're able to almost saturate the NIC successfully in both standard and reverse sender mode:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.3 GBytes  8.83 Gbits/sec  266             sender
[  5]   0.00-10.00  sec  10.3 GBytes  8.82 Gbits/sec                  receiver

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.8 GBytes  9.30 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  10.8 GBytes  9.30 Gbits/sec                  receiver

ubuntu@duduo:~$ uname -rv
5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:10:54 UTC 2022

tags: added: verification-done-focal
removed: verification-needed-focal
Heitor Alves de Siqueira (halves) wrote :

Validated for Ubuntu Bionic with the current HWE kernel from bionic-proposed.
Tested on a 10G NIC that's affected by this patch (device ID 0x0903). iperf3 shows that we're able to almost saturate the NIC successfully in both standard and reverse sender mode:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.2 GBytes  8.80 Gbits/sec  376             sender
[  5]   0.00-10.00  sec  10.2 GBytes  8.80 Gbits/sec                  receiver

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.9 GBytes  9.39 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  10.9 GBytes  9.38 Gbits/sec                  receiver

ubuntu@duduo:~$ uname -rv
5.4.0-108-generic #122~18.04.1-Ubuntu SMP Wed Apr 6 16:57:12 UTC 2022

Other basic smoke testing confirms no major regressions.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Heitor Alves de Siqueira (halves) wrote :

Validated for Ubuntu Impish with the current kernel from impish-proposed.
Tested on a 10G NIC that's affected by this patch (device ID 0x0903). iperf3 shows that we're able to almost saturate the NIC successfully in both standard and reverse sender mode:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.1 GBytes  8.69 Gbits/sec  990             sender
[  5]   0.00-9.99   sec  10.1 GBytes  8.69 Gbits/sec                  receiver

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.8 GBytes  9.30 Gbits/sec  155             sender
[  5]   0.00-10.00  sec  10.8 GBytes  9.30 Gbits/sec                  receiver

ubuntu@duduo:~$ uname -rv
5.13.0-40-generic #45-Ubuntu SMP Tue Mar 29 14:48:14 UTC 2022

Other basic smoke testing confirms no major regressions.

tags: added: verification-done-impish
removed: verification-needed-impish
Heitor Alves de Siqueira (halves) wrote :

In addition to my own validations above, impacted users that have 40G NICs have also confirmed the patch behaves as expected without major regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-40.45

---------------
linux (5.13.0-40.45) impish; urgency=medium

  * impish/linux: 5.13.0-40.45 -proposed tracker (LP: #1966701)

  * CVE-2022-1016
    - netfilter: nf_tables: initialize registers in nft_do_chain()

  * CVE-2022-1015
    - netfilter: nf_tables: validate registers coming from userspace.

  * audit: improve audit queue handling when "audit=1" on cmdline
    (LP: #1965723) // Impish update: upstream stable patchset 2022-03-22
    (LP: #1966021)
    - audit: improve audit queue handling when "audit=1" on cmdline

  * PS/2 Keyboard wakeup from s2idle not functioning on AMD Yellow Carp platform
    (LP: #1961739)
    - PM: s2idle: ACPI: Fix wakeup interrupts handling

  * Low RX performance for 40G Solarflare NICs (LP: #1964512)
    - SAUCE: sfc: The size of the RX recycle ring should be more flexible

  * [UBUNTU 20.04] Fix SIGP processing on KVM/s390 (LP: #1962578)
    - KVM: s390: Simplify SIGP Set Arch handling
    - KVM: s390: Add a routine for setting userspace CPU state

  * Move virtual graphics drivers from linux-modules-extra to linux-modules
    (LP: #1960633)
    - [Packaging] Move VM DRM drivers into modules

  * Impish update: upstream stable patchset 2022-03-09 (LP: #1964422)
    - bnx2x: Utilize firmware 7.13.21.0
    - bnx2x: Invalidate fastpath HSI version for VFs
    - rcu: Tighten rcu_advance_cbs_nowake() checks
    - select: Fix indefinitely sleeping task in poll_schedule_timeout()
    - drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
    - arm64/bpf: Remove 128MB limit for BPF JIT programs
    - Bluetooth: refactor malicious adv data check
    - net: sfp: ignore disabled SFP node
    - net: stmmac: skip only stmmac_ptp_register when resume from suspend
    - s390/hypfs: include z/VM guests with access control group set
    - bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
    - scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP
      devices
    - udf: Restore i_lenAlloc when inode expansion fails
    - udf: Fix NULL ptr deref when converting from inline format
    - efi: runtime: avoid EFIv2 runtime services on Apple x86 machines
    - PM: wakeup: simplify the output logic of pm_show_wakelocks()
    - tracing/histogram: Fix a potential memory leak for kstrdup()
    - tracing: Don't inc err_log entry count if entry allocation fails
    - ceph: properly put ceph_string reference after async create attempt
    - ceph: set pool_ns in new inode layout for async creates
    - fsnotify: fix fsnotify hooks in pseudo filesystems
    - Revert "KVM: SVM: avoid infinite loop on NPF from bad address"
    - perf/x86/intel/uncore: Fix CAS_COUNT_WRITE issue for ICX
    - drm/etnaviv: relax submit size limits
    - KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS
    - netfilter: nft_payload: do not update layer 4 checksum when mangling
      fragments
    - serial: 8250: of: Fix mapped region size when using reg-offset property
    - serial: stm32: fix software flow control transfer
    - tty: n_gsm: fix SW flow control encoding/handling
    - tty: Add support for Brainboxes UC cards.
    - usb-storage: Add unusual-devs...

Changed in linux (Ubuntu Impish):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-109.123

---------------
linux (5.4.0-109.123) focal; urgency=medium

  * focal/linux: 5.4.0-109.123 -proposed tracker (LP: #1968290)

  * USB devices not detected during boot on USB 3.0 hubs (LP: #1968210)
    - SAUCE: Revert "Revert "xhci: Set HCD flag to defer primary roothub
      registration""
    - SAUCE: Revert "Revert "usb: core: hcd: Add support for deferring roothub
      registration""

linux (5.4.0-108.122) focal; urgency=medium

  * focal/linux: 5.4.0-108.122 -proposed tracker (LP: #1966740)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync dkms-build{,--nvidia-N} from LRMv5
    - debian/dkms-versions -- update from kernel-versions (main/2022.03.21)

  * Low RX performance for 40G Solarflare NICs (LP: #1964512)
    - SAUCE: sfc: The size of the RX recycle ring should be more flexible

  * [UBUNTU 20.04] KVM: Enable storage key checking for intercepted instruction
    (LP: #1962831)
    - selftests: kvm: add _vm_ioctl
    - selftests: kvm: Introduce the TEST_FAIL macro
    - KVM: selftests: Add GUEST_ASSERT variants to pass values to host
    - KVM: s390: gaccess: Refactor gpa and length calculation
    - KVM: s390: gaccess: Refactor access address range check
    - KVM: s390: gaccess: Cleanup access to guest pages
    - s390/uaccess: introduce bit field for OAC specifier
    - s390/uaccess: fix compile error
    - s390/uaccess: Add copy_from/to_user_key functions
    - KVM: s390: Honor storage keys when accessing guest memory
    - KVM: s390: handle_tprot: Honor storage keys
    - KVM: s390: selftests: Test TEST PROTECTION emulation
    - KVM: s390: Add optional storage key checking to MEMOP IOCTL
    - KVM: s390: Add vm IOCTL for key checked guest absolute memory access
    - KVM: s390: Rename existing vcpu memop functions
    - KVM: s390: Add capability for storage key extension of MEM_OP IOCTL
    - KVM: s390: Update api documentation for memop ioctl
    - KVM: s390: Clarify key argument for MEM_OP in api docs
    - KVM: s390: Add missing vm MEM_OP size check

  * 【sec-0911】 fail to reset sec module (LP: #1943301)
    - crypto: hisilicon/sec2 - Add workqueue for SEC driver.
    - crypto: hisilicon/sec2 - update SEC initialization and reset

  * Lots of hisi_qm zombie task slow down system after stress test
    (LP: #1932117)
    - crypto: hisilicon - Use one workqueue per qm instead of per qp

  * Lots of hisi_qm zombie task slow down system after stress test
    (LP: #1932117) // 【sec-0911】 fail to reset sec module (LP: #1943301)
    - crypto: hisilicon - Unify hardware error init/uninit into QM

  * [UBUNTU 20.04] Fix SIGP processing on KVM/s390 (LP: #1962578)
    - KVM: s390: Simplify SIGP Set Arch handling
    - KVM: s390: Add a routine for setting userspace CPU state

  * Move virtual graphics drivers from linux-modules-extra to linux-modules
    (LP: #1960633)
    - [Packaging] Move VM DRM drivers into modules

  * Focal update: v5.4.178 upstream stable release (LP: #1964634)
    - audit: improve audit queue handling when "audit=1" on cmdline
    - ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
    - ASoC: ops: Reject out of bounds values in snd_...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released