Mellanox Check For Write Combining Support

Bug #1874503 reported by Joseph Salisbury
This bug affects 1 person
Affects                      Status         Importance   Assigned to     Milestone
linux-azure (Ubuntu)         Fix Released   Undecided    Marcelo Cerri
  Bionic                     Invalid        Undecided    Unassigned
  Focal                      Fix Released   Undecided    Unassigned
linux-azure-4.15 (Ubuntu)    New            Undecided    Unassigned
  Bionic                     Fix Released   Undecided    Unassigned
  Focal                      Invalid        Undecided    Unassigned

Bug Description

[Impact]

Microsoft and Mellanox would like to request the following two commits in the releases that run on Azure:

3f89b01f4bba IB/mlx5: Align usage of QP1 create flags with rest of mlx5 defines
11f552e21755 IB/mlx5: Test write combining support

These commits landed in mainline as of v5.5-rc1.

These commits will improve the network performance of RDMA from the VF when write combining (WC) can be used. The greatest benefit is to RDMA and DPDK, but UCX should also see improved latency.

The Mellanox driver uses WC to optimize posting work to the HCA, and getting this wrong in either direction can cause a significant performance loss. These patches prevent the possible performance loss.

[Test Case]

Measure network performance with and without the patches applied and compare the results to check whether there is a gain.
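A minimal sketch of how such a comparison could be summarized, assuming throughput samples have already been collected on an unpatched and a patched kernel with whatever benchmark is in use. The function name and the sample values are hypothetical placeholders, not measurements from this bug.

```python
from statistics import mean


def relative_gain(baseline, patched):
    """Return the fractional change of the patched mean relative to the baseline mean."""
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return (mean(patched) - base) / base


if __name__ == "__main__":
    # Hypothetical placeholder throughput samples (Gb/s); replace with real runs.
    baseline_gbps = [10.0, 10.0, 10.0]
    patched_gbps = [11.0, 11.0, 11.0]
    print(f"relative change: {relative_gain(baseline_gbps, patched_gbps):+.1%}")
```

For throughput, a positive value indicates a gain. For latency samples the sign is reversed, so a negative value is the improvement.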

[Regression Potential]

The changes mainly affect the mlx5 driver and IB. Many instances do not use these, but high-performance users do.

Marcelo Cerri (mhcerri)
description: updated
Marcelo Cerri (mhcerri) wrote :
Changed in linux-azure (Ubuntu Bionic):
status: New → Invalid
Changed in linux-azure-4.15 (Ubuntu Focal):
status: New → Invalid
Changed in linux-azure (Ubuntu Focal):
status: New → In Progress
Changed in linux-azure-4.15 (Ubuntu Bionic):
status: New → In Progress
Changed in linux-azure (Ubuntu):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux-azure-4.15 (Ubuntu Bionic):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure-4.15 - 4.15.0-1096.106

---------------
linux-azure-4.15 (4.15.0-1096.106) bionic; urgency=medium

  * bionic/linux-azure-4.15: 4.15.0-1096.106 -proposed tracker (LP: #1894683)

  * Mellanox Check For Write Combining Support (LP: #1874503)
    - RDMA/rxe: Close a race after ib_register_device
    - IB/mlx5: Align usage of QP1 create flags with rest of mlx5 defines
    - IB/mlx5: Test write combining support

  * Enable Invariant TSC Support (LP: #1875467)
    - x86/hyperv: Allow guests to enable InvariantTSC
    - x86/hyperv: Set TSC clocksource as default w/ InvariantTSC

  * Only notify Hyper-V for die events that are oops (LP: #1891222)
    - Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops

  * speed for CX4 VF showing as unknown in ethtool output (LP: #1876770)
    - net/mlx5: Expose link speed directly
    - net/mlx5: Expose port speed when possible
    - net/mlx5: Tidy up and fix reverse christmas ordring

  [ Ubuntu: 4.15.0-118.119 ]

  * bionic/linux: 4.15.0-118.119 -proposed tracker (LP: #1894697)
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * Introduce the new NVIDIA 450-server and the 450 UDA series (LP: #1887674)
    - [packaging] add signed modules for nvidia 450 and 450-server
  * cgroup refcount is bogus when cgroup_sk_alloc is disabled (LP: #1886860)
    - cgroup: add missing skcd->no_refcnt check in cgroup_sk_clone()
  * CVE-2020-12888
    - vfio/type1: Support faulting PFNMAP vmas
    - vfio-pci: Fault mmaps to enable vma tracking
    - vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
  * [Hyper-V] VSS and File Copy daemons intermittently fails to start
    (LP: #1891224)
    - [Packaging] Bind hv_vss_daemon startup to hv_vss device
    - [Packaging] bind hv_fcopy_daemon startup to hv_fcopy device
  * KVM: Fix zero_page reference counter overflow when using KSM on KVM compute
    host (LP: #1837810)
    - KVM: fix overflow of zero page refcount with ksm running
  * Fix false-negative return value for rtnetlink.sh in kselftests/net
    (LP: #1890136)
    - selftests: rtnetlink: correct the final return value for the test
    - selftests: rtnetlink: make kci_test_encap() return sub-test result
  * Bionic update: upstream stable patchset 2020-08-18 (LP: #1892091)
    - USB: serial: qcserial: add EM7305 QDL product ID
    - USB: iowarrior: fix up report size handling for some devices
    - usb: xhci: define IDs for various ASMedia host controllers
    - usb: xhci: Fix ASMedia ASM1142 DMA addressing
    - Revert "ALSA: hda: call runtime_allow() for all hda controllers"
    - ALSA: seq: oss: Serialize ioctls
    - staging: android: ashmem: Fix lockdep warning for write operation
    - Bluetooth: Fix slab-out-of-bounds read in hci_extended_inquiry_result_evt()
    - Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_evt()
    - Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_with_rssi_evt()
    - omapfb: dss: Fix max fclk divider for omap36xx
    - binder: Prevent context manager from incrementing ref 0
    - vgacon: Fix for missing check in scrollback handling
    - mtd: properly check all write i...

Changed in linux-azure-4.15 (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 4.15.0-1096.106~16.04.1

---------------
linux-azure (4.15.0-1096.106~16.04.1) xenial; urgency=medium

  * xenial/linux-azure: 4.15.0-1096.106~16.04.1 -proposed tracker (LP: #1894681)

  [ Ubuntu: 4.15.0-1096.106 ]

  * bionic/linux-azure-4.15: 4.15.0-1096.106 -proposed tracker (LP: #1894683)
  * Mellanox Check For Write Combining Support (LP: #1874503)
    - RDMA/rxe: Close a race after ib_register_device
    - IB/mlx5: Align usage of QP1 create flags with rest of mlx5 defines
    - IB/mlx5: Test write combining support
  * Enable Invariant TSC Support (LP: #1875467)
    - x86/hyperv: Allow guests to enable InvariantTSC
    - x86/hyperv: Set TSC clocksource as default w/ InvariantTSC
  * Only notify Hyper-V for die events that are oops (LP: #1891222)
    - Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops
  * speed for CX4 VF showing as unknown in ethtool output (LP: #1876770)
    - net/mlx5: Expose link speed directly
    - net/mlx5: Expose port speed when possible
    - net/mlx5: Tidy up and fix reverse christmas ordring
  * bionic/linux: 4.15.0-118.119 -proposed tracker (LP: #1894697)
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * Introduce the new NVIDIA 450-server and the 450 UDA series (LP: #1887674)
    - [packaging] add signed modules for nvidia 450 and 450-server
  * cgroup refcount is bogus when cgroup_sk_alloc is disabled (LP: #1886860)
    - cgroup: add missing skcd->no_refcnt check in cgroup_sk_clone()
  * CVE-2020-12888
    - vfio/type1: Support faulting PFNMAP vmas
    - vfio-pci: Fault mmaps to enable vma tracking
    - vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
  * [Hyper-V] VSS and File Copy daemons intermittently fails to start
    (LP: #1891224)
    - [Packaging] Bind hv_vss_daemon startup to hv_vss device
    - [Packaging] bind hv_fcopy_daemon startup to hv_fcopy device
  * KVM: Fix zero_page reference counter overflow when using KSM on KVM compute
    host (LP: #1837810)
    - KVM: fix overflow of zero page refcount with ksm running
  * Fix false-negative return value for rtnetlink.sh in kselftests/net
    (LP: #1890136)
    - selftests: rtnetlink: correct the final return value for the test
    - selftests: rtnetlink: make kci_test_encap() return sub-test result
  * Bionic update: upstream stable patchset 2020-08-18 (LP: #1892091)
    - USB: serial: qcserial: add EM7305 QDL product ID
    - USB: iowarrior: fix up report size handling for some devices
    - usb: xhci: define IDs for various ASMedia host controllers
    - usb: xhci: Fix ASMedia ASM1142 DMA addressing
    - Revert "ALSA: hda: call runtime_allow() for all hda controllers"
    - ALSA: seq: oss: Serialize ioctls
    - staging: android: ashmem: Fix lockdep warning for write operation
    - Bluetooth: Fix slab-out-of-bounds read in hci_extended_inquiry_result_evt()
    - Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_evt()
    - Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_with_rssi_evt()
    - omapfb: dss: Fix max fclk divider for omap36xx
    - binder: Prevent context manager from incrementing ref 0
    - vgacon...

Changed in linux-azure (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 5.4.0-1026.26

---------------
linux-azure (5.4.0-1026.26) focal; urgency=medium

  * focal/linux-azure: 5.4.0-1026.26 -proposed tracker (LP: #1894641)

  * Mellanox Check For Write Combining Support (LP: #1874503)
    - IB/mlx5: Align usage of QP1 create flags with rest of mlx5 defines
    - IB/mlx5: Test write combining support

  * Enable Invariant TSC Support (LP: #1875467)
    - x86/hyperv: Allow guests to enable InvariantTSC
    - clocksource/drivers/hyper-v: Set TSC clocksource as default w/ InvariantTSC

  * Only notify Hyper-V for die events that are oops (LP: #1891222)
    - Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops

  * speed for CX4 VF showing as unknown in ethtool output (LP: #1876770)
    - net/mlx5: Expose link speed directly
    - net/mlx5: Expose port speed when possible
    - net/mlx5: Tidy up and fix reverse christmas ordring

  [ Ubuntu: 5.4.0-48.52 ]

  * focal/linux: 5.4.0-48.52 -proposed tracker (LP: #1894654)
  * mm/slub kernel oops on focal kernel 5.4.0-45 (LP: #1895109)
    - SAUCE: Revert "mm/slub: fix a memory leak in sysfs_slab_add()"
  * Packaging resync (LP: #1786013)
    - update dkms package versions
    - update dkms package versions
  * Introduce the new NVIDIA 450-server and the 450 UDA series (LP: #1887674)
    - [packaging] add signed modules for nvidia 450 and 450-server
  * [UBUNTU 20.04] zPCI attach/detach issues with PF/VF linking support
    (LP: #1892849)
    - s390/pci: fix zpci_bus_link_virtfn()
    - s390/pci: re-introduce zpci_remove_device()
    - s390/pci: fix PF/VF linking on hot plug
  * [UBUNTU 20.04] kernel: s390/cpum_cf,perf: changeDFLT_CCERROR counter name
    (LP: #1891454)
    - s390/cpum_cf, perf: change DFLT_CCERROR counter name
  * [UBUNTU 20.04] zPCI: Enabling of a reserved PCI function regression
    introduced by multi-function support (LP: #1891437)
    - s390/pci: fix enabling a reserved PCI function
  * CVE-2020-12888
    - vfio/type1: Support faulting PFNMAP vmas
    - vfio-pci: Fault mmaps to enable vma tracking
    - vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
  * [Hyper-V] VSS and File Copy daemons intermittently fails to start
    (LP: #1891224)
    - [Packaging] Bind hv_vss_daemon startup to hv_vss device
    - [Packaging] bind hv_fcopy_daemon startup to hv_fcopy device
  * alsa/hdmi: support nvidia mst hdmi/dp audio (LP: #1867704)
    - ALSA: hda - Rename snd_hda_pin_sense to snd_hda_jack_pin_sense
    - ALSA: hda - Add DP-MST jack support
    - ALSA: hda - Add DP-MST support for non-acomp codecs
    - ALSA: hda - Add DP-MST support for NVIDIA codecs
    - ALSA: hda: hdmi - fix regression in connect list handling
    - ALSA: hda: hdmi - fix kernel oops caused by invalid PCM idx
    - ALSA: hda: hdmi - preserve non-MST PCM routing for Intel platforms
    - ALSA: hda: hdmi - Keep old slot assignment behavior for Intel platforms
    - ALSA: hda - Fix DP-MST support for NVIDIA codecs
  * Focal update: v5.4.60 upstream stable release (LP: #1892899)
    - smb3: warn on confusing error scenario with sec=krb5
    - genirq/affinity: Make affinity setting if activated opt-in
    - ...

Changed in linux-azure (Ubuntu Focal):
status: Fix Committed → Fix Released