RDMA Back port DMA buffer fix

Bug #2004807 reported by Tim Gardner
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux-aws (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Released
Medium
Tim Gardner
Jammy
Fix Released
Medium
Tim Gardner
Kinetic
Fix Released
Medium
Tim Gardner

Bug Description

SRU Justification

[Impact]

When registering a new DMA MR after selecting the best aligned page size
for it, we iterate over the given sglist to split each entry to smaller,
aligned to the selected page size, DMA blocks.

In given circumstances where the sg entry and page size fit certain
sizes and the sg entry is not aligned to the selected page size, the
total size of the aligned pages we need to cover the sg entry is >= 4GB.
Under this circumstances, while iterating page aligned blocks, the
counter responsible for counting how much we advanced from the start of
the sg entry is overflowed because its type is u32 and we pass 4GB in
size. This can lead to an infinite loop inside the iterator function
because the overflow prevents the counter to be larger
than the size of the sg entry.

Fixes: a808273 ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks")

[Test Plan]

AWS tested

[Where things could go wrong]

What could possibly go wrong with Remote DMA scatter/gather list errors ?

[Other Info]

SF: #00353710

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2004807

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Tim Gardner (timg-tpi)
affects: linux (Ubuntu) → linux-aws (Ubuntu)
Changed in linux-aws (Ubuntu):
status: Incomplete → New
Changed in linux-aws (Ubuntu Focal):
assignee: nobody → Tim Gardner (timg-tpi)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-aws (Ubuntu Jammy):
assignee: nobody → Tim Gardner (timg-tpi)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-aws (Ubuntu Kinetic):
assignee: nobody → Tim Gardner (timg-tpi)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-aws (Ubuntu):
status: New → Fix Released
Revision history for this message
Tim Gardner (timg-tpi) wrote :
Tim Gardner (timg-tpi)
Changed in linux-aws (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux-aws (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux-aws (Ubuntu Kinetic):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.19.0-1020.21 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-aws verification-needed-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.4.0-1097.105 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-aws verification-needed-focal
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.15.0-1031.35 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-aws verification-needed-jammy
Tim Gardner (timg-tpi)
tags: added: verification-done-focal verification-done-jammy verification-done-kinetic
removed: verification-needed-focal verification-needed-jammy verification-needed-kinetic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (154.6 KiB)

This bug was fixed in the package linux-aws - 5.15.0-1031.35

---------------
linux-aws (5.15.0-1031.35) jammy; urgency=medium

  * jammy/linux-aws: 5.15.0-1031.35 -proposed tracker (LP: #2004305)

  * Jammy update: v5.15.81 upstream stable release (LP: #2003130)
    - [Config] aws: Updates after rebase

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2023.01.30)

  * Regression in ext4 during online resize (LP: #2003816)
    - ext4: fix bad checksum after online resize
    - ext4: fix corruption when online resizing a 1K bigalloc fs
    - SAUCE: Export ext4_superblock_csum function
    - ext4: fix corrupt backup group descriptors after online resize

  * cma alloc failure in large 5.15 arm instances (LP: #1990167)
    - [Config] aws: Disable CONFIG_CMA for arm64

  * RDMA Back port DMA buffer fix (LP: #2004807)
    - RDMA/core: Fix ib block iterator counter overflow

  [ Ubuntu: 5.15.0-66.73 ]

  * jammy/linux: 5.15.0-66.73 -proposed tracker (LP: #2004636)
  * CVE-2023-0461
    - SAUCE: Fix inet_csk_listen_start after CVE-2023-0461

  [ Ubuntu: 5.15.0-65.72 ]

  * jammy/linux: 5.15.0-65.72 -proposed tracker (LP: #2004344)
  * Packaging resync (LP: #1786013)
    - [Packaging] update variants
    - debian/dkms-versions -- update from kernel-versions (main/2023.01.30)
  * NFS: client permission error after adding user to permissible group
    (LP: #2003053)
    - NFS: Clear the file access cache upon login
    - NFS: Judge the file access cache's timestamp in rcu path
    - NFS: Fix up a sparse warning
  * Fix W6400 hang after resume of S3 stress (LP: #2000299)
    - drm/amd/display: Manually adjust strobe for DCN303
  * Rear Audio port sometimes has no audio output after reboot(Cirrus Logic)
    (LP: #1998905)
    - ALSA: hda/cirrus: Add extra 10 ms delay to allow PLL settle and lock.
  * CVE-2022-20369
    - NFSD: fix use-after-free in __nfs42_ssc_open()
  * CVE-2023-0461
    - net/ulp: prevent ULP without clone op from entering the LISTEN status
    - net/ulp: use consistent error code when blocking ULP
  * CVE-2023-0179
    - netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits
  * Jammy update: v5.15.85 upstream stable release (LP: #2003139)
    - udf: Discard preallocation before extending file with a hole
    - udf: Fix preallocation discarding at indirect extent boundary
    - udf: Do not bother looking for prealloc extents if i_lenExtents matches
      i_size
    - udf: Fix extending file within last block
    - usb: gadget: uvc: Prevent buffer overflow in setup handler
    - USB: serial: option: add Quectel EM05-G modem
    - USB: serial: cp210x: add Kamstrup RF sniffer PIDs
    - USB: serial: f81232: fix division by zero on line-speed change
    - USB: serial: f81534: fix division by zero on line-speed change
    - xhci: Apply XHCI_RESET_TO_DEFAULT quirk to ADL-N
    - igb: Initialize mailbox message for VF reset
    - usb: dwc3: pci: Update PCIe device ID for USB3 controller on CPU sub-system
      for Raptor Lake
    - HID: uclogic: Add HID_QUIRK_HIDINPUT_FORCE quirk
    - selftests: net: Use "grep -E" instead of "egrep"
    - net: loopback: use NET_NAME_PR...

Changed in linux-aws (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (189.1 KiB)

This bug was fixed in the package linux-aws - 5.19.0-1020.21

---------------
linux-aws (5.19.0-1020.21) kinetic; urgency=medium

  * kinetic/linux-aws: 5.19.0-1020.21 -proposed tracker (LP: #2004284)

  * Kinetic update: upstream stable patchset 2023-01-27 (LP: #2004051)
    - [Config] Update configs after rebase

  * cma alloc failure in large 5.15 arm instances (LP: #1990167)
    - [Config] aws: Disable CONFIG_CMA for arm64

  * RDMA Back port DMA buffer fix (LP: #2004807)
    - RDMA/core: Fix ib block iterator counter overflow

  [ Ubuntu: 5.19.0-35.36 ]

  * kinetic/linux: 5.19.0-35.36 -proposed tracker (LP: #2004652)
  * CVE-2023-0461
    - SAUCE: Fix inet_csk_listen_start after CVE-2023-0461

  [ Ubuntu: 5.19.0-34.35 ]

  * kinetic/linux: 5.19.0-34.35 -proposed tracker (LP: #2004299)
  * LXD containers using shiftfs on ZFS or TMPFS broken on 5.15.0-48.54
    (LP: #1990849)
    - [SAUCE] shiftfs: fix -EOVERFLOW inside the container
  * Kinetic update: upstream stable patchset 2023-01-27 (LP: #2004051)
    - ASoC: fsl_sai: use local device pointer
    - serial: Add rs485_supported to uart_port
    - serial: fsl_lpuart: Fill in rs485_supported
    - x86/sgx: Create utility to validate user provided offset and length
    - x86/sgx: Add overflow check in sgx_validate_offset_length()
    - binder: validate alloc->mm in ->mmap() handler
    - ceph: Use kcalloc for allocating multiple elements
    - ceph: fix NULL pointer dereference for req->r_session
    - wifi: mac80211: fix memory free error when registering wiphy fail
    - wifi: mac80211_hwsim: fix debugfs attribute ps with rc table support
    - riscv: dts: sifive unleashed: Add PWM controlled LEDs
    - audit: fix undefined behavior in bit shift for AUDIT_BIT
    - wifi: airo: do not assign -1 to unsigned char
    - wifi: mac80211: Fix ack frame idr leak when mesh has no route
    - wifi: ath11k: Fix QCN9074 firmware boot on x86
    - spi: stm32: fix stm32_spi_prepare_mbr() that halves spi clk for every run
    - selftests/bpf: Add verifier test for release_reference()
    - Revert "net: macsec: report real_dev features when HW offloading is enabled"
    - platform/x86: ideapad-laptop: Disable touchpad_switch
    - platform/x86: touchscreen_dmi: Add info for the RCA Cambio W101 v2 2-in-1
    - platform/x86/intel/pmt: Sapphire Rapids PMT errata fix
    - scsi: ibmvfc: Avoid path failures during live migration
    - scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC
    - drm: panel-orientation-quirks: Add quirk for Acer Switch V 10 (SW5-017)
    - block, bfq: fix null pointer dereference in bfq_bio_bfqg()
    - arm64/syscall: Include asm/ptrace.h in syscall_wrapper header.
    - nvmet: fix memory leak in nvmet_subsys_attr_model_store_locked
    - Revert "drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10
      properly""
    - ALSA: usb-audio: add quirk to fix Hamedal C20 disconnect issue
    - RISC-V: vdso: Do not add missing symbols to version section in linker script
    - MIPS: pic32: treat port as signed integer
    - xfrm: fix "disable_policy" on ipv4 early demux
    - xfrm: replay: Fix ESN wrap around for GSO
    - af_key: Fix send_acquire race wit...

Changed in linux-aws (Ubuntu Kinetic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (114.3 KiB)

This bug was fixed in the package linux-aws - 5.4.0-1097.105

---------------
linux-aws (5.4.0-1097.105) focal; urgency=medium

  * focal/linux-aws: 5.4.0-1097.105 -proposed tracker (LP: #2004355)

  * Focal update: v5.4.226 upstream stable release (LP: #2003896)
    - [Config] aws: updateconfigs for INET_TABLE_PERTURB_ORDER

  * RDMA Back port DMA buffer fix (LP: #2004807)
    - RDMA/core: Fix ib block iterator counter overflow

  [ Ubuntu: 5.4.0-144.161 ]

  * focal/linux: 5.4.0-144.161 -proposed tracker (LP: #2004653)
  * CVE-2023-0461
    - SAUCE: Fix inet_csk_listen_start after CVE-2023-0461

  [ Ubuntu: 5.4.0-143.160 ]

  * focal/linux: 5.4.0-143.160 -proposed tracker (LP: #2004385)
  * NFS: client permission error after adding user to permissible group
    (LP: #2003053)
    - NFS: Clear the file access cache upon login
    - NFS: Judge the file access cache's timestamp in rcu path
    - NFS: Fix up a sparse warning
  * Focal update: v5.4.229 upstream stable release (LP: #2003914)
    - tracing/ring-buffer: Only do full wait when cpu != RING_BUFFER_ALL_CPUS
    - udf: Discard preallocation before extending file with a hole
    - udf: Fix preallocation discarding at indirect extent boundary
    - udf: Do not bother looking for prealloc extents if i_lenExtents matches
      i_size
    - udf: Fix extending file within last block
    - usb: gadget: uvc: Prevent buffer overflow in setup handler
    - USB: serial: option: add Quectel EM05-G modem
    - USB: serial: cp210x: add Kamstrup RF sniffer PIDs
    - USB: serial: f81232: fix division by zero on line-speed change
    - USB: serial: f81534: fix division by zero on line-speed change
    - igb: Initialize mailbox message for VF reset
    - xen-netback: move removal of "hotplug-status" to the right place
    - HID: ite: Add support for Acer S1002 keyboard-dock
    - HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch 10E
    - HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch V 10
    - HID: uclogic: Add HID_QUIRK_HIDINPUT_FORCE quirk
    - net: loopback: use NET_NAME_PREDICTABLE for name_assign_type
    - usb: musb: remove extra check in musb_gadget_vbus_draw
    - ARM: dts: qcom: apq8064: fix coresight compatible
    - arm64: dts: qcom: sdm845-cheza: fix AP suspend pin bias
    - drivers: soc: ti: knav_qmss_queue: Mark knav_acc_firmwares as static
    - arm: dts: spear600: Fix clcd interrupt
    - soc: ti: knav_qmss_queue: Use pm_runtime_resume_and_get instead of
      pm_runtime_get_sync
    - soc: ti: knav_qmss_queue: Fix PM disable depth imbalance in knav_queue_probe
    - soc: ti: smartreflex: Fix PM disable depth imbalance in omap_sr_probe
    - perf: arm_dsu: Fix hotplug callback leak in dsu_pmu_init()
    - perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init()
    - arm64: dts: mt2712e: Fix unit_address_vs_reg warning for oscillators
    - arm64: dts: mt2712e: Fix unit address for pinctrl node
    - arm64: dts: mt2712-evb: Fix vproc fixed regulators unit names
    - arm64: dts: mt2712-evb: Fix usb vbus regulators unit names
    - arm64: dts: mediatek: mt6797: Fix 26M oscillator unit name
    - ARM: dts: dove: Fix assigned-addresses for eve...

Changed in linux-aws (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws-5.15/5.15.0-1046.51~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-done-focal-linux-aws-5.15'. If the problem still exists, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-failed-focal-linux-aws-5.15'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-aws-5.15-v2 verification-needed-focal-linux-aws-5.15
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.