EFA: add support for 0xefa1 devices

Bug #1896791 reported by Kamal Mostafa
This bug affects 1 person
Affects               Status         Importance   Assigned to     Milestone
linux (Ubuntu)        Fix Released   Undecided    Kamal Mostafa
  Focal               Fix Released   Undecided    Kamal Mostafa
  Groovy              Fix Released   Undecided    Kamal Mostafa
linux-aws (Ubuntu)    Fix Released   Undecided    Unassigned
  Bionic              Fix Released   Undecided    Kamal Mostafa

Bug Description

AWS RDMA/efa driver: add support for new AWS EFA '0xefa1' devices.

[Impact]

The following 4 mainline commits are required to support the new device features and ID:

d4f9cb5c5b22 RDMA/efa: Add EFA 0xefa1 PCI ID
a5d87b698547 RDMA/efa: User/kernel compatibility handshake mechanism
da2924bdca99 RDMA/efa: Expose minimum SQ size
556c811f24b3 RDMA/efa: Expose maximum TX doorbell batch
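
To check whether an instance actually exposes one of these devices, a minimal sketch along the following lines could scan the PCI devices listed in sysfs. This is not part of the patch set. The Amazon PCI vendor ID (0x1d0f), the older 0xefa0 device ID and the helper name `find_efa_devices` are assumptions that do not come from this report; only 0xefa1 is named here.

```python
from pathlib import Path

# Assumptions (not stated in this report): 0x1d0f is the Amazon PCI vendor ID
# and 0xefa0 is the earlier EFA device ID. 0xefa1 is the ID added by the
# "RDMA/efa: Add EFA 0xefa1 PCI ID" commit.
AMAZON_VENDOR_ID = 0x1D0F
EFA_DEVICE_IDS = {0xEFA0, 0xEFA1}


def find_efa_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return (pci_address, device_id) pairs for EFA devices under sysfs_root."""
    found = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        try:
            # sysfs exposes the IDs as text such as "0x1d0f\n"
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries without readable ID files
        if vendor == AMAZON_VENDOR_ID and device in EFA_DEVICE_IDS:
            found.append((dev.name, device))
    return found


if __name__ == "__main__":
    for address, device_id in find_efa_devices():
        print(f"{address}: EFA device {device_id:#06x}")
```

Finding a 0xefa1 entry only shows that the device is present; whether the efa driver binds to it still depends on the running kernel carrying the commits listed above.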

[Test Case]

The new device is not yet generally available, but it has been tested by AWS.

[Regression Potential]

Low regression potential; the changes affect only the EFA driver.

[Other Info]

Focal and Groovy generic kernels can easily support this patch set, so let's do that to provide wider support and an updated driver source baseline for those releases.

Bionic can only support it in bionic/linux-aws (not generic), so let's do that.

no longer affects: linux (Ubuntu Bionic)
no longer affects: linux-aws (Ubuntu Focal)
no longer affects: linux-aws (Ubuntu Groovy)
Changed in linux (Ubuntu Focal):
status: New → In Progress
Changed in linux-aws (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Kamal Mostafa (kamalmostafa)
Changed in linux (Ubuntu Focal):
assignee: nobody → Kamal Mostafa (kamalmostafa)
description: updated
Ian May (ian-may)
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.8.0-21.22

---------------
linux (5.8.0-21.22) groovy; urgency=medium

  * groovy/linux: 5.8.0-21.22 -proposed tracker (LP: #1898150)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * Fix broken e1000e device after S3 (LP: #1897755)
    - SAUCE: e1000e: Increase polling timeout on MDIC ready bit

  * EFA: add support for 0xefa1 devices (LP: #1896791)
    - RDMA/efa: Expose maximum TX doorbell batch
    - RDMA/efa: Expose minimum SQ size
    - RDMA/efa: User/kernel compatibility handshake mechanism
    - RDMA/efa: Add EFA 0xefa1 PCI ID

  * Groovy update: v5.8.13 upstream stable release (LP: #1898076)
    - device_cgroup: Fix RCU list debugging warning
    - ASoC: pcm3168a: ignore 0 Hz settings
    - ASoC: wm8994: Skip setting of the WM8994_MICBIAS register for WM1811
    - ASoC: wm8994: Ensure the device is resumed in wm89xx_mic_detect functions
    - ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN Converter9 2-in-1
    - clk: versatile: Add of_node_put() before return statement
    - RISC-V: Take text_mutex in ftrace_init_nop()
    - i2c: aspeed: Mask IRQ status to relevant bits
    - s390/init: add missing __init annotations
    - lockdep: fix order in trace_hardirqs_off_caller()
    - EDAC/ghes: Check whether the driver is on the safe list correctly
    - drm/amdkfd: fix a memory leak issue
    - drm/amd/display: Don't use DRM_ERROR() for DTM add topology
    - drm/amd/display: update nv1x stutter latencies
    - drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC is
    - drm/amd/display: Don't log hdcp module warnings in dmesg
    - objtool: Fix noreturn detection for ignored functions
    - i2c: mediatek: Send i2c master code at more than 1MHz
    - riscv: Fix Kendryte K210 device tree
    - ieee802154: fix one possible memleak in ca8210_dev_com_init
    - ieee802154/adf7242: check status of adf7242_read_reg
    - clocksource/drivers/h8300_timer8: Fix wrong return value in
      h8300_8timer_init()
    - batman-adv: bla: fix type misuse for backbone_gw hash indexing
    - libbpf: Fix build failure from uninitialized variable warning
    - atm: eni: fix the missed pci_disable_device() for eni_init_one()
    - batman-adv: mcast/TT: fix wrongly dropped or rerouted packets
    - netfilter: ctnetlink: add a range check for l3/l4 protonum
    - netfilter: ctnetlink: fix mark based dump filtering regression
    - netfilter: conntrack: nf_conncount_init is failing with IPv6 disabled
    - netfilter: nft_meta: use socket user_ns to retrieve skuid and skgid
    - mac802154: tx: fix use-after-free
    - bpf: Fix clobbering of r2 in bpf_gen_ld_abs
    - tools/libbpf: Avoid counting local symbols in ABI check
    - drm/vc4/vc4_hdmi: fill ASoC card owner
    - net: qed: Disable aRFS for NPAR and 100G
    - net: qede: Disable aRFS for NPAR and 100G
    - net: qed: RDMA personality shouldn't fail VF load
    - igc: Fix wrong timestamp latency numbers
    - igc: Fix not considering the TX delay for timestamps
    - drm/sun4i: sun8i-csc: Secondary CSC register correction
    - hv_netvsc: Switch the data path at the right time during hibernation
    - spi: spi-fsl-dspi:...

Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.8.0-1007.7

---------------
linux-aws (5.8.0-1007.7) groovy; urgency=medium

  * groovy/linux-aws: 5.8.0-1007.7 -proposed tracker (LP: #1898143)

  * Groovy kernel (5.8.0-1004-aws) creates broken /dev/console on i3.metal
    instances (LP: #1896604)
    - [Config] [aws] set default nr_uarts back to 4 on amd64

  * Miscellaneous Ubuntu changes
    - [Config] toolchain update

  [ Ubuntu: 5.8.0-21.22 ]

  * groovy/linux: 5.8.0-21.22 -proposed tracker (LP: #1898150)
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * Fix broken e1000e device after S3 (LP: #1897755)
    - SAUCE: e1000e: Increase polling timeout on MDIC ready bit
  * EFA: add support for 0xefa1 devices (LP: #1896791)
    - RDMA/efa: Expose maximum TX doorbell batch
    - RDMA/efa: Expose minimum SQ size
    - RDMA/efa: User/kernel compatibility handshake mechanism
    - RDMA/efa: Add EFA 0xefa1 PCI ID
  * Groovy update: v5.8.13 upstream stable release (LP: #1898076)
    - device_cgroup: Fix RCU list debugging warning
    - ASoC: pcm3168a: ignore 0 Hz settings
    - ASoC: wm8994: Skip setting of the WM8994_MICBIAS register for WM1811
    - ASoC: wm8994: Ensure the device is resumed in wm89xx_mic_detect functions
    - ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN Converter9 2-in-1
    - clk: versatile: Add of_node_put() before return statement
    - RISC-V: Take text_mutex in ftrace_init_nop()
    - i2c: aspeed: Mask IRQ status to relevant bits
    - s390/init: add missing __init annotations
    - lockdep: fix order in trace_hardirqs_off_caller()
    - EDAC/ghes: Check whether the driver is on the safe list correctly
    - drm/amdkfd: fix a memory leak issue
    - drm/amd/display: Don't use DRM_ERROR() for DTM add topology
    - drm/amd/display: update nv1x stutter latencies
    - drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC is
    - drm/amd/display: Don't log hdcp module warnings in dmesg
    - objtool: Fix noreturn detection for ignored functions
    - i2c: mediatek: Send i2c master code at more than 1MHz
    - riscv: Fix Kendryte K210 device tree
    - ieee802154: fix one possible memleak in ca8210_dev_com_init
    - ieee802154/adf7242: check status of adf7242_read_reg
    - clocksource/drivers/h8300_timer8: Fix wrong return value in
      h8300_8timer_init()
    - batman-adv: bla: fix type misuse for backbone_gw hash indexing
    - libbpf: Fix build failure from uninitialized variable warning
    - atm: eni: fix the missed pci_disable_device() for eni_init_one()
    - batman-adv: mcast/TT: fix wrongly dropped or rerouted packets
    - netfilter: ctnetlink: add a range check for l3/l4 protonum
    - netfilter: ctnetlink: fix mark based dump filtering regression
    - netfilter: conntrack: nf_conncount_init is failing with IPv6 disabled
    - netfilter: nft_meta: use socket user_ns to retrieve skuid and skgid
    - mac802154: tx: fix use-after-free
    - bpf: Fix clobbering of r2 in bpf_gen_ld_abs
    - tools/libbpf: Avoid counting local symbols in ABI check
    - drm/vc4/vc4_hdmi: fill ASoC card owner
    - net: qed: Disable aRFS for NPAR and 100G
    - net: qede: Disable aRFS for NPA...

Changed in linux-aws (Ubuntu):
status: New → Fix Released
Ian May (ian-may)
Changed in linux-aws (Ubuntu Bionic):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 4.15.0-1087.92

---------------
linux-aws (4.15.0-1087.92) bionic; urgency=medium

  * bionic/linux-aws: 4.15.0-1087.92 -proposed tracker (LP: #1900674)

  * CVE-2020-12351 // CVE-2020-12352 // CVE-2020-24490
    - [Config] aws: Disable BlueZ highspeed support

  * EFA: add support for 0xefa1 devices (LP: #1896791)
    - RDMA/efa: Expose maximum TX doorbell batch
    - RDMA/efa: Expose minimum SQ size
    - RDMA/efa: User/kernel compatibility handshake mechanism
    - RDMA/efa: Add EFA 0xefa1 PCI ID

  * aws: enable PCI write-combine for arm64 (LP: #1893817)
    - SAUCE: arm64: Enable PCI write-combine resources under sysfs

  [ Ubuntu: 4.15.0-122.124 ]

  * bionic/linux: 4.15.0-122.124 -proposed tracker (LP: #1899941)
  * CVE-2020-12351 // CVE-2020-12352 // CVE-2020-24490
    - Bluetooth: Disable High Speed by default
    - Bluetooth: MGMT: Fix not checking if BT_HS is enabled
    - [Config] Disable BlueZ highspeed support
  * CVE-2020-12351
    - Bluetooth: L2CAP: Fix calling sk_filter on non-socket based channel
  * CVE-2020-12352
    - Bluetooth: A2MP: Fix not initializing all members

 -- Stefan Bader <email address hidden> Tue, 20 Oct 2020 12:01:06 +0200

Changed in linux-aws (Ubuntu Bionic):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Kamal Mostafa (kamalmostafa) wrote :

This hardware isn't available to us for verification yet. The upstream commits have been vetted by AWS, and their ports are present in Ubuntu-aws-5.4.0-1029.30.

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-56.62

---------------
linux (5.4.0-56.62) focal; urgency=medium

  * focal/linux: 5.4.0-56.62 -proposed tracker (LP: #1905300)

  * CVE-2020-4788
    - selftests/powerpc: rfi_flush: disable entry flush if present
    - powerpc/64s: flush L1D on kernel entry
    - powerpc/64s: flush L1D after user accesses
    - selftests/powerpc: entry flush test

linux (5.4.0-55.61) focal; urgency=medium

  * focal/linux: 5.4.0-55.61 -proposed tracker (LP: #1903175)

  * Update kernel packaging to support forward porting kernels (LP: #1902957)
    - [Debian] Update for leader included in BACKPORT_SUFFIX

  * Avoid double newline when running insertchanges (LP: #1903293)
    - [Packaging] insertchanges: avoid double newline

  * EFI: Fails when BootCurrent entry does not exist (LP: #1899993)
    - efivarfs: Replace invalid slashes with exclamation marks in dentries.

  * CVE-2020-14351
    - perf/core: Fix race in the perf_mmap_close() function

  * raid10: Block discard is very slow, causing severe delays for mkfs and
    fstrim operations (LP: #1896578)
    - md: add md_submit_discard_bio() for submitting discard bio
    - md/raid10: extend r10bio devs to raid disks
    - md/raid10: pull codes that wait for blocked dev into one function
    - md/raid10: improve raid10 discard request
    - md/raid10: improve discard request for far layout
    - dm raid: fix discard limits for raid1 and raid10
    - dm raid: remove unnecessary discard limits for raid10

  * Bionic: btrfs: kernel BUG at /build/linux-
    eTBZpZ/linux-4.15.0/fs/btrfs/ctree.c:3233! (LP: #1902254)
    - btrfs: drop unnecessary offset_in_page in extent buffer helpers
    - btrfs: extent_io: do extra check for extent buffer read write functions
    - btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent()
    - btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref()
    - btrfs: ctree: check key order before merging tree blocks

  * Ethernet no link lights after reboot (Intel i225-v 2.5G) (LP: #1902578)
    - igc: Add PHY power management control

  * Undetected Data corruption in MPI workloads that use VSX for reductions on
    POWER9 DD2.1 systems (LP: #1902694)
    - powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
    - selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load
      workaround

  * [20.04 FEAT] Support/enhancement of NVMe IPL (LP: #1902179)
    - s390: nvme ipl
    - s390: nvme reipl
    - s390/ipl: support NVMe IPL kernel parameters

  * uvcvideo: add mapping for HEVC payloads (LP: #1895803)
    - media: uvcvideo: Add mapping for HEVC payloads

  * Focal update: v5.4.73 upstream stable release (LP: #1902115)
    - ibmveth: Switch order of ibmveth_helper calls.
    - ibmveth: Identify ingress large send packets.
    - ipv4: Restore flowi4_oif update before call to xfrm_lookup_route
    - mlx4: handle non-napi callers to napi_poll
    - net: fec: Fix phy_device lookup for phy_reset_after_clk_enable()
    - net: fec: Fix PHY init after phy_reset_after_clk_enable()
    - net: fix pos incrementment in ipv6_route_seq_next
    - net/smc: fix valid DMBE buffer sizes
    - net...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released