Ubuntu 22.04 raises abnormal NIC MSI-X requests on systems with many CPU cores (256)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Jammy | Fix Released | Undecided | Luke Nowakowski-Krijger |
Kinetic | Fix Released | Undecided | Luke Nowakowski-Krijger |
Bug Description
SRU Justification:
[Impact]
A user reports setup errors with their Intel E810 NIC: the driver logs that
it cannot allocate enough MSI-X vectors on their 256-CPU system.
It seems the ice Ethernet driver takes an all-or-nothing approach to
allocating MSI-X vectors and can request more MSI-X vectors than it
finds available, which can cause the driver to fail to initialize and
start.
[Fix]
The fix lets the driver keep working with as many MSI-X vectors as it can get:
if there are not enough vectors for a full allocation, it reduces the number
of MSI-X vectors it requests instead of failing.
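The idea of falling back to a reduced allocation can be sketched as follows. This is a simplified illustration, not the logic of the actual patch: the function name, the feature names, the per-feature counts and the priority order are all hypothetical. Only the 512 vectors per function and the 256 CPUs come from the report.

```python
def reduce_requests(available, wanted, minimum):
    """Hypothetical fallback: grant each feature its minimum, then hand out
    the remaining vectors in priority order (the order of `wanted`)."""
    granted = {feature: minimum.get(feature, 0) for feature in wanted}
    if sum(granted.values()) > available:
        # Even the bare minimum does not fit, so initialization must fail.
        raise RuntimeError("not enough MSI-X vectors for minimal operation")
    spare = available - sum(granted.values())
    for feature, count in wanted.items():
        # Top up towards the wanted count, but never beyond what is left.
        extra = max(0, min(count - granted[feature], spare))
        granted[feature] += extra
        spare -= extra
    return granted


# Assumed per-feature requests for a 256-CPU system (illustrative only).
wanted = {"misc": 1, "lan": 256, "flow director": 2, "rdma": 260}
minimum = {"misc": 1, "lan": 1}
print(reduce_requests(512, wanted, minimum))
# {'misc': 1, 'lan': 256, 'flow director': 2, 'rdma': 253}
```

In this sketch the lowest-priority feature absorbs the shortfall, so the driver would start with fewer RDMA vectors rather than not start at all.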
[Backport]
In Jammy we do not carry the switchdev support patches for the driver, so the
backport does not allocate the switchdev MSI-X vector. Jammy also uses the older
way of checking RDMA support, testing that the RDMA bit is set, rather than the newer
ice_is_rdma_ena helper that the patch uses.
[Test Plan]
Install and start the ice driver with an Intel 800 series NIC and check that
this failure does not appear:
Not enough device MSI-X vectors, requested = 260, available = 253
Then check that everything works as expected.
The backported patch for Jammy has been tested by the original user who
submitted the bug report on their high-CPU-count system, and they confirmed no errors.
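The log check in this test plan could be automated along these lines. This is a minimal sketch that assumes the kernel log has already been saved as text (for example from dmesg); the function name is hypothetical, and the pattern is built only from the message quoted above.

```python
import re

# Pattern built from the failure message quoted in the test plan.
MSIX_FAILURE = re.compile(
    r"Not enough device MSI-X vectors, requested = (\d+), available = (\d+)"
)


def find_msix_failures(log_text):
    """Return a list of (requested, available) pairs found in the log text."""
    return [(int(req), int(avail)) for req, avail in MSIX_FAILURE.findall(log_text)]


sample = "Not enough device MSI-X vectors, requested = 260, available = 253"
print(find_msix_failures(sample))  # [(260, 253)]
```

An empty list means the failure message was not found, which is the expected result on a fixed kernel.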
[Where problems could occur]
There could be problems with the logic that reduces MSI-X vector
usage, leading to more errors in the driver, but otherwise the
regression potential is minimal, as the change mostly refactors the
initial MSI-X setup.
-------
System Configuration
OS: Ubuntu 22.04 LTS
Kernel: 5.15.0-25-generic
CPUs: 256
NIC: Intel E810 with 512 MSI-X vectors per function
Errors
Not enough device MSI-X vectors, requested = 260, available = 253
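The numbers in this message can be reproduced with a small model of all-or-nothing accounting. This is a sketch, not the driver's code: the function name and the per-feature vector counts (1 miscellaneous, 2 flow director, 4 extra for RDMA) are assumptions, chosen because they reproduce the reported values. Only the 512 vectors per function and the 256 CPUs come from the report.

```python
# Assumed per-feature vector counts; these are NOT stated in the report.
MISC_VECTORS = 1            # assumed: miscellaneous/control interrupt
FLOW_DIRECTOR_VECTORS = 2   # assumed
RDMA_EXTRA_VECTORS = 4      # assumed: RDMA wants one per CPU plus a few extra


def all_or_nothing(device_vectors, cpus):
    """Model requests made one feature at a time with no fallback.

    Returns (ok, requested, available), where requested/available describe
    the first request that does not fit (requested is 0 on success).
    """
    requests = [
        MISC_VECTORS,
        cpus,                       # one LAN queue vector per CPU
        FLOW_DIRECTOR_VECTORS,
        cpus + RDMA_EXTRA_VECTORS,  # RDMA request
    ]
    available = device_vectors
    for requested in requests:
        if requested > available:
            return (False, requested, available)
        available -= requested
    return (True, 0, available)


print(all_or_nothing(512, 256))  # (False, 260, 253)
print(all_or_nothing(512, 252))  # (True, 0, 1)
```

Under these assumptions, 252 is the largest CPU count that fits in 512 vectors (1 + 252 + 2 + 256 = 511), which agrees with the 252-vCPU threshold mentioned in the findings.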
Findings
(1) The current ice kernel driver (ice_main.c) pre-allocates the full number of MSI-X vectors it wants, even when the device does not have enough of them for CPUs with a large core count.
(2) the commit https:/
So, to support new CPUs with more than 252 vCPUs, will the Ubuntu kernel backport the above patch to the current kernel (v5.15)?
affects: | ubuntu-realtime → linux (Ubuntu) |
Changed in linux (Ubuntu Jammy): | |
status: | New → Confirmed |
Changed in linux (Ubuntu): | |
status: | Confirmed → Fix Released |
Changed in linux (Ubuntu Jammy): | |
status: | Confirmed → In Progress |
assignee: | nobody → Luke Nowakowski-Krijger (lukenow) |
Changed in linux (Ubuntu Kinetic): | |
status: | New → Confirmed |
assignee: | nobody → Luke Nowakowski-Krijger (lukenow) |
Changed in linux (Ubuntu Kinetic): | |
status: | Confirmed → In Progress |
description: | updated |
Changed in linux (Ubuntu Jammy): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Kinetic): | |
status: | In Progress → Fix Committed |
tags: |
added: verification-done-kinetic removed: verification-needed-kinetic |
tags: |
added: verification-done-focal removed: verification-needed-focal |
tags: |
added: verification-done-jammy removed: verification-needed-jammy |
tags: |
added: verification-done-focal-linux-aws-5.15 removed: verification-needed-focal-linux-aws-5.15 |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 2012335
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.