Azure: Mellanox VF NIC crashes when removed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-azure (Ubuntu) | Invalid | Undecided | Unassigned |
Focal | Fix Released | Medium | Tim Gardner |
Bug Description
SRU Justification
[Impact]
Kernels 5.4.0-1075-azure and newer are broken: the VM can easily panic when the Mellanox VF NIC is removed and re-added, either because of Azure host servicing events or by running the manual "unbind/bind" test below (the GUID can differ between VMs):
for i in `seq 1 1000`;
do
cd /sys/bus/
echo abdc2107-
echo abdc2107-
done
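The manual test above can also be wrapped in a small helper. This is only a sketch: `rebind_loop` is a hypothetical name, and the driver's sysfs directory and the device GUID are both passed in by the caller because the report's paths are truncated and the GUID varies per VM. It assumes the driver directory exposes the usual `unbind` and `bind` files. Running it against a real VF on an affected kernel is expected to reproduce the panic, so it should be used only on a disposable test VM.

```shell
#!/bin/sh
# Hypothetical helper: repeatedly unbind and rebind one device through a
# driver's sysfs directory. Both the directory and the GUID are supplied by
# the caller; nothing from the truncated report is assumed here.
rebind_loop() {
    driver_dir=$1      # sysfs directory of the driver (holds "unbind" and "bind")
    device_id=$2       # device instance GUID as seen on this VM
    count=${3:-1000}   # number of cycles, defaulting to the report's 1000

    i=0
    while [ "$i" -lt "$count" ]; do
        # Stop at the first failed write instead of silently continuing.
        echo "$device_id" > "$driver_dir/unbind" || return 1
        echo "$device_id" > "$driver_dir/bind" || return 1
        i=$((i + 1))
    done
    echo "completed $i unbind/bind cycles"
}
```

Returning at the first failed write makes a rejected bind or unbind visible to the caller, whereas the plain loop would carry on regardless.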
A sample panic call-trace is:
[ 107.359954] kernel BUG at /build/
[ 107.363858] invalid opcode: 0000 [#1] SMP NOPTI
[ 107.365870] CPU: 0 PID: 334 Comm: kworker/0:2 Not tainted 5.4.0-1077-azure #80~18.04.1-Ubuntu
[ 107.369589] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 107.373811] Workqueue: events vmbus_onmessage
[ 107.375909] RIP: 0010:kfree+
…
[ 107.413789] Call Trace:
[ 107.414867] kobject_
[ 107.416747] kobject_
[ 107.418327] device_
[ 107.420653] device_
[ 107.422523] bus_remove_
[ 107.424279] device_
[ 107.425824] device_
[ 107.427536] vmbus_device_
[ 107.429528] vmbus_onoffer_
[ 107.431474] vmbus_onmessage
[ 107.433104] vmbus_onmessage
[ 107.434919] process_
[ 107.436661] worker_
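When triaging saved console logs from several VMs, a quick check for this trace can help. The sketch below is an assumption-laden convenience, not part of the original report: `has_kfree_bug_signature` is a hypothetical name, and the two markers it looks for ("kernel BUG at" and a RIP inside `kfree`) are taken from the sample trace above. Other bugs could produce the same two lines, so a match is only a hint.

```shell
#!/bin/sh
# Hypothetical triage helper: succeed if a saved console log contains both
# markers seen in the sample trace. Fixed-string matching (-F) keeps the
# "+" in "kfree+" literal.
has_kfree_bug_signature() {
    log_file=$1
    grep -Fq 'kernel BUG at' "$log_file" &&
        grep -Fq 'RIP: 0010:kfree+' "$log_file"
}
```

For example, `has_kfree_bug_signature console.log && echo "possible match"` prints a note only when both markers are present in `console.log` (an illustrative file name).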
It turns out there is a bug in https:/
Please apply the below patch to fix the issue:
--- a/drivers/
+++ b/drivers/
@@ -3653,7 +3653,7 @@ static int hv_pci_
- free_page((unsigned long)hbus);
+ kfree(hbus);
return ret;
}
BTW, please apply this patch as well (note: it is not strictly required, since it only affects an error-handling path that is rarely taken):
https:/
[Test Case]
Microsoft tested
affects: linux (Ubuntu) → linux-azure (Ubuntu)
Changed in linux-azure (Ubuntu):
status: New → Invalid
Changed in linux-azure (Ubuntu Focal):
assignee: nobody → Tim Gardner (timg-tpi)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-azure (Ubuntu Focal):
status: In Progress → Fix Committed
Dexuan - see https://bugs.launchpad.net/bugs/1973758