[SRU] UBSAN warnings in bnx2x kernel driver
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Status tracked in Oracular | | |
Focal | Fix Released | High | Ghadi Rahme |
Jammy | Fix Released | High | Ghadi Rahme |
Noble | Fix Committed | High | Ghadi Rahme |
Oracular | Fix Released | High | Ghadi Rahme |
Bug Description
[impact]
The bnx2x kernel driver currently performs out-of-bounds reads and writes that could cause kernel crashes. So far, no impact has been observed other than UBSAN stack traces.
I have posted a patch upstream to resolve this issue (134061163ee5 bnx2x: Fix multiple UBSAN array-index-
[Test Plan]
There are multiple ways to reproduce the issue. The most hands-free way is to use a QLogic NIC with an E2 controller on a system with more than 32 cores. Both methods are described below; note that both require a NIC that uses the bnx2x driver.
* Normal Reproduction:
1. Start a machine running kernel 6.5 or higher with more than 32 cores. Note that these must be physical cores, not threads. The machine must also use a NIC with an E2 controller.
2. In dmesg the following UBSAN warnings can be seen:
UBSAN: array-index-
index 20 is out of range for type 'stats_query_entry [19]'
CPU: 12 PID: 858 Comm: systemd-network Not tainted 6.9.0-060900rc7
#202405052133
Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9,
BIOS P89 10/21/2019
Call Trace:
<TASK>
dump_stack_
dump_stack+
__ubsan_
bnx2x_
bnx2x_
bnx2x_
bnx2x_
bnx2x_
__dev_
RIP: 0033:0x736223927a0a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca
64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffc0b
RAX: ffffffffffffffda RBX: 0000583df50f9c78 RCX: 0000736223927a0a
RDX: 0000000000000020 RSI: 0000583df50ee510 RDI: 0000000000000003
RBP: 0000583df50d4940 R08: 00007ffc0bb2adb0 R09: 0000000000000080
R10: 0000000000000000 R11: 0000000000000246 R12: 0000583df5103ae0
R13: 000000000000035a R14: 0000583df50f9c30 R15: 0000583ddddddf00
</TASK>
---[ end trace ]---
------------[ cut here ]------------
UBSAN: array-index-
index 28 is out of range for type 'stats_query_entry [19]'
CPU: 12 PID: 858 Comm: systemd-network Not tainted 6.9.0-060900rc7
#202405052133
Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9,
BIOS P89 10/21/2019
Call Trace:
<TASK>
dump_stack_
dump_stack+
__ubsan_
bnx2x_prep_
bnx2x_stats_
bnx2x_post_
bnx2x_nic_
bnx2x_open+
__dev_open+
RIP: 0033:0x736223927a0a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca
64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffc0b
RAX: ffffffffffffffda RBX: 0000583df50f9c78 RCX: 0000736223927a0a
RDX: 0000000000000020 RSI: 0000583df50ee510 RDI: 0000000000000003
RBP: 0000583df50d4940 R08: 00007ffc0bb2adb0 R09: 0000000000000080
R10: 0000000000000000 R11: 0000000000000246 R12: 0000583df5103ae0
R13: 000000000000035a R14: 0000583df50f9c30 R15: 0000583ddddddf00
</TASK>
---[ end trace ]---
------------[ cut here ]------------
UBSAN: array-index-
index 29 is out of range for type 'stats_query_entry [19]'
CPU: 13 PID: 163 Comm: kworker/u96:1 Not tainted 6.9.0-060900rc7
#202405052133
Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9,
BIOS P89 10/21/2019
Workqueue: bnx2x bnx2x_sp_task [bnx2x]
Call Trace:
<TASK>
dump_stack_
dump_stack+
__ubsan_
bnx2x_
bnx2x_
? bnx2x_hw_
bnx2x_
bnx2x_
bnx2x_
bnx2x_
process_
</TASK>
---[ end trace ]---
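Since the normal reproduction depends on physical cores rather than threads, it can help to confirm the core count first. The following is a minimal sketch, assuming `lscpu` from util-linux is installed; the helper name `count_physical_cores` is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Count unique (core, socket) pairs read from stdin.
# Expects the output of `lscpu -p=Core,Socket`: comment lines start
# with '#', and each remaining line is "core_id,socket_id" for one
# logical CPU, so hyperthreads of one core produce duplicate lines.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Typical use on the test machine:
#   lscpu -p=Core,Socket | count_physical_cores
```

A result above 32 satisfies the core requirement stated in step 1; whether the NIC uses an E2 controller has to be checked separately.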
* Forced reproducer:
1. Make sure you have a machine running kernel 6.5 or higher with any NIC that uses the bnx2x driver (the NIC does not need an E2 controller). The number of cores in the machine does not matter.
2. Once the machine has booted, unload the bnx2x module from the kernel:
$ sudo modprobe -r bnx2x
3. Then load the driver again, setting the number of Ethernet queues to a value above 16:
$ sudo modprobe bnx2x num_queues=20
4. The same stack traces shown above will show up in dmesg.
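To check dmesg for these warnings without reading every trace, a small filter can pull out the reported indices. This is only a sketch: the function name `ubsan_stats_indices` is made up for illustration, and it matches only the 'stats_query_entry' message wording seen in the traces above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print each out-of-range index
# reported against the 'stats_query_entry' array type, one per line.
# Lines that do not contain such a message are ignored.
ubsan_stats_indices() {
    sed -n "s/.*index \([0-9][0-9]*\) is out of range for type 'stats_query_entry \[[0-9]*]'.*/\1/p"
}

# Typical use on a live system (reading the kernel log may need root):
#   sudo dmesg | ubsan_stats_indices
```

Fed the three traces shown above, it prints 20, 28 and 29, one per line. No output suggests the warnings were not triggered, or that the running kernel was built without UBSAN bounds checking.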
[Fix]
The fix is already upstream and is provided by:
* 134061163ee5 bnx2x: Fix multiple UBSAN array-index-
[where problems could occur]
* Since the patch increases the firmware stats array size, the driver will use slightly more memory; the increase is insignificant.
* Since the patch makes no logic changes to the driver, the regression risk is minimal.
[workaround]
As stated earlier, I have already written a patch to solve the issue. In the meantime, one way to avoid the problem is to unload the driver and then load it back with num_queues set to a value below 16:
$ sudo modprobe bnx2x num_queues=15
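A parameter passed on the modprobe command line applies only to that load. One common way to keep the setting across reboots is an `options` line in a file under /etc/modprobe.d/. The sketch below assumes that approach; the helper name `write_bnx2x_options` and the file name `bnx2x-num-queues.conf` are hypothetical choices for this example, not names the driver requires.

```shell
#!/bin/sh
# Write a modprobe "options" line for the bnx2x module to a file.
#   $1: queue count (the workaround above uses a value below 16)
#   $2: destination file
write_bnx2x_options() {
    printf 'options bnx2x num_queues=%s\n' "$1" > "$2"
}

# Typical use as root (the file name is an assumption):
#   write_bnx2x_options 15 /etc/modprobe.d/bnx2x-num-queues.conf
#   modprobe -r bnx2x && modprobe bnx2x
```

If the driver is loaded from the initramfs on a given system, the initramfs may also need to be regenerated before the option takes effect at boot.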
Changed in linux (Ubuntu): | |
importance: | Undecided → High |
assignee: | nobody → Ghadi Rahme (ghadi-rahme) |
description: | updated |
summary: |
- UBSAN warnings in bnx2x kernel driver + [SRU] UBSAN warnings in bnx2x kernel driver |
Changed in linux (Ubuntu Oracular): | |
status: | New → Triaged |
Changed in linux (Ubuntu Noble): | |
status: | New → Triaged |
Changed in linux (Ubuntu Jammy): | |
status: | New → Triaged |
Changed in linux (Ubuntu Focal): | |
status: | New → Triaged |
Changed in linux (Ubuntu Noble): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Jammy): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Focal): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Focal): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Jammy): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Oracular): | |
status: | Triaged → Fix Released |
Changed in linux (Ubuntu Noble): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Focal): | |
assignee: | nobody → Ghadi Rahme (ghadi-rahme) |
Changed in linux (Ubuntu Jammy): | |
assignee: | nobody → Ghadi Rahme (ghadi-rahme) |
Changed in linux (Ubuntu Noble): | |
assignee: | nobody → Ghadi Rahme (ghadi-rahme) |
This bug is awaiting verification that the linux/5.15.0-120.130 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux' to 'verification-done-jammy-linux'. If the problem still exists, change the tag 'verification-needed-jammy-linux' to 'verification-failed-jammy-linux'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!