5.15.0-69 ice driver deadlocks with bonded e810 NICs
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Confirmed | Undecided | Unassigned | | |
Bug Description
The ice driver in the 5.15.0-69 kernel deadlocks on rtnl_lock() when adding e810 NICs to a bond interface. Booting with `sysctl.
```
[ 244.980100] INFO: task kworker/6:1:182 blocked for more than 120 seconds.
[ 244.988431] Not tainted 5.15.0-69-generic #76-Ubuntu
[ 244.995279] "echo 0 > /proc/sys/
[ 245.004826] task:kworker/6:1 state:D stack: 0 pid: 182 ppid: 2 flags:0x00004000
[ 245.015017] Workqueue: events linkwatch_event
[ 245.020734] Call Trace:
[ 245.024144] <TASK>
[ 245.027137] __schedule+
[ 245.031848] schedule+0x69/0x110
[ 245.036228] schedule_
[ 245.042066] __mutex_
[ 245.047993] __mutex_
[ 245.053432] mutex_lock+
[ 245.057714] rtnl_lock+0x15/0x20
[ 245.061901] linkwatch_
[ 245.066571] process_
[ 245.071607] worker_
[ 245.076260] ? process_
[ 245.081493] kthread+0x127/0x150
[ 245.085592] ? set_kthread_
[ 245.090769] ret_from_
[ 245.095266] </TASK>
```
and
```
[ 245.530629] INFO: task ifenslave:849 blocked for more than 121 seconds.
[ 245.540433] Not tainted 5.15.0-69-generic #76-Ubuntu
[ 245.549050] "echo 0 > /proc/sys/
[ 245.558960] task:ifenslave state:D stack: 0 pid: 849 ppid: 847 flags:0x00004002
[ 245.570930] Call Trace:
[ 245.576175] <TASK>
[ 245.581018] __schedule+
[ 245.587445] schedule+0x69/0x110
[ 245.593631] schedule_
[ 245.600573] __wait_
[ 245.607526] ? usleep_
[ 245.614743] wait_for_
[ 245.621903] flush_workqueue
[ 245.628887] ib_cache_
[ 245.637083] __ib_unregister
[ 245.645398] ib_unregister_
[ 245.653541] irdma_ib_
[ 245.662105] irdma_remove+
[ 245.669446] auxiliary_
[ 245.676688] __device_
[ 245.684241] device_
[ 245.691416] bus_remove_
[ 245.698396] device_
[ 245.712178] ice_lag_
[ 245.720683] ice_lag_
[ 245.729739] ice_lag_
(3 of 5) A start job is running for…rk interfaces (3min 47s / 5min 3s)
[ 245.738525] raw_notifier_
[ 245.746006] call_netdevice_
[ 245.754123] __netdev_
[ 245.761658] netdev_
[ 245.769627] bond_enslave+
[ 245.777398] ? sscanf+0x4e/0x70
[ 245.783375] bond_option_
[ 245.791738] __bond_
[ 245.799505] __bond_
[ 245.807860] bond_opt_
[ 245.816062] bonding_
[ 245.824750] dev_attr_
[ 245.831443] sysfs_kf_
[ 245.837979] kernfs_
[ 245.845469] new_sync_
[ 245.852210] vfs_write+
[ 245.858429] ksys_write+
[ 245.864624] __x64_sys_
[ 245.871288] do_syscall_
[ 245.877715] ? handle_
[ 245.884566] ? do_user_
[ 245.891990] ? filp_close+
[ 245.898452] ? exit_to_
[ 245.906272] ? irqentry_
[ 245.914042] ? irqentry_
[ 245.920703] ? exc_page_
[ 245.927555] entry_SYSCALL_
[ 245.935763] RIP: 0033:0x7f1e86855a37
[ 245.942153] RSP: 002b:00007fff8d
[ 245.953034] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f1e86855a37
[ 245.963554] RDX: 000000000000000a RSI: 0000556eff580510 RDI: 0000000000000001
[ 245.972468] RBP: 0000556eff580510 R08: 0000556eff582c5a R09: 0000000000000000
[ 245.983048] R10: 0000556eff582c59 R11: 0000000000000246 R12: 0000000000000001
[ 245.993402] R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
[ 246.001700] </TASK>
```
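For anyone triaging a saved console log for the same symptom, the hung-task headers in the two traces above can be extracted with a short script. This is only a sketch: the `HungTask` type and `find_hung_tasks` function are hypothetical names introduced here, and the pattern assumes the message wording shown in this report.

```python
import re
from typing import List, NamedTuple


class HungTask(NamedTuple):
    comm: str      # task name, e.g. "kworker/6:1"
    pid: int       # process ID printed after the last colon
    seconds: int   # threshold reported by the hung-task detector


# Matches lines such as:
#   INFO: task kworker/6:1:182 blocked for more than 120 seconds.
# The greedy ".+" keeps colons that belong to the task name itself.
_HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text: str) -> List[HungTask]:
    """Return every hung-task report found in a block of kernel log text."""
    return [
        HungTask(m["comm"], int(m["pid"]), int(m["secs"]))
        for m in _HUNG_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[ 244.980100] INFO: task kworker/6:1:182 blocked for more than 120 seconds.\n"
        "[ 245.530629] INFO: task ifenslave:849 blocked for more than 121 seconds.\n"
    )
    for task in find_hung_tasks(sample):
        print(task)
```

Run against the two header lines from this report, it should list `kworker/6:1` (pid 182) and `ifenslave` (pid 849).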
This appears consistent with the underlying cause being the bug fixed by mainline commit 248401cb2c4612d.
The 5.15.0-67 kernel does not exhibit the problem. The 5.15.0-68 kernel apparently included the "RDMA/irdma: Report the correct link speed" patch, which is listed in one of the "Fixes" tags of the above commit, so I suspect that patch is the culprit and that importing the above commit should resolve the problem.
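To illustrate the kind of deadlock the traces suggest, here is a minimal user-space model in Python. It is not kernel code and not taken from the ice or irdma drivers. It assumes that one path holds a lock (standing in for the RTNL mutex) while waiting for queued work to finish, and that the queued work needs the same lock. The names `simulate`, `rtnl`, and `work_item` are hypothetical, and timeouts are used so the model terminates instead of hanging.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def simulate(flush_timeout: float = 0.2):
    """Model 'flush a workqueue while holding the lock its work needs'.

    Returns (flushed, worker_got_lock):
      flushed          - whether the queued work finished while the lock was held
      worker_got_lock  - whether the work item got the lock once it was released
    """
    rtnl = threading.Lock()  # stand-in for the RTNL mutex (assumption)

    def work_item() -> bool:
        # The queued work needs the same lock the flusher is holding.
        got = rtnl.acquire(timeout=5.0)
        if got:
            rtnl.release()
        return got

    with ThreadPoolExecutor(max_workers=1) as workqueue:
        with rtnl:  # the "ifenslave" path takes the lock first
            future = workqueue.submit(work_item)
            try:
                # "Flush": wait for the queued work while still holding the lock.
                # In the kernel there is no timeout, so this wait never ends.
                future.result(timeout=flush_timeout)
                flushed = True
            except FutureTimeout:
                flushed = False
        # Lock released here; only now can the work item make progress.
        worker_got_lock = future.result()
    return flushed, worker_got_lock


if __name__ == "__main__":
    print(simulate())
```

In this model the flush can never complete while the lock is held, which matches the shape of the report: one task stuck waiting on a workqueue flush and another stuck in `rtnl_lock`. Whether the real work item blocks on RTNL for the same reason is an inference from the traces, not something this sketch demonstrates.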
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-
ProcVersionSign
Uname: Linux 5.15.0-67-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Apr 5 22:47 seq
crw-rw---- 1 root audio 116, 33 Apr 5 22:47 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckR
Date: Wed Apr 5 22:48:03 2023
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 0b1f:03ee Insyde Software Corp. RNDIS/Ethernet Gadget
Bus 001 Device 003: ID 0557:9241 ATEN International Co., Ltd SMCI HID KM
Bus 001 Device 002: ID 1d6b:0107 Linux Foundation USB Virtual Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Supermicro SYS-510T-MR-EI018
PciMultimedia:
ProcEnviron:
TERM=vt220
PATH=(custom, no user)
XDG_RUNTIME_
LANG=C.UTF-8
SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 20220329.
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/23/2022
dmi.bios.release: 5.22
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 1.2
dmi.board.
dmi.board.name: X12STH-SYS
dmi.board.vendor: Supermicro
dmi.board.version: 1.01
dmi.chassis.
dmi.chassis.type: 1
dmi.chassis.vendor: Supermicro
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.family: To be filled by O.E.M.
dmi.product.name: SYS-510T-MR-EI018
dmi.product.sku: To be filled by O.E.M.
dmi.product.
dmi.sys.vendor: Supermicro