Configure sysctl to panic on oops; that should do it.
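A minimal sketch of that setting, assuming a crash dump mechanism such as kdump is already in place. `kernel.panic_on_oops` is the standard kernel sysctl for this; the file name is only a hypothetical choice, and any file under /etc/sysctl.d/ should work.

```
# /etc/sysctl.d/99-panic-on-oops.conf  (hypothetical file name)
# Turn any kernel oops into a panic so the crash kernel can capture a dump.
kernel.panic_on_oops = 1
```

To apply it without a reboot, run `sysctl -w kernel.panic_on_oops=1` as root, or `sysctl --system` to reload all the sysctl configuration files.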
This stack section is interesting.
Dec 9 09:18:27 ealxs00195 kernel: [1726974.078795] RIP: 0010:[<ffffffff8165b049>] [<ffffffff8165b049>] _raw_spin_unlock_irqrestore+0x19/0x30
...
Dec 9 09:18:27 ealxs00195 kernel: [1726974.091694] [<ffffffff8142f764>] __scsi_remove_target+0xd4/0xf0
Dec 9 09:18:27 ealxs00195 kernel: [1726974.091696] [<ffffffff8142f841>] scsi_remove_target+0xc1/0xe0
Dec 9 09:18:27 ealxs00195 kernel: [1726974.091701] [<ffffffffa00deb96>] fc_starget_delete+0x26/0x30 [scsi_transport_fc]
Dec 9 09:18:27 ealxs00195 kernel: [1726974.091706] [<ffffffff81084b1a>] process_one_work+0x11a/0x480
A workq context is required to be able to sleep. We're grabbing spinlocks here, which is fine when they're held
briefly, but hold them too long and, well, you see this. I doubt this is the root cause; something else is likely
holding the locks due to an error-handling condition, and the removal threads are just victims.
What I really need is a crashdump. I'm simply short on time to reproduce this myself.
https://wiki.ubuntu.com/Kernel/CrashdumpRecipe