kernel crash when NVMe drive inserted in one slot
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | High | Unassigned | |
Xenial | Fix Released | High | Unassigned | |
Bug Description
Opening this on behalf of one of my colleagues at Cisco. We're seeing an issue on our new S-series S3260 server that causes the kernel to crash.
If an NVMe device is inserted into one of the two drive slots, the kernel crashes, and only under Ubuntu. With an NVMe drive in the bad slot, other OSes work fine. If we move the NVMe drive out of the bad slot and into the other slot, everything works as expected. We tested with HGST and Intel NVMe drives and were able to reproduce the issue with both. HGST reviewed some logs and does not believe at this time that the issue is with the NVMe drives.
We're hoping someone from Canonical can take a look to understand what the difference is between the working and failing slots. The data collection was done with the NVMe drive inserted in the working slot so that we could access the OS.
I had a connection timeout when trying to use ubuntu-bug, so I saved the apport file and will attach it to the bug. I have also collected the kernel log and syslog, but they are ~9GB. I found a call trace in the kernel log that starts at Jan 25 06:02:54 and floods the logs afterwards. I will attach the call trace to this bug as a separate text file.
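For anyone who needs to pull the first call trace out of a log this large without opening it in an editor, here is a minimal sketch. It is not what was used to produce the attachment; it assumes the trace begins at a line containing "Call Trace:" and ends at the usual "---[ end trace" line, and the function name and the `max_lines` cap are made up for illustration.

```python
from typing import Iterable, List


def first_call_trace(
    lines: Iterable[str],
    marker: str = "Call Trace:",   # assumed start-of-trace text
    max_lines: int = 60,           # hypothetical cap so a flood cannot run away
) -> List[str]:
    """Return the first block of lines that starts at `marker`.

    The block stops at the first line containing "---[ end trace"
    or once `max_lines` lines are collected, whichever comes first.
    Lines are consumed one at a time, so a multi-GB file is never
    held in memory.
    """
    block: List[str] = []
    for line in lines:
        if not block:
            # Still searching for the start of the first trace.
            if marker in line:
                block.append(line.rstrip("\n"))
            continue
        block.append(line.rstrip("\n"))
        if "---[ end trace" in line or len(block) >= max_lines:
            break
    return block
```

It can be fed an open file object directly, for example `first_call_trace(open("kern.log", errors="replace"))`, where `kern.log` is a placeholder path. Reading stops as soon as the first trace is complete, so the repeated traces that flood the rest of the log are never scanned.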
tags: | added: kernel-fixed-upstream |
Changed in linux (Ubuntu Xenial): | |
importance: | Undecided → High |
status: | New → Fix Committed |
Changed in linux (Ubuntu): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu): | |
status: | Fix Committed → Fix Released |
Changed in linux (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1661131
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.