Kernel errors with NVME devices

Bug #1573888 reported by youshotwhointhatwhatnow
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have several identical servers with two PCIe NVME drives each (Intel P3600). These servers have been running 14.04 without issues.

When trying to install 16.04, only one NVME device was visible and the installer would freeze. I booted a live USB (ubuntu-mate 16.04) and found that the syslog was being spammed with an endless loop of kernel oops messages. I attached a trimmed version of that log. Sometimes the system would hit a kernel panic and freeze, but I don't currently have a setup to capture that.

youshotwhointhatwhatnow (moloney-brendan) wrote :
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1573888/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
tags: added: xenial
affects: ubuntu → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1573888

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
youshotwhointhatwhatnow (moloney-brendan) wrote :

I don't think apport is going to work here: syslog grows at hundreds of megabytes per second from the repeated error messages.
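For anyone facing a similar log flood, one way to get a manageable summary is to read only a bounded number of lines and count the repeats. The following is a minimal sketch, not something used in this report. It assumes syslog-style kernel lines carrying a `[seconds.micros]` timestamp, and the names `collapse` and `limit` are hypothetical.

```python
import re
from collections import Counter

# Assumption: each line contains a kernel timestamp such as "[   12.000001]".
# Everything up to and including that timestamp is dropped so that otherwise
# identical messages compare equal.
_PREFIX = re.compile(r"^.*?\[\s*\d+\.\d+\]\s?")


def collapse(lines, limit=100_000):
    """Count distinct messages among at most `limit` lines.

    `lines` may be any iterable of strings (for example an open file),
    so a huge log is never loaded into memory in one piece.
    """
    counts = Counter()
    for index, line in enumerate(lines):
        if index >= limit:
            break  # stop early; the log may be growing faster than it can be read
        message = _PREFIX.sub("", line.rstrip("\n"), count=1)
        counts[message] += 1
    return counts
```

Passing an open file object, such as the result of `open("/var/log/syslog", errors="replace")`, reads the log lazily, and `collapse(...).most_common(20)` then lists the most frequent messages. Oops traces that embed varying addresses will not collapse fully with this simple normalization.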

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Invalid
youshotwhointhatwhatnow (moloney-brendan) wrote :

This appears to be hardware/firmware related (specifically a problem with the motherboard). There is some sort of IRQ conflict when one of the PCIe slots is being used. The IRQ conflict was there in 14.04 as well, but didn't seem to cause any problems. Moving one of the NVME devices to a different PCIe slot solved the IRQ conflict and all of the resulting issues.
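One way to look for devices sharing an interrupt line is to parse `/proc/interrupts` and list the numbered IRQs that have more than one handler attached. This is only a sketch under assumptions: a shared line may or may not be the kind of conflict seen here, the parser assumes the usual layout (a CPU header row, then per-CPU counts followed by a comma-separated device list), and `shared_irqs` is a hypothetical helper name.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs with several handlers.

    `text` is the content of /proc/interrupts. Only numbered IRQ rows are
    considered; summary rows such as NMI or LOC are skipped.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue
        # Skip the per-CPU counters; what remains is "chip type dev1, dev2".
        fields = rest.split(None, ncpu)
        if len(fields) <= ncpu:
            continue
        parts = [p.strip() for p in fields[ncpu].split(",")]
        head = parts[0].split()
        if not head:
            continue
        # Assumption: the first device name is the last token before the
        # first comma, and device names contain no spaces.
        devices = [head[-1]] + [p for p in parts[1:] if p]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared
```

Calling `shared_irqs(open("/proc/interrupts").read())` on the affected machine and checking whether any `nvme` entry appears in the result would be a first step before moving cards between slots.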
