kexec reboot = NMI: PCI system error (SERR) for reason a1 on CPU 0

Bug #1743946 reported by TJ
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Expired  Medium      Unassigned

Bug Description

This occurs on 16.04 with linux-image-lowlatency-hwe-16.04 amd64 (currently 4.13.0-30-lowlatency). kexec-tools is in its default configuration, so reboots are handled by kexec using the current kernel version (no GRUB detect) and /proc/cmdline.

The system freezes either permanently (requiring a hard power-off) or for tens of seconds, showing (copied from a photograph):

kexec_core: Starting new kernel
nouveau 0000:01:00.0: DRM: GPU lockup - switching to software fbcon
NMI: PCI system error (SERR) for reason a1 on CPU 0.
Dazed and confused, but trying to continue
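
For reference, the "reason a1" value in the message is the byte the kernel read from the legacy NMI status/control port (0x61). The following is a minimal sketch of decoding it, assuming the conventional PC/AT bit layout. The kernel itself defines 0x80 as SERR and 0x40 as IOCHK; the descriptions of the remaining bits, and the names `NMI_REASON_BITS` and `decode_nmi_reason`, are illustrative labels chosen here, not kernel identifiers.

```python
# Assumed bit layout of the PC/AT NMI status/control port 0x61.
# Only 0x80 (SERR) and 0x40 (IOCHK) are NMI sources; the other
# descriptions are illustrative labels for unrelated timer/speaker state.
NMI_REASON_BITS = {
    0x80: "SERR# (PCI system error) asserted",
    0x40: "IOCHK# (I/O channel check) asserted",
    0x20: "timer 2 output high",
    0x10: "refresh toggle",
    0x08: "IOCHK# NMI disabled",
    0x04: "SERR# NMI disabled",
    0x02: "speaker data enabled",
    0x01: "timer 2 gate enabled",
}


def decode_nmi_reason(reason):
    """Return the descriptions of every bit set in the NMI reason byte."""
    return [name for bit, name in NMI_REASON_BITS.items() if reason & bit]


if __name__ == "__main__":
    # 0xa1 is the value from the message in this report.
    for line in decode_nmi_reason(0xA1):
        print(line)
```

Under that assumption, 0xa1 has bits 7, 5 and 0 set, so the only NMI source flagged is SERR#; the two low bits reflect timer state and say nothing about which PCI device raised the error.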

Due to the nature of the failure there are no additional logs or clues. I tried "nopti" in case the recent 'Meltdown' changes were responsible, but it still happens. I have only just enabled kexec on this system, so I have no record of it ever having worked.

System is Dell XPS M1530, Core 2 Duo T9300, 4GB RAM
---
ApportVersion: 2.20.1-0ubuntu2.15
Architecture: amd64
AudioDevicesInUse: Error: [Errno 2] No such file or directory
CurrentDesktop: XFCE
DistroRelease: Ubuntu 16.04
JournalErrors:
 -- Logs begin at Wed 2018-01-10 04:01:51 UTC, end at Thu 2018-01-18 13:41:03 UTC. --
 Jan 18 07:08:41 hostname dnsmasq[868]: warning: no upstream servers configured
MachineType: Dell Inc. XPS M1530
Package: linux (not installed)
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.13.0-30-lowlatency root=UUID=2edb423c-3c9d-4031-a18a-99f112ad48c1 ro nopti acpi_osi=! "acpi_osi=Windows 2006" pci=assign-busses,pcie_scan_all,realloc crashkernel=384M-2G:128M,2G-:256M
ProcVersionSignature: Ubuntu 4.13.0-30.33~16.04.1-lowlatency 4.13.13
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-30-lowlatency N/A
 linux-backports-modules-4.13.0-30-lowlatency N/A
 linux-firmware 1.157.15
Tags: xenial
Uname: Linux 4.13.0-30-lowlatency x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm disk plugdev sudo
WifiSyslog:

_MarkForUpload: True
dmi.bios.date: 11/19/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A12
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA12:bd11/19/2008:svnDellInc.:pnXPSM1530:pvr:rvnDellInc.:rn:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: XPS M1530
dmi.sys.vendor: Dell Inc.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1743946

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful
TJ (tj) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected xenial
description: updated
TJ (tj) wrote : CRDA.txt

apport information

TJ (tj) wrote : CurrentDmesg.txt

apport information

TJ (tj) wrote : IwConfig.txt

apport information

TJ (tj) wrote : Lspci.txt

apport information

TJ (tj) wrote : Lsusb.txt

apport information

TJ (tj) wrote : ProcCpuinfo.txt

apport information

TJ (tj) wrote : ProcCpuinfoMinimal.txt

apport information

TJ (tj) wrote : ProcEnviron.txt

apport information

TJ (tj)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.15 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc8

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
TJ (tj) wrote :

Tested alternatives:

v4.15-rc8 mainline build: PC locks up on "kexec -e" with all LEDs flashing.
4.4.0-110-lowlatency: no error message; after about 20 seconds the PC soft-reboots and POSTs.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired