laptop crashed all of a sudden

Bug #1852479 reported by Juan Carlos Carvajal Bermúdez
This bug report is a duplicate of: Bug #1746340: Samsung SSD corruption (fsck needed).
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

My laptop just crashed spectacularly twice. I was using Chrome both times, but I am not sure whether that is related. There was no way to recover the system either time. I have just dug into the logs.

Probably related to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173

kernel log:
 rfkill: input handler disabled
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: can't find device of ID00e8
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: can't find device of ID00e8
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Error of this Agent is reported first
 nvme 0000:03:00.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 nvme 0000:03:00.0: AER: device [144d:a808] error status/mask=00000001/0000e000
 nvme 0000:03:00.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:03:00.0
 pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:03:00.0
 pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 pcieport 0000:00:1d.0: AER: device [8086:a330] error status/mask=00000001/00002000
 pcieport 0000:00:1d.0: AER: [ 0] RxErr
 nvme 0000:03:00.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
 nvme 0000:03:00.0: AER: device [144d:a808] error status/mask=00000001/0000e000
 nvme 0000:03:00.0: AER: [ 0] RxErr
 nvme 0000:03:00.0: AER: Error of this Agent is reported first
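
A log like the one above is easier to reason about once it is tallied. The following is a minimal sketch, not part of the original report, that counts "PCIe Bus Error" lines per device and decodes the status/mask hex values into the short labels the kernel prints (such as RxErr). The function names are hypothetical, and the bit table assumes the standard layout of the PCIe AER Correctable Error Status register.

```python
import re
from collections import Counter

# Bit positions of the PCIe AER Correctable Error Status register,
# labelled with the short names the Linux kernel uses in its log output.
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}

# Matches e.g. "pcieport 0000:00:1d.0: AER: PCIe Bus Error: severity=Corrected, ..."
BUS_ERROR = re.compile(
    r"(?P<driver>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AER: PCIe Bus Error: severity=(?P<severity>\w+)"
)


def count_bus_errors(log_text):
    """Count AER bus-error lines, keyed by (driver, PCI address, severity)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BUS_ERROR.search(line)
        if match:
            counts[(match["driver"], match["bdf"], match["severity"])] += 1
    return counts


def decode_correctable(value_hex):
    """Translate a status or mask hex string into a list of set bit names."""
    value = int(value_hex, 16)
    return [
        name
        for bit, name in sorted(CORRECTABLE_BITS.items())
        if (value >> bit) & 1
    ]
```

Under these assumptions, the status value 00000001 in the log decodes to RxErr only, which matches the "[ 0] RxErr" lines, and the nvme mask 0000e000 decodes to the three error types that are masked on that device.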

lspci -tv
-[0000:00]-+-00.0 Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers
           +-01.0-[01]--+-00.0 NVIDIA Corporation GP106M [GeForce GTX 1060 Mobile]
           |            \-00.1 NVIDIA Corporation GP106 High Definition Audio Controller
           +-02.0 Intel Corporation UHD Graphics 630 (Mobile)
           +-12.0 Intel Corporation Cannon Lake PCH Thermal Controller
           +-14.0 Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller
           +-14.2 Intel Corporation Cannon Lake PCH Shared SRAM
           +-15.0 Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0
           +-16.0 Intel Corporation Cannon Lake PCH HECI Controller
           +-17.0 Intel Corporation Cannon Lake Mobile PCH SATA AHCI Controller
           +-1b.0-[02]--
           +-1d.0-[03]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
           +-1d.5-[04]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1d.6-[05]----00.0 Intel Corporation Wireless-AC 9260
           +-1e.0 Intel Corporation Device a328
           +-1e.3 Intel Corporation Device a32b
           +-1f.0 Intel Corporation Device a30d
           +-1f.3 Intel Corporation Cannon Lake PCH cAVS
           +-1f.4 Intel Corporation Cannon Lake PCH SMBus Controller
           \-1f.5 Intel Corporation Cannon Lake PCH SPI Controller

Ubuntu 5.3.0-20.21+system76~1572304854~19.10~8caa3e6-generic 5.3.7
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
CurrentDesktop: pop:GNOME
DistroRelease: Pop!_OS 19.10
MachineType: SchenkerTechnologiesGmbH XMG NEO 15 - XNE15M18
NonfreeKernelModules: nvidia_modeset nvidia
Package: linux (not installed)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-20-generic root=UUID=ce33b159-ce12-4227-92f3-3ea93527e6d4 ro quiet splash pci=noaer vt.handoff=7
ProcVersionSignature: Ubuntu 5.3.0-20.21+system76~1572304854~19.10~8caa3e6-generic 5.3.7
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-20-generic N/A
 linux-backports-modules-5.3.0-20-generic N/A
 linux-firmware 1.183.1+system76~1571781891~19.10~fb0e8b8
Tags: eoan
Uname: Linux 5.3.0-20-generic x86_64
UpgradeStatus: Upgraded to eoan on 2019-08-19 (86 days ago)
UserGroups: adm dialout sambashare sudo
_MarkForUpload: True
dmi.bios.date: 11/29/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: N.1.09
dmi.board.asset.tag: Standard
dmi.board.name: GK5CN6Z
dmi.board.vendor: SchenkerTechnologiesGmbH
dmi.board.version: Standard
dmi.chassis.asset.tag: Standard
dmi.chassis.type: 10
dmi.chassis.vendor: SchenkerTechnologiesGmbH
dmi.chassis.version: Standard
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrN.1.09:bd11/29/2018:svnSchenkerTechnologiesGmbH:pnXMGNEO15-XNE15M18:pvrStandard:rvnSchenkerTechnologiesGmbH:rnGK5CN6Z:rvrStandard:cvnSchenkerTechnologiesGmbH:ct10:cvrStandard:
dmi.product.family: CFL
dmi.product.name: XMG NEO 15 - XNE15M18
dmi.product.sku: XNE15M18
dmi.product.version: Standard
dmi.sys.vendor: SchenkerTechnologiesGmbH

Juan Carlos Carvajal Bermúdez (jucajuca) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1852479

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Juan Carlos Carvajal Bermúdez (jucajuca) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected eoan
description: updated
Juan Carlos Carvajal Bermúdez (jucajuca) wrote : AudioDevicesInUse.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : CRDA.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : CurrentDmesg.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : IwConfig.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : Lspci.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : Lsusb.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : ProcCpuinfo.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : ProcCpuinfoMinimal.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : ProcEnviron.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : ProcInterrupts.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : ProcModules.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : PulseList.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : RfKill.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : UdevDb.txt

apport information

Juan Carlos Carvajal Bermúdez (jucajuca) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

ProcVersionSignature: Ubuntu 5.3.0-20.21+system76~1572304854~19.10~8caa3e6-generic 5.3.7

Seems like it's a patched kernel. Maybe file a bug at Pop!_OS instead?

Juan Carlos Carvajal Bermúdez (jucajuca) wrote :

After digging a bit, I am pretty sure that the problem is similar to this one:

https://bugzilla.kernel.org/show_bug.cgi?id=195039

I also have a Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983.
It just dies at random. Any updates from the kernel side?

Kai-Heng Feng (kaihengfeng) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

This probably needs a minor adjustment to match your NVMe.

ulot0 (ulot0) wrote :

Same problem here.
Dell G3 notebook, Linux XXX 5.4.0-31-generic # Ubuntu SMP Thu May 7 20:20:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
After about 24 hours of running, it hung and could only be recovered by a forced restart. I can't be sure that the disk is the cause, but the machine uses a 1 TB Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 drive. At the time, only the Chrome browser was open, with two or three pages playing online video.

Juan Carlos Carvajal Bermúdez (jucajuca) wrote :

Here is part of my /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash modprobe.blacklist=nouveau nvme_core.default_ps_max_latency_us=5500 pcie_aspm=off"
GRUB_CMDLINE_LINUX="nouveau.modeset=0"

For me pcie_aspm=off was the parameter that helped solve the issue.
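
To confirm that such parameters actually reached the running kernel, the command line can be parsed and compared against the expected values. This is a minimal sketch under stated assumptions: the helper names `parse_cmdline` and `missing_params` are hypothetical, and the expected values are simply the ones quoted in the grub snippet above.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a {parameter: value} dict.

    Bare flags such as "quiet" map to None. Only the first "=" splits,
    so "root=UUID=..." keeps "UUID=..." as its value.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline, wanted):
    """Return the wanted parameters that are absent or set to another value."""
    params = parse_cmdline(cmdline)
    return {
        key: value
        for key, value in wanted.items()
        if params.get(key) != value
    }


# Values taken from the grub snippet above.
WANTED = {
    "pcie_aspm": "off",
    "nvme_core.default_ps_max_latency_us": "5500",
}
```

On a running system the input would be the contents of /proc/cmdline, for example `missing_params(open("/proc/cmdline").read(), WANTED)`. An empty result means both parameters are active. On Ubuntu, edits to /etc/default/grub only take effect after running update-grub and rebooting.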

For more info, see:

https://wiki.archlinux.org/index.php/Solid_state_drive/NVMe

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1746340#yui_3_10_3_1_1590400769591_1615
