Ubuntu 20.04 crashing on Dell 5820 X-Series desktop

Bug #1922904 reported by Jay
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: -

Bug Description

Hello Community,

Please find below the crash trace we get when initiating a file copy, either remotely or locally.

Ubuntu 20.04.2 LTS
5.8.0-48-generic

Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
 Manufacturer: Dell Inc.
 Product Name: Precision 5820 Tower X-Series

#cat /proc/version_signature
Ubuntu 5.8.0-48.54~20.04.1-generic 5.8.18

Also, on Ubuntu 18.04 this SSD is not detected when booting the Ubuntu 18.04.5 ISO. I have already changed the AHCI mode to SATA.

====
Apr 7 14:02:19 localhost kernel: [ 1662.528149] general protection fault, probably for non-canonical address 0xffef8f421f91d4b0: 0000 [#1] SMP NOPTI
Apr 7 14:02:19 localhost kernel: [ 1662.528159] CPU: 11 PID: 3377 Comm: rsync Tainted: P OE 5.8.0-48-generic #54~20.04.1-Ubuntu
Apr 7 14:02:19 localhost kernel: [ 1662.528162] Hardware name: Dell Inc. Precision 5820 Tower X-Series/0X75JG, BIOS 2.8.0 01/15/2021
Apr 7 14:02:19 localhost kernel: [ 1662.528171] RIP: 0010:kmem_cache_alloc+0x89/0x230
Apr 7 14:02:19 localhost kernel: [ 1662.528175] Code: 08 65 4c 03 05 70 a6 75 78 49 83 78 10 00 4d 8b 20 0f 84 8e 01 00 00 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
Apr 7 14:02:19 localhost kernel: [ 1662.528178] RSP: 0018:ffff9de9c0ce7b30 EFLAGS: 00010292
Apr 7 14:02:19 localhost kernel: [ 1662.528181] RAX: ffef8f421f91d4b0 RBX: 0000000000400000 RCX: ffff8f46506cf000
Apr 7 14:02:19 localhost kernel: [ 1662.528183] RDX: 00000000000a5d5e RSI: 0000000000408d40 RDI: 00002ea31f430bd0
Apr 7 14:02:19 localhost kernel: [ 1662.528185] RBP: ffff9de9c0ce7b60 R08: ffffbde9bf0f0bd0 R09: ffff8f4689fbd428
Apr 7 14:02:19 localhost kernel: [ 1662.528188] R10: 0000000000000000 R11: 000000000000c0c6 R12: ffef8f421f91d478
Apr 7 14:02:19 localhost kernel: [ 1662.528189] R13: 0000000000408d40 R14: ffff8f4699422d80 R15: ffff8f4691f8af40
Apr 7 14:02:19 localhost kernel: [ 1662.528192] FS: 00007f6494a64740(0000) GS:ffff8f469fcc0000(0000) knlGS:0000000000000000
Apr 7 14:02:19 localhost kernel: [ 1662.528195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 7 14:02:19 localhost kernel: [ 1662.528196] CR2: 00007f0680865000 CR3: 0000001f82076005 CR4: 00000000003606e0
Apr 7 14:02:19 localhost kernel: [ 1662.528199] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 7 14:02:19 localhost kernel: [ 1662.528200] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 7 14:02:19 localhost kernel: [ 1662.528202] Call Trace:
Apr 7 14:02:19 localhost kernel: [ 1662.528213] ? alloc_buffer_head+0x1f/0x70
Apr 7 14:02:19 localhost kernel: [ 1662.528218] alloc_buffer_head+0x1f/0x70
Apr 7 14:02:19 localhost kernel: [ 1662.528222] alloc_page_buffers+0xa3/0x160
Apr 7 14:02:19 localhost kernel: [ 1662.528227] create_empty_buffers+0x1e/0x110
Apr 7 14:02:19 localhost kernel: [ 1662.528234] ext4_block_write_begin+0x40f/0x520
Apr 7 14:02:19 localhost kernel: [ 1662.528239] ? jbd2__journal_start+0x101/0x200
Apr 7 14:02:19 localhost kernel: [ 1662.528243] ? ext4_da_map_blocks.constprop.0+0x380/0x380
Apr 7 14:02:19 localhost kernel: [ 1662.528247] ? __ext4_journal_start_sb+0x70/0x140
Apr 7 14:02:19 localhost kernel: [ 1662.528251] ext4_da_write_begin+0x1de/0x480
Apr 7 14:02:19 localhost kernel: [ 1662.528257] generic_perform_write+0xc2/0x1c0
Apr 7 14:02:19 localhost kernel: [ 1662.528263] ext4_buffered_write_iter+0x90/0x140
Apr 7 14:02:19 localhost kernel: [ 1662.528268] ext4_file_write_iter+0x50/0x220
Apr 7 14:02:19 localhost kernel: [ 1662.528273] new_sync_write+0x113/0x1a0
Apr 7 14:02:19 localhost kernel: [ 1662.528277] vfs_write+0x1c5/0x200
Apr 7 14:02:19 localhost kernel: [ 1662.528281] ksys_write+0x67/0xe0
Apr 7 14:02:19 localhost kernel: [ 1662.528284] __x64_sys_write+0x1a/0x20
Apr 7 14:02:19 localhost kernel: [ 1662.528293] do_syscall_64+0x49/0xc0
Apr 7 14:02:19 localhost kernel: [ 1662.528298] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 7 14:02:19 localhost kernel: [ 1662.528302] RIP: 0033:0x7f6494b781e7
Apr 7 14:02:19 localhost kernel: [ 1662.528305] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Apr 7 14:02:19 localhost kernel: [ 1662.528307] RSP: 002b:00007fffffd4f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Apr 7 14:02:19 localhost kernel: [ 1662.528310] RAX: ffffffffffffffda RBX: 000055cd3c2bba50 RCX: 00007f6494b781e7
Apr 7 14:02:19 localhost kernel: [ 1662.528312] RDX: 0000000000040000 RSI: 000055cd3c2bba50 RDI: 0000000000000001
Apr 7 14:02:19 localhost kernel: [ 1662.528314] RBP: 0000000000000001 R08: 0000000000004043 R09: 000055cd3c2acf27
Apr 7 14:02:19 localhost kernel: [ 1662.528315] R10: 00000000c355e316 R11: 0000000000000246 R12: 0000000000040000
Apr 7 14:02:19 localhost kernel: [ 1662.528317] R13: 000055cd3c2aba00 R14: 0000000000001527 R15: 00007fffffd4f6f8
Apr 7 14:02:19 localhost kernel: [ 1662.528320] Modules linked in: rfcomm ccm typec_displayport cmac algif_hash joydev algif_skcipher af_alg bnep hid_generic intel_rapl_msr usbhid intel_rapl_common hid nvidia_uvm(OE) nvidia_drm(POE) nvidia_modeset(POE) isst_if_common nls_iso8859_1 nvidia(POE) nfit x86_pkg_temp_thermal intel_powerclamp coretemp uas kvm_intel btusb iwlmvm btrtl kvm btbcm btintel mac80211 bluetooth crct10dif_pclmul snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic aesni_intel ecdh_generic usb_storage ledtrig_audio snd_hda_codec_hdmi dell_smm_hwmon libarc4 ecc crypto_simd cryptd glue_helper dell_wmi iwlwifi rapl snd_hda_intel intel_cstate dell_smbios snd_seq_midi drm_kms_helper sparse_keymap snd_intel_dspcfg apple_mfi_fastcharge snd_seq_midi_event ucsi_ccg dcdbas snd_rawmidi cec typec_ucsi video snd_hda_codec rc_core cfg80211 typec fb_sys_fops dell_wmi_descriptor wmi_bmof syscopyarea input_leds snd_hda_core snd_seq sysfillrect snd_hwdep sysimgblt intel_wmi_thunderbolt snd_seq_device snd_pcm
Apr 7 14:02:19 localhost kernel: [ 1662.528384] serio_raw efi_pstore snd_timer mei_me snd mei soundcore ioatdma mac_hid acpi_tad sch_fq_codel parport_pc ppdev lp parport drm ip_tables x_tables autofs4 nvme nvme_core crc32_pclmul vmd i2c_nvidia_gpu igb e1000e ahci i2c_i801 xhci_pci i2c_algo_bit i2c_smbus dca libahci xhci_pci_renesas wmi
Apr 7 14:02:19 localhost kernel: [ 1662.528413] ---[ end trace 28a397be443c33d4 ]---
====
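
The faulting address in RAX, 0xffef8f421f91d4b0, can be checked against the canonical-address rule directly. The sketch below assumes 48-bit virtual addresses (4-level paging); the helper names `is_canonical` and `single_bit_repairs` are illustrative and do not come from any kernel tooling. It tests whether bits 63..47 all agree, then lists the single-bit flips that would make the address canonical again.

```python
VA_BITS = 48  # assumption: 4-level paging, 48-bit virtual addresses


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def single_bit_repairs(addr: int, va_bits: int = VA_BITS) -> list:
    """List the bit positions whose flip alone turns addr into a canonical address."""
    return [bit for bit in range(64) if is_canonical(addr ^ (1 << bit), va_bits)]


# Register values copied from the trace above.
RAX = 0xFFEF8F421F91D4B0
R12 = 0xFFEF8F421F91D478
RCX = 0xFFFF8F46506CF000

if __name__ == "__main__":
    for name, value in (("RAX", RAX), ("R12", R12), ("RCX", RCX)):
        print(name, hex(value), is_canonical(value), single_bit_repairs(value))
```

For both RAX and R12 the only single-bit change that yields a canonical address is bit 52, which would give an address starting with 0xffff8f42, in the same range as the other kernel pointers in the register dump. A one-bit difference like this is one pattern that bad memory can produce, although the trace alone does not prove it.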
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.16
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gdm 1134 F.... pulseaudio
 /dev/snd/controlC1: gdm 1134 F.... pulseaudio
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-03-24 (13 days ago)
InstallationMedia: Ubuntu 20.04.2.0 LTS "Focal Fossa" - Release amd64 (20210209.1)
MachineType: Dell Inc. Precision 5820 Tower X-Series
NonfreeKernelModules: nvidia_modeset nvidia
Package: linux (not installed)
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.8.0-48-generic root=UUID=67802464-049b-4c4f-96aa-cbe76132ef96 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.8.0-48.54~20.04.1-generic 5.8.18
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-48-generic N/A
 linux-backports-modules-5.8.0-48-generic N/A
 linux-firmware 1.187.10
Tags: focal
Uname: Linux 5.8.0-48-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 01/15/2021
dmi.bios.release: 2.8
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.8.0
dmi.board.name: 0X75JG
dmi.board.vendor: Dell Inc.
dmi.board.version: A01
dmi.chassis.asset.tag: IT-12926
dmi.chassis.type: 3
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.8.0:bd01/15/2021:br2.8:svnDellInc.:pnPrecision5820TowerX-Series:pvr:rvnDellInc.:rn0X75JG:rvrA01:cvnDellInc.:ct3:cvr:
dmi.product.family: Precision
dmi.product.name: Precision 5820 Tower X-Series
dmi.product.sku: 08B1
dmi.sys.vendor: Dell Inc.

Jay (jayram1989) wrote :
tags: added: apport-collected focal
description: updated
Jay (jayram1989) wrote : AlsaInfo.txt

apport information

Jay (jayram1989) wrote : CRDA.txt

apport information

Jay (jayram1989) wrote : CurrentDmesg.txt

apport information

Jay (jayram1989) wrote : IwConfig.txt

apport information

Jay (jayram1989) wrote : Lspci.txt

apport information

Jay (jayram1989) wrote : Lspci-vt.txt

apport information

Jay (jayram1989) wrote : Lsusb.txt

apport information

Jay (jayram1989) wrote : Lsusb-t.txt

apport information

Jay (jayram1989) wrote : Lsusb-v.txt

apport information

Jay (jayram1989) wrote : ProcCpuinfo.txt

apport information

Jay (jayram1989) wrote : ProcCpuinfoMinimal.txt

apport information

Jay (jayram1989) wrote : ProcEnviron.txt

apport information

Jay (jayram1989) wrote : ProcInterrupts.txt

apport information

Jay (jayram1989) wrote : ProcModules.txt

apport information

tags: added: kernel-bug
removed: apport-collected focal
Jay (jayram1989) wrote : RfKill.txt

apport information

Jay (jayram1989) wrote : UdevDb.txt

apport information

Jay (jayram1989) wrote : WifiSyslog.txt

apport information

Jay (jayram1989) wrote : acpidump.txt

apport information

Jay (jayram1989) wrote :

Previously there was an issue with a tracker-mine crash. I removed that package as well, but still no luck. Please let me know which Ubuntu version is stable on this model.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: groovy
Jay (jayram1989) wrote :

Notes
===
The crash also occurs with 18.04 LTS.
I suspect the driver for this SSD model is causing the crash.

The syslog entries below are for your reference.
===

Apr 8 13:12:25 localhost kernel: [ 773.855092] show_signal: 35 callbacks suppressed
Apr 8 13:12:25 localhost kernel: [ 773.855094] traps: dpkg-query[8579] general protection fault ip:7f008f209f11 sp:7ffc1d8d16a8 error:0 in libc-2.27.so[7f008f07f000+1e7000]
Apr 8 13:12:25 localhost kernel: [ 773.913058] traps: apport[8581] general protection fault ip:544414 sp:7ffe1f3ff9f0 error:0 in python3.6[400000+3b4000]
Apr 8 13:12:25 localhost kernel: [ 773.913070] Process 8581(apport) has RLIMIT_CORE set to 1
Apr 8 13:12:25 localhost kernel: [ 773.913070] Aborting core
Apr 8 13:12:27 localhost kernel: [ 776.151560] traps: dpkg-query[8582] general protection fault ip:7f279c3f8f11 sp:7ffc81d44d18 error:0 in libc-2.27.so[7f279c26e000+1e7000]
Apr 8 13:12:31 localhost kernel: [ 780.513187] XFS (nvme0n1): Unmounting Filesystem

Apr 8 13:13:02 localhost kernel: [ 810.919029] traps: mkfs.ext4[8587] general protection fault ip:7f003ac171bb sp:7ffcd7625470 error:0 in libext2fs.so.2.4[7f003ac00000+4d000]

Apr 8 13:13:16 localhost kernel: [ 825.229643] traps: apport[8598] general protection fault ip:59cb64 sp:7ffcea1e5cf8 error:0 in python3.6[400000+3b4000]
Apr 8 13:13:16 localhost kernel: [ 825.229657] Process 8598(apport) has RLIMIT_CORE set to 1
Apr 8 13:13:16 localhost kernel: [ 825.229657] Aborting core
Apr 8 13:13:36 localhost kernel: [ 844.807235] traps: apport-checkrep[8604] general protection fault ip:59cb60 sp:7fff43e688b8 error:0 in python3.6[400000+3b4000]
Apr 8 13:13:36 localhost kernel: [ 844.991014] apport-checkrep[8609]: segfault at 8 ip 000000000054dc65 sp 00007ffe23d42520 error 6 in python3.6[400000+3b4000]
Apr 8 13:13:36 localhost kernel: [ 844.991018] Code: 46 28 00 00 00 00 49 89 4e 20 48 85 c9 74 04 4c 89 71 28 4c 89 35 b3 0f 52 00 e9 c6 fc ff ff 4c 8b 52 e8 4c 8b 4a f0 4d 89 11 <4d> 89 4a 08 48 c7 42 e8 00 00 00 00 e9 7d fd ff ff 48 89 fe 48 8b
Apr 8 13:14:09 localhost kernel: [ 878.006289] traps: apport[8637] general protection fault ip:59cb60 sp:7ffe82f661f8 error:0 in python3.6[400000+3b4000]
Apr 8 13:14:09 localhost kernel: [ 878.006303] Process 8637(apport) has RLIMIT_CORE set to 1
Apr 8 13:14:09 localhost kernel: [ 878.006303] Aborting core
Apr 8 13:14:33 localhost systemd[1]: Starting Cleanup of Temporary Directories...
Apr 8 13:14:33 localhost systemd[1]: Started Cleanup of Temporary Directories.
Apr 8 13:15:01 localhost kernel: [ 929.918109] traps: apt-config[8682] general protection fault ip:7f56f13e3ed0 sp:7ffeef400768 error:0 in libapt-pkg.so.5.0.2[7f56f13a2000+1b6000]
Apr 8 13:15:01 localhost kernel: [ 929.928414] traps: apport[8683] general protection fault ip:50cdaa sp:7ffd9052a960 error:0 in python3.6[400000+3b4000]
Apr 8 13:15:01 localhost kernel: [ 929.928436] Process 8683(apport) has RLIMIT_CORE set to 1
Apr 8 13:15:01 localhost kernel: [ 929.928436] Aborting core
Apr 8 13:15:01 localhost kernel: [ 930.422542] traps: apt-config[8868] general protection fault ip:7f867f5347f8 sp:7ffcb8a4a8e8 error:0 in ld-2.27.so[7f867f516000+29000]
Apr 8 13:15:01 localhost kernel: [ ...
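
The `traps:` lines above can be summarized by subtracting the mapping base from the faulting instruction pointer, which gives the offset inside the mapped object. This is a minimal sketch; the regular expression and the `parse_traps` helper are illustrative assumptions matched only against the line format shown in this log, and the sample lines are copied from it.

```python
import re

# Matches lines such as:
# traps: dpkg-query[8579] general protection fault ip:... sp:... error:0 in libc-2.27.so[base+size]
TRAP_RE = re.compile(
    r"traps: (?P<comm>[^\[]+)\[(?P<pid>\d+)\] general protection fault "
    r"ip:(?P<ip>[0-9a-f]+) sp:[0-9a-f]+ error:[0-9a-f]+ "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_traps(lines):
    """Return (command, mapped object, offset of ip inside the object) per trap line."""
    results = []
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            offset = int(match.group("ip"), 16) - int(match.group("base"), 16)
            results.append((match.group("comm"), match.group("obj"), offset))
    return results


SAMPLE = [
    "traps: dpkg-query[8579] general protection fault ip:7f008f209f11 sp:7ffc1d8d16a8 error:0 in libc-2.27.so[7f008f07f000+1e7000]",
    "traps: dpkg-query[8582] general protection fault ip:7f279c3f8f11 sp:7ffc81d44d18 error:0 in libc-2.27.so[7f279c26e000+1e7000]",
    "traps: mkfs.ext4[8587] general protection fault ip:7f003ac171bb sp:7ffcd7625470 error:0 in libext2fs.so.2.4[7f003ac00000+4d000]",
    "traps: apt-config[8682] general protection fault ip:7f56f13e3ed0 sp:7ffeef400768 error:0 in libapt-pkg.so.5.0.2[7f56f13a2000+1b6000]",
]

if __name__ == "__main__":
    for comm, obj, offset in parse_traps(SAMPLE):
        print(f"{comm:12s} {obj:22s} +{offset:#x}")
```

In this sample the faults land in three unrelated libraries, and the two dpkg-query processes fault at the same offset inside libc-2.27.so. Faults spread across unrelated binaries tend to point away from any single program, but that is an interpretation of the log, not something it proves.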


Jay (jayram1989) wrote :

SSD model

nvme list
Node          SN              Model                       Namespace  Usage                Format       FW Rev
------------  --------------  --------------------------  ---------  -------------------  -----------  --------
/dev/nvme0n1  S4UFNF0NC01090  PM981a NVMe SAMSUNG 2048GB  1          165.28 GB / 2.05 TB  512 B + 0 B  15302229

Jay (jayram1989) wrote :

dmesg | grep nvme
[ 1.797280] nvme nvme0: pci function 10000:01:00.0
[ 1.797290] nvme 10000:01:00.0: PCI INT A: not connected
[ 2.013709] nvme nvme0: missing or invalid SUBNQN field.
[ 2.013727] nvme nvme0: Shutdown timeout set to 8 seconds
[ 2.038879] nvme nvme0: 28/0/0 default/read/poll queues
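
The PCI address 10000:01:00.0 in the output above has a five-digit domain. The sketch below splits such an address into its fields and flags domains above 0xffff; `PciAddress`, `parse_pci_address` and `beyond_16bit_segment` are hypothetical helper names. Domains above 0xffff cannot come from a 16-bit firmware segment number; on Linux they are typically synthetic domains created by a driver such as Intel VMD, and `vmd` does appear in the module list of the trace. Treat that reading as an assumption rather than a confirmed diagnosis.

```python
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int
    bus: int
    device: int
    function: int


def parse_pci_address(text: str) -> PciAddress:
    """Parse 'domain:bus:device.function' (all fields hexadecimal) as printed by the kernel."""
    domain, bus, devfn = text.split(":")
    device, function = devfn.split(".")
    return PciAddress(int(domain, 16), int(bus, 16), int(device, 16), int(function, 16))


def beyond_16bit_segment(addr: PciAddress) -> bool:
    """True if the domain does not fit in a 16-bit PCI segment number."""
    return addr.domain > 0xFFFF


if __name__ == "__main__":
    nvme = parse_pci_address("10000:01:00.0")  # from the dmesg output above
    print(nvme, beyond_16bit_segment(nvme))
```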

Jay (jayram1989) wrote :

smartctl --all /dev/nvme0
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-70-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: PM981a NVMe SAMSUNG 2048GB
Serial Number: S4UFNF0NC01090
Firmware Version: 15302229
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 2,048,408,248,320 [2.04 TB]
Unallocated NVM Capacity: 0
Controller ID: 4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 2,048,408,248,320 [2.04 TB]
Namespace 1 Utilization: 1,002,229,760 [1.00 GB]
Namespace 1 Formatted LBA Size: 512
Local Time is: Sun Apr 11 21:00:29 2021 +04
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL *Other*
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat *Other*
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 83 Celsius
Critical Comp. Temp. Threshold: 85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.60W       -        -    0  0  0  0        0       0
 1 +     4.40W       -        -    1  1  1  1        0       0
 2 +     3.10W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 37 Celsius
Available Spare: 100%
Available Spare Threshold: 50%
Percentage Used: 0%
Data Units Read: 58,201 [29.7 GB]
Data Units Written: 633,157 [324 GB]
Host Read Commands: 774,623
Host Write Commands: 1,252,887
Controller Busy Time: 7
Power Cycles: 228
Power On Hours: 56
Unsafe Shutdowns: 189
Media and Data Integrity Errors: 0
Error Information Log Entries: 898
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 37 Celsius
Temperature Sensor 2: 32 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        898     0  0x0000  0x4004  0x004            0     1     -
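
A few of the health counters above are easier to read after a little arithmetic. The sketch below parses `Key: value` lines copied from the smartctl output and converts data units to bytes; NVMe reports data units in thousands of 512-byte blocks. The `parse_health` helper is a hypothetical name and handles only the plain integer fields shown here.

```python
DATA_UNIT_BYTES = 512 * 1000  # one NVMe data unit = 1000 blocks of 512 bytes

# Lines copied from the SMART/Health section above.
HEALTH = """\
Data Units Read: 58,201 [29.7 GB]
Data Units Written: 633,157 [324 GB]
Power Cycles: 228
Unsafe Shutdowns: 189
Media and Data Integrity Errors: 0
Error Information Log Entries: 898
"""


def parse_health(text: str) -> dict:
    """Map 'Key: value' lines to ints, ignoring thousands separators and trailing units."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        tokens = value.split()
        if not sep or not tokens:
            continue
        digits = tokens[0].replace(",", "").rstrip("%")
        if digits.isdigit():
            fields[key.strip()] = int(digits)
    return fields


if __name__ == "__main__":
    health = parse_health(HEALTH)
    written = health["Data Units Written"] * DATA_UNIT_BYTES
    unsafe = health["Unsafe Shutdowns"] / health["Power Cycles"]
    print(f"written: {written / 1e9:.1f} GB, unsafe shutdowns: {unsafe:.0%} of power cycles")
```

By these numbers, 189 of 228 power cycles ended in an unsafe shutdown, which would fit a machine that has been crashing or hard-reset repeatedly, while the drive itself reports zero media and data integrity errors. That reading is a sketch of how to interpret the counters, not a diagnosis.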

Jay (jayram1989) wrote :

Fix
===

The issue was caused by faulty RAM. After replacing it with a new module, there are no further issues.
Hence closing this bug as fixed.

Changed in linux (Ubuntu):
status: Confirmed → Invalid