Spontaneous NVIDIA panic
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
nvidia-graphics-drivers-470 (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
All graphics and USB devices spontaneously halted.
SSH, networking, and filesystems remained working.
Xorg hung and could not be made to exit, even after sending it an ABRT signal.
"shutdown" command never completed.
Aug 31 18:17:12 fuel kernel: [ 3494.973431] BUG: unable to handle page fault for address: 0000000000020018
Aug 31 18:17:12 fuel kernel: [ 3494.973436] #PF: supervisor read access in kernel mode
Aug 31 18:17:12 fuel kernel: [ 3494.973438] #PF: error_code(0x0000) - not-present page
Aug 31 18:17:12 fuel kernel: [ 3494.973439] PGD 0 P4D 0
Aug 31 18:17:12 fuel kernel: [ 3494.973442] Oops: 0000 [#1] SMP NOPTI
Aug 31 18:17:12 fuel kernel: [ 3494.973444] CPU: 2 PID: 33911 Comm: chrome Tainted: P OE 5.11.0-31-generic #33-Ubuntu
Aug 31 18:17:12 fuel kernel: [ 3494.973447] Hardware name: Gigabyte Technology Co., Ltd. Default string/X99P-SLI-CF, BIOS F25b 03/13/2018
Aug 31 18:17:12 fuel kernel: [ 3494.973448] RIP: 0010:_nv028963r
Aug 31 18:17:12 fuel kernel: [ 3494.973792] Code: 57 10 31 c0 48 85 d2 74 2e 48 8b 4f 08 31 c0 48 85 c9 74 0d 48 63 41 14 48 89 d6 48 29 c6 48 89 f0 48 3b 57 18 48 89 07 74 1b <48> 8b 42 08 48 89 47 10 b8 01 00 00 00 48 83 c4 08 c3 66 0f 1f 84
Aug 31 18:17:12 fuel kernel: [ 3494.973795] RSP: 0018:ffffb11742
Aug 31 18:17:12 fuel kernel: [ 3494.973797] RAX: 0000000000020010 RBX: ffff9c819c336c30 RCX: ffff9c7e48cec978
Aug 31 18:17:12 fuel kernel: [ 3494.973798] RDX: 0000000000020010 RSI: 0000000000020010 RDI: ffff9c7cf6695d20
Aug 31 18:17:12 fuel kernel: [ 3494.973800] RBP: ffff9c7cf6695d20 R08: 0000000000000020 R09: ffff9c7cf6695d28
Aug 31 18:17:12 fuel kernel: [ 3494.973801] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c7d3d867738
Aug 31 18:17:12 fuel kernel: [ 3494.973802] R13: ffff9c7e1c027060 R14: ffff9c7cf6695d98 R15: ffff9c819c336c30
Aug 31 18:17:12 fuel kernel: [ 3494.973804] FS: 000000000000000
Aug 31 18:17:12 fuel kernel: [ 3494.973806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 31 18:17:12 fuel kernel: [ 3494.973807] CR2: 0000000000020018 CR3: 00000006dcc10005 CR4: 00000000003706e0
Aug 31 18:17:12 fuel kernel: [ 3494.973809] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 31 18:17:12 fuel kernel: [ 3494.973810] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 31 18:17:12 fuel kernel: [ 3494.973811] Call Trace:
Aug 31 18:17:12 fuel kernel: [ 3494.973814] ? _nv035844rm+
Aug 31 18:17:12 fuel kernel: [ 3494.974111] ? _nv014655rm+
Aug 31 18:17:12 fuel kernel: [ 3494.974414] ? _nv037695rm+
Aug 31 18:17:12 fuel kernel: [ 3494.974719] ? _nv037694rm+
Aug 31 18:17:12 fuel kernel: [ 3494.975018] ? _nv037689rm+
Aug 31 18:17:12 fuel kernel: [ 3494.975317] ? _nv037690rm+
Aug 31 18:17:12 fuel kernel: [ 3494.975618] ? _nv036056rm+
Aug 31 18:17:12 fuel kernel: [ 3494.975840] ? _nv000699rm+
Aug 31 18:17:12 fuel kernel: [ 3494.976092] ? rm_cleanup_
Aug 31 18:17:12 fuel kernel: [ 3494.976341] ? fsnotify+
Aug 31 18:17:12 fuel kernel: [ 3494.976347] ? nvidia_
Aug 31 18:17:12 fuel kernel: [ 3494.976539] ? nvidia_
Aug 31 18:17:12 fuel kernel: [ 3494.976728] ? __fput+0x9f/0x250
Aug 31 18:17:12 fuel kernel: [ 3494.976731] ? ____fput+0xe/0x10
Aug 31 18:17:12 fuel kernel: [ 3494.976733] ? task_work_
Aug 31 18:17:12 fuel kernel: [ 3494.976738] ? do_exit+0x233/0x3e0
Aug 31 18:17:12 fuel kernel: [ 3494.976742] ? do_group_
Aug 31 18:17:12 fuel kernel: [ 3494.976744] ? __x64_sys_
Aug 31 18:17:12 fuel kernel: [ 3494.976747] ? do_syscall_
Aug 31 18:17:12 fuel kernel: [ 3494.976749] ? entry_SYSCALL_
Aug 31 18:17:12 fuel kernel: [ 3494.976754] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables libcrc32c nfnetlink ccm md4 cmac nls_utf8 cifs libarc4 fscache libdes binfmt_misc snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common snd_hda_
Aug 31 18:17:12 fuel kernel: [ 3494.976804] x_tables autofs4 hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core crc32_pclmul e1000e drm i2c_i801 ahci i2c_smbus lpc_ich thunderbolt libahci nvme xhci_pci nvme_core xhci_pci_renesas wmi
Aug 31 18:17:12 fuel kernel: [ 3494.976823] CR2: 0000000000020018
Aug 31 18:17:12 fuel kernel: [ 3494.976824] ---[ end trace 16a13b81d497db97 ]---
Aug 31 18:17:12 fuel kernel: [ 3494.996770] RIP: 0010:_nv028963r
Aug 31 18:17:12 fuel kernel: [ 3494.997107] Code: 57 10 31 c0 48 85 d2 74 2e 48 8b 4f 08 31 c0 48 85 c9 74 0d 48 63 41 14 48 89 d6 48 29 c6 48 89 f0 48 3b 57 18 48 89 07 74 1b <48> 8b 42 08 48 89 47 10 b8 01 00 00 00 48 83 c4 08 c3 66 0f 1f 84
Aug 31 18:17:12 fuel kernel: [ 3494.997109] RSP: 0018:ffffb11742
Aug 31 18:17:12 fuel kernel: [ 3494.997112] RAX: 0000000000020010 RBX: ffff9c819c336c30 RCX: ffff9c7e48cec978
Aug 31 18:17:12 fuel kernel: [ 3494.997113] RDX: 0000000000020010 RSI: 0000000000020010 RDI: ffff9c7cf6695d20
Aug 31 18:17:12 fuel kernel: [ 3494.997115] RBP: ffff9c7cf6695d20 R08: 0000000000000020 R09: ffff9c7cf6695d28
Aug 31 18:17:12 fuel kernel: [ 3494.997116] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c7d3d867738
Aug 31 18:17:12 fuel kernel: [ 3494.997117] R13: ffff9c7e1c027060 R14: ffff9c7cf6695d98 R15: ffff9c819c336c30
Aug 31 18:17:12 fuel kernel: [ 3494.997119] FS: 000000000000000
Aug 31 18:17:12 fuel kernel: [ 3494.997121] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 31 18:17:12 fuel kernel: [ 3494.997122] CR2: 0000000000020018 CR3: 00000004b3340006 CR4: 00000000003706e0
Aug 31 18:17:12 fuel kernel: [ 3494.997124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 31 18:17:12 fuel kernel: [ 3494.997125] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 31 18:17:12 fuel kernel: [ 3494.997127] Fixing recursive fault but reboot is needed!
Aug 31 18:17:17 fuel kernel: [ 3499.968653] BUG: kernel NULL pointer dereference, address: 0000000000000009
Aug 31 18:17:17 fuel kernel: [ 3499.968663] #PF: supervisor read access in kernel mode
Aug 31 18:17:17 fuel kernel: [ 3499.968667] #PF: error_code(0x0000) - not-present page
Aug 31 18:17:17 fuel kernel: [ 3499.968670] PGD 0 P4D 0
Aug 31 18:17:17 fuel kernel: [ 3499.968677] Oops: 0000 [#2] SMP NOPTI
Aug 31 18:17:17 fuel kernel: [ 3499.968682] CPU: 11 PID: 3575 Comm: Xorg Tainted: P D OE 5.11.0-31-generic #33-Ubuntu
Aug 31 18:17:17 fuel kernel: [ 3499.968688] Hardware name: Gigabyte Technology Co., Ltd. Default string/X99P-SLI-CF, BIOS F25b 03/13/2018
Aug 31 18:17:17 fuel kernel: [ 3499.968691] RIP: 0010:_nv010150r
Aug 31 18:17:17 fuel kernel: [ 3499.969503] Code: 07 0f 1f 44 00 00 31 d2 48 8b 07 48 85 c0 75 1a e9 a1 02 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 29 02 00 00 48 8b 48 18 48 85 c9 75 e9 48 89
Aug 31 18:17:17 fuel kernel: [ 3499.969509] RSP: 0018:ffffb11741
Aug 31 18:17:17 fuel kernel: [ 3499.969514] RAX: 0000000000000009 RBX: ffff9c7db0945ee8 RCX: 0000000000000009
Aug 31 18:17:17 fuel kernel: [ 3499.969517] RDX: ffff9c7db0945f38 RSI: 0000000000000df7 RDI: ffffffffc25a7498
Aug 31 18:17:17 fuel kernel: [ 3499.969521] RBP: ffff9c7db0945ed0 R08: ffffb11741adfe44 R09: ffff9c7db0945ee8
Aug 31 18:17:17 fuel kernel: [ 3499.969524] R10: ffffffffc067ce80 R11: 0000000000000000 R12: ffffffffc067ceb5
Aug 31 18:17:17 fuel kernel: [ 3499.969527] R13: 0000000000000004 R14: ffffffffc25a87a0 R15: 0000000000010008
Aug 31 18:17:17 fuel kernel: [ 3499.969531] FS: 00007f7d9219fa4
Aug 31 18:17:17 fuel kernel: [ 3499.969536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 31 18:17:17 fuel kernel: [ 3499.969540] CR2: 0000000000000009 CR3: 0000000171e5e003 CR4: 00000000003706e0
Aug 31 18:17:17 fuel kernel: [ 3499.969544] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 31 18:17:17 fuel kernel: [ 3499.969547] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 31 18:17:17 fuel kernel: [ 3499.969550] Call Trace:
Aug 31 18:17:17 fuel kernel: [ 3499.969554] ? _nv039714rm+
Aug 31 18:17:17 fuel kernel: [ 3499.969946] ? _nv036047rm+
Aug 31 18:17:17 fuel kernel: [ 3499.970456] ? _nv000724kms+
Aug 31 18:17:17 fuel kernel: [ 3499.970494] ? _nv010249rm+
Aug 31 18:17:17 fuel kernel: [ 3499.971002] ? _nv010248rm+
Aug 31 18:17:17 fuel kernel: [ 3499.971510] ? _nv010248rm+
Aug 31 18:17:17 fuel kernel: [ 3499.972017] ? rm_kernel_
Aug 31 18:17:17 fuel kernel: [ 3499.972657] ? nvidia_
Aug 31 18:17:17 fuel kernel: [ 3499.973037] ? nvkms_call_
Aug 31 18:17:17 fuel kernel: [ 3499.973070] ? _nv002512kms+
Aug 31 18:17:17 fuel kernel: [ 3499.973122] ? nvkms_copyin+
Aug 31 18:17:17 fuel kernel: [ 3499.973153] ? _nv000724kms+
Aug 31 18:17:17 fuel kernel: [ 3499.973185] ? nvKmsIoctl+
Aug 31 18:17:17 fuel kernel: [ 3499.973217] ? nvkms_ioctl+
Aug 31 18:17:17 fuel kernel: [ 3499.973248] ? nvidia_
Aug 31 18:17:17 fuel kernel: [ 3499.973629] ? __x64_sys_
Aug 31 18:17:17 fuel kernel: [ 3499.973639] ? do_syscall_
Aug 31 18:17:17 fuel kernel: [ 3499.973644] ? entry_SYSCALL_
Aug 31 18:17:17 fuel kernel: [ 3499.973655] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables libcrc32c nfnetlink ccm md4 cmac nls_utf8 cifs libarc4 fscache libdes binfmt_misc snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common snd_hda_
Aug 31 18:17:17 fuel kernel: [ 3499.973768] x_tables autofs4 hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core crc32_pclmul e1000e drm i2c_i801 ahci i2c_smbus lpc_ich thunderbolt libahci nvme xhci_pci nvme_core xhci_pci_renesas wmi
Aug 31 18:17:17 fuel kernel: [ 3499.973810] CR2: 0000000000000009
Aug 31 18:17:17 fuel kernel: [ 3499.973814] ---[ end trace 16a13b81d497db98 ]---
Aug 31 18:17:17 fuel kernel: [ 3500.022556] RIP: 0010:_nv028963r
Aug 31 18:17:17 fuel kernel: [ 3500.023346] Code: 57 10 31 c0 48 85 d2 74 2e 48 8b 4f 08 31 c0 48 85 c9 74 0d 48 63 41 14 48 89 d6 48 29 c6 48 89 f0 48 3b 57 18 48 89 07 74 1b <48> 8b 42 08 48 89 47 10 b8 01 00 00 00 48 83 c4 08 c3 66 0f 1f 84
Aug 31 18:17:17 fuel kernel: [ 3500.023351] RSP: 0018:ffffb11742
Aug 31 18:17:17 fuel kernel: [ 3500.023357] RAX: 0000000000020010 RBX: ffff9c819c336c30 RCX: ffff9c7e48cec978
Aug 31 18:17:17 fuel kernel: [ 3500.023361] RDX: 0000000000020010 RSI: 0000000000020010 RDI: ffff9c7cf6695d20
Aug 31 18:17:17 fuel kernel: [ 3500.023364] RBP: ffff9c7cf6695d20 R08: 0000000000000020 R09: ffff9c7cf6695d28
Aug 31 18:17:17 fuel kernel: [ 3500.023368] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c7d3d867738
Aug 31 18:17:17 fuel kernel: [ 3500.023371] R13: ffff9c7e1c027060 R14: ffff9c7cf6695d98 R15: ffff9c819c336c30
Aug 31 18:17:17 fuel kernel: [ 3500.023374] FS: 00007f7d9219fa4
Aug 31 18:17:17 fuel kernel: [ 3500.023379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 31 18:17:17 fuel kernel: [ 3500.023383] CR2: 0000000000000009 CR3: 0000000171e5e003 CR4: 00000000003706e0
Aug 31 18:17:17 fuel kernel: [ 3500.023386] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 31 18:17:17 fuel kernel: [ 3500.023389] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: nvidia-driver-470 470.57.
ProcVersionSign
Uname: Linux 5.11.0-31-generic x86_64
NonfreeKernelMo
ApportVersion: 2.20.11-0ubuntu65.1
Architecture: amd64
CasperMD5CheckR
CurrentDesktop: X-Cinnamon
Date: Tue Aug 31 18:29:27 2021
InstallationDate: Installed on 2018-10-28 (1038 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
SourcePackage: nvidia-
UpgradeStatus: Upgraded to hirsute on 2021-06-21 (71 days ago)
This happens to me too, but with Ubuntu 20.04. It started shortly after the latest nvidia-graphics-driver-470 update. Symptoms are:
* graphical output freezes completely
* no reaction to any kind of input (no switching to tty either)
* ssh works and shows Xorg as using 100% CPU
* Xorg process can't be killed
* shutdown command is not executed (only ssh connection is terminated)
My GPU: GTX 970. I've never had this problem before, and my PC has not been modified for years. Now it happens multiple times per day!
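For anyone who wants to check whether their own freeze matches the traces in the bug description, here is a minimal Python sketch that pulls the fault kind, faulting address, process name, and RIP symbol out of kernel log lines in the format shown above. The function name `summarize_oopses` and the regular expressions are my own assumptions based only on the lines in this report. Other kernel versions may word these messages differently, so treat it as a starting point rather than a robust parser.

```python
import re

# Patterns modelled on the log lines in this report (assumed, not exhaustive).
BUG_RE = re.compile(r"BUG: (?P<kind>.+?)(?: for address|, address): (?P<addr>[0-9a-f]+)")
COMM_RE = re.compile(r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")
RIP_RE = re.compile(r"RIP: \d+:(?P<symbol>\S+)")


def summarize_oopses(lines):
    """Return one dict per 'BUG:' line, filled in with the first
    CPU/PID/Comm line and the first RIP line that follow it."""
    reports = []
    current = None
    for line in lines:
        m = BUG_RE.search(line)
        if m:
            # A new fault report starts here.
            current = {"kind": m["kind"], "address": m["addr"]}
            reports.append(current)
            continue
        if current is None:
            continue  # ignore everything before the first BUG: line
        m = COMM_RE.search(line)
        if m and "comm" not in current:
            current.update(cpu=int(m["cpu"]), pid=int(m["pid"]), comm=m["comm"])
            continue
        m = RIP_RE.search(line)
        if m and "rip" not in current:
            # Only the first RIP counts; the register dump is repeated later.
            current["rip"] = m["symbol"]
    return reports
```

To use it, save the kernel messages from the affected boot to a file (for example from `journalctl -k` over ssh), pass the lines to `summarize_oopses`, and compare the `comm` and `rip` values with the ones reported here (`chrome` / `_nv028963r` and `Xorg` / `_nv010150r`; note the symbol names are truncated in this report).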