nvidia-graphics-drivers-535 .113.01 crash

Bug #2038681 reported by Bert RAM Aerts
This bug affects 1 person
Affects                               Status  Importance  Assigned to  Milestone
nvidia-graphics-drivers-535 (Ubuntu)  New     Undecided   Unassigned

Bug Description

My laptop was running RealVNC Server, VMware Workstation Pro with a Windows 11 guest, Firefox and Thunderbird.
I was using it through RealVNC Viewer on another machine when the VNC connection suddenly dropped.
When I went to the NVIDIA laptop, it no longer showed the GNOME desktop, only a black screen.
I could still ssh into the laptop to trigger a reboot.
In journalctl I saw that the NVIDIA driver had crashed.
I have now seen this with both of these package versions:
535.113.01-0ubuntu0.23.04.1
535.113.01-0ubuntu0.23.04.3

Oct 06 17:01:57 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
Oct 06 17:01:57 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): recover...
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(GPU-0): Failed to initialize DMA.
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): Failed to allocate push buffer
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): Error recovery failed.
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): *** Aborting ***
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE)
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: Fatal server error:
Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) Failed to recover from error!
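
When looking for this failure in a longer journal, it can help to filter out only the NVIDIA "(EE)" lines. The following is a minimal sketch, assuming the journal lines have the same layout as the excerpt above; the names EE_LINE and nvidia_errors are hypothetical helpers, not part of any NVIDIA or systemd tool.

```python
import re

# Assumption: lines look like the excerpt above, i.e.
# "<Mon DD HH:MM:SS> <host> <unit>[<pid>]: (EE) <message>".
EE_LINE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<unit>\S+)\[(?P<pid>\d+)\]: \(EE\)\s*(?P<msg>.*)$"
)


def nvidia_errors(lines):
    """Return (timestamp, message) pairs for (EE) lines whose message starts with NVIDIA."""
    found = []
    for line in lines:
        m = EE_LINE.match(line.strip())
        if m and m.group("msg").startswith("NVIDIA"):
            found.append((m.group("ts"), m.group("msg")))
    return found


# Sample lines copied from the log excerpt in this report.
sample = [
    "Oct 06 17:01:57 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to",
    "Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE) NVIDIA(GPU-0): Failed to initialize DMA.",
    "Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: (EE)",
    "Oct 06 17:02:37 legion5ubuntu /usr/libexec/gdm-x-session[3110]: Fatal server error:",
]
errors = nvidia_errors(sample)
for ts, msg in errors:
    print(ts, msg)
```

The same function could be applied to the lines of a saved journal, for example the output of journalctl for the previous boot written to a file.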

ProblemType: Bug
DistroRelease: Ubuntu 23.04
Package: xorg 1:7.7+23ubuntu2
ProcVersionSignature: Ubuntu 6.2.0-34.34-generic 6.2.16
Uname: Linux 6.2.0-34-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
.proc.driver.nvidia.capabilities.gpu0: Error: path was not a regular file.
.proc.driver.nvidia.capabilities.mig: Error: path was not a regular file.
.proc.driver.nvidia.gpus.0000.01.00.0: Error: path was not a regular file.
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.suspend: suspend hibernate resume
.proc.driver.nvidia.suspend_depth: default modeset uvm
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.113.01 Tue Sep 12 19:41:24 UTC 2023
 GCC version: gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~23.04)
ApportVersion: 2.26.1-0ubuntu2
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CasperMD5CheckResult: pass
CompositorRunning: None
CurrentDesktop: GNOME
Date: Fri Oct 6 19:12:16 2023
DistUpgraded: 2023-04-14 13:11:36,442 DEBUG Running PostInstallScript: '/usr/lib/ubuntu-advantage/upgrade_lts_contract.py'
DistroCodename: lunar
DistroVariant: ubuntu
DkmsStatus:
 nvidia/535.113.01, 6.2.0-33-generic, x86_64: installed
 nvidia/535.113.01, 6.2.0-34-generic, x86_64: installed
ExtraDebuggingInterest: Yes
GraphicsCard:
 NVIDIA Corporation TU106M [GeForce RTX 2060 Mobile] [10de:1f15] (rev a1) (prog-if 00 [VGA controller])
   Subsystem: Lenovo TU106M [GeForce RTX 2060 Mobile] [17aa:3f8c]
InstallationDate: Installed on 2021-05-17 (871 days ago)
InstallationMedia: Ubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
MachineType: LENOVO 81Y6
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.2.0-34-generic root=UUID=998a3bca-88e2-4d44-ab68-6776750daa09 ro nvidia-drm.modeset=1 acpi_backlight=native nvidia.NVreg_RegistryDwords=EnableBrightnessControl=1
SourcePackage: xorg
Symptom: display
Title: Xorg crash
UpgradeStatus: Upgraded to lunar on 2023-04-14 (175 days ago)
dmi.bios.date: 06/09/2023
dmi.bios.release: 1.59
dmi.bios.vendor: LENOVO
dmi.bios.version: EFCN59WW
dmi.board.asset.tag: NO Asset Tag
dmi.board.name: LNVNB161216
dmi.board.vendor: LENOVO
dmi.board.version: SDK0R32862 WIN
dmi.chassis.asset.tag: NO Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Lenovo Legion 5 15IMH05H
dmi.ec.firmware.release: 1.59
dmi.modalias: dmi:bvnLENOVO:bvrEFCN59WW:bd06/09/2023:br1.59:efr1.59:svnLENOVO:pn81Y6:pvrLenovoLegion515IMH05H:rvnLENOVO:rnLNVNB161216:rvrSDK0R32862WIN:cvnLENOVO:ct10:cvrLenovoLegion515IMH05H:skuLENOVO_MT_81Y6_BU_idea_FM_Legion515IMH05H:
dmi.product.family: Legion 5 15IMH05H
dmi.product.name: 81Y6
dmi.product.sku: LENOVO_MT_81Y6_BU_idea_FM_Legion 5 15IMH05H
dmi.product.version: Lenovo Legion 5 15IMH05H
dmi.sys.vendor: LENOVO
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.114-1
version.libgl1-mesa-dri: libgl1-mesa-dri 23.0.4-0ubuntu1~23.04.1
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.nvidia-graphics-drivers: nvidia-graphics-drivers-* N/A
version.xserver-xorg-core: xserver-xorg-core 2:21.1.7-1ubuntu3
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-3
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20210115-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.17-2build1
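
The loaded kernel module version appears in the .proc.driver.nvidia.version field above (which apport collects from /proc/driver/nvidia/version). A minimal sketch for extracting it, assuming the "NVRM version:" line keeps the layout shown in this report; nvrm_version is a hypothetical helper name.

```python
import re


def nvrm_version(text):
    """Extract the driver version from the 'NVRM version:' line, or return None."""
    # Assumption: the version follows the words "Kernel Module", as in this report.
    m = re.search(r"NVRM version:.*?Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return m.group(1) if m else None


# Sample text copied from the apport data in this report.
sample = (
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.113.01 Tue Sep 12 19:41:24 UTC 2023\n"
    "GCC version: gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~23.04)\n"
)
version = nvrm_version(sample)
print(version)
```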

Bert RAM Aerts (bert.ram.aerts) wrote :
affects: xorg (Ubuntu) → nvidia-graphics-drivers-535 (Ubuntu)
summary: - nvidia-driver-535.113.01 crash
+ nvidia-graphics-drivers-535 .113.01 crash

Daniel van Vugt (vanvugt) wrote :

Can you provide the full log from when it happened?

Bert RAM Aerts (bert.ram.aerts) wrote :

When it happened, I rebooted my laptop via ssh from another machine and filed the bug report several hours later.
nvidia-bug-report.log.gz (1.2 MiB, application/x-gzip) is already attached and contains the error I quoted in the description.
I am not sure what you mean by "full log from when it happened"?

Yesterday I upgraded from Ubuntu 23.04 to the latest Ubuntu 23.10, which is also running 535.113.01-0ubuntu0.23.04.3.

The other machine [2], where I run RealVNC Viewer, has a large external screen with a resolution of 2560 x 1440. My NVIDIA laptop [1] has a screen with a resolution of 1920 x 1080. My impression is that when I make the VNC session to [1] full screen on the large screen of [2], the crash happens soon afterwards, whereas it does not happen when I stay out of full screen.

Bert RAM Aerts (bert.ram.aerts) wrote :

The NVIDIA driver has now also crashed on Ubuntu 23.10 with 535.113.01-0ubuntu0.23.04.3.
Again this was with RealVNC Server (latest version) on Ubuntu and RealVNC Viewer (latest version) on another machine. This time the viewer was using a 1920 x 1080 resolution, so it was not in full screen mode on the large 2560 x 1440 screen.
This time I pressed CTRL-ALT-F6 and ran nvidia-bug-report.sh; the output is attached in the next comment.

Bert RAM Aerts (bert.ram.aerts) wrote :

[attachment only: nvidia-bug-report.sh output]

Bert RAM Aerts (bert.ram.aerts) wrote :

I have to admit that I only started using my laptop this way on 27/09/2023.
I bought a Sonos Era 300 speaker and need a Windows application on the local network to control it.
So I run VMware Workstation Pro 17 on the Ubuntu host with a Windows 11 guest running the Sonos S2 application; the laptop is on the same network as the speaker.
I also run RealVNC Server on this Ubuntu laptop and RealVNC Viewer on my Windows 10 work laptop, which uses a VPN and is therefore not on the local network.
Summary: I have no idea whether previous NVIDIA drivers would crash in the same way as 535.113.01, as I never tested this.

Abhishek Chauhan (abchauhan) wrote :

Hi Bert,

Thank you for sharing the bug reports. We are unable to reproduce this on our NVIDIA test systems.
Do you continue to see crashes on the latest 535 and 550 series drivers - 535.154.05, 550.40.07?
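
To check whether an installed driver is at least one of the versions named above, the dotted version strings can be compared numerically rather than as text. A minimal sketch, assuming versions consist only of dot-separated integers; parse_version and is_at_least are hypothetical helper names.

```python
def parse_version(version):
    """Turn a dotted version such as '535.113.01' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_at_least(installed, minimum):
    """Return True if 'installed' is the same as or newer than 'minimum'."""
    return parse_version(installed) >= parse_version(minimum)


# Versions taken from this report.
reported = "535.113.01"
for candidate in ("535.154.05", "550.40.07"):
    print(reported, ">=", candidate, "?", is_at_least(reported, candidate))
```

Comparing tuples of integers avoids the pitfall of plain string comparison, where "535.29" would sort after "535.154".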

Bert RAM Aerts (bert.ram.aerts) wrote :

In the meantime I have switched my Ubuntu 23.10 laptop from GNOME to XFCE.
Since this change, I have not been able to reproduce this bug.
I am currently running NVIDIA 545.29.06 without issues.
So I guess you can close this bug report?
