nvidia failure to resume from suspend; kernel fallback broken

Bug #2065076 reported by Jurgen Schellaert
This bug affects 2 people
Affects                               Status     Importance  Assigned to  Milestone
linux (Ubuntu)                        Invalid    Undecided   Unassigned
nvidia-graphics-drivers-550 (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

I am unable to resume from suspend. The system comes back on but the screen is black. Switching to a virtual console to look for a solution is impossible as the keyboard appears to be dead.

The issue is far from new and goes as far back (for me) as 22.04. Until 23.10, however, it could be worked around by disabling the systemd suspend mechanism (disabling nvidia-suspend, nvidia-hibernate and nvidia-resume). The system would then use the kernel driver fallback instead, which worked fine. Until now.
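For anyone trying to reproduce this, it may help to record which of the three units are actually enabled before suspending. The following is only a rough sketch: it assumes the units are named nvidia-suspend.service, nvidia-hibernate.service and nvidia-resume.service (the ".service" suffix is an assumption, the report only gives the short names) and that `systemctl is-enabled` is available. The helper names `unit_state` and `report_states` are made up for illustration.

```python
import subprocess

# Unit names taken from the report; the ".service" suffix is an assumption.
NVIDIA_SLEEP_UNITS = (
    "nvidia-suspend.service",
    "nvidia-hibernate.service",
    "nvidia-resume.service",
)


def unit_state(unit, run=subprocess.run):
    """Return what `systemctl is-enabled <unit>` prints, or "unknown".

    `run` is injectable so the function can be exercised without systemd.
    """
    try:
        result = run(
            ["systemctl", "is-enabled", unit],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit is expected for disabled units
        )
    except OSError:
        # systemctl is not installed or cannot be executed
        return "unknown"
    state = result.stdout.strip()
    return state or "unknown"


def report_states(units=NVIDIA_SLEEP_UNITS, run=subprocess.run):
    """Map each unit name to its reported enablement state."""
    return {unit: unit_state(unit, run) for unit in units}


if __name__ == "__main__":
    for unit, state in report_states().items():
        print(f"{unit}: {state}")
```

Attaching this output to a comment would show whether the workaround described above is still in effect on the affected machine.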

I have tried nvidia drivers 535 and 550. Neither worked. I have enabled the graphics-drivers ppa to test the latest available driver, which is a newer version of the 550 driver included in the repositories. No luck.

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: linux-generic (not installed)
ProcVersionSignature: Ubuntu 6.8.0-31.31-generic 6.8.1
Uname: Linux 6.8.0-31-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.28.1-0ubuntu2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/seq: codey 2877 F.... pipewire
 /dev/snd/controlC0: codey 2881 F.... wireplumber
 /dev/snd/controlC1: codey 2881 F.... wireplumber
CRDA: N/A
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Tue May 7 16:38:34 2024
InstallationDate: Installed on 2022-03-27 (772 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Alpha amd64 (20220326)
MachineType: System manufacturer System Product Name
ProcFB: 0 simpledrmdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.8.0-31-generic root=UUID=3e125bf2-3409-4f46-8a06-bba9a5fa9032 ro
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-6.8.0-31-generic N/A
 linux-backports-modules-6.8.0-31-generic N/A
 linux-firmware 20240318.git3b128b60-0ubuntu2
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/13/2023
dmi.bios.release: 5.17
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 6063
dmi.board.asset.tag: Default string
dmi.board.name: PRIME X470-PRO
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr6063:bd03/13/2023:br5.17:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEX470-PRO:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:skuSKU:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Jurgen Schellaert (jurgen-schellaert-j) wrote :

OK, this looks (for now) fixed. I have gone back to 535 and now resume is working again.

What has made the difference, I guess, is that installing 535 has restored /lib/systemd/system-sleep/nvidia. Deleting that file used to be part of the procedure to disable the systemd suspend/resume mechanism in favour of the kernel driver fallback. It looks like this is no longer the case and that the file needs to be left alone.
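A quick way to check whether that file is currently in place is sketched below. This is only an illustration: the path is the one quoted above, and the helper name `describe_hook` is hypothetical. Whether the hook also needs to be executable is an assumption here; the check is included only because it is cheap to report.

```python
import os
from pathlib import Path

# Path as quoted in the comment above.
HOOK = Path("/lib/systemd/system-sleep/nvidia")


def describe_hook(path=HOOK):
    """Describe whether the sleep hook exists and is executable."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    if os.access(path, os.X_OK):
        return "present, executable"
    return "present, not executable"


if __name__ == "__main__":
    print(f"{HOOK}: {describe_hook()}")
```

If this prints "missing" after a driver reinstall, the file was probably removed again by an earlier workaround rather than by the package.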

Kleber Sacilotto de Souza (kleber-souza) wrote :

The file '/lib/systemd/system-sleep/nvidia' is provided by the nvidia-graphics-drivers-* packages and not by linux. Assigning the issue to nvidia-graphics-drivers-550 for now.

Changed in linux (Ubuntu):
status: New → Invalid

Jurgen Schellaert (jurgen-schellaert-j) wrote :

I posted back too soon. The problem has just surfaced again. When I tried to resume my system just now, the screen would not turn on again. As pointed out before, the mouse and keyboard appear to be dead when that happens. The CPU fans also run at full speed and do not stop until I reboot.

Could this "regression" (if that's the proper term...) be caused by the gnome-shell updates I ran today?

Jurgen Schellaert (jurgen-schellaert-j) wrote :

Failure to resume is happening systematically now. I have tried four times to suspend/resume since I posted my previous message. All four attempts failed.

It does seem like this was triggered by the updates I installed the other day.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-550 (Ubuntu):
status: New → Confirmed

dhenry (tfc-duke) wrote :

I have the same issue with a GTX 980M. Resume works sometimes, but often fails as described (> 50% failure).
I'm currently running the 23.10 kernel to avoid this issue.

Note that with previous Ubuntu versions (like 23.10), this issue (or at least a similar one) could happen, but very rarely (I got it maybe 2-3 times over a year).

It is as if some race condition were triggering the issue, one that was rarely hit with older kernels but is hit very often with recent ones.

I tried without nvidia proprietary driver (using nouveau): no resume freeze. However, the nouveau support has other issues like not being able to control brightness (and brightness is set to 1% after resume).
