Resume fails after suspend with NVIDIA driver on Ubuntu 22.04

Bug #1970088 reported by dhenry
This bug affects 12 people
Affects: nvidia-graphics-drivers-510 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

After upgrading from Ubuntu 21.10 to 22.04, suspend/resume stopped working on my laptop, which has an NVIDIA discrete card.

The computer goes into sleep mode, but on resume the screen remains black.

The issue happens with both NVIDIA driver 470 and 510. Previously I was running 470, because I use MATE and there was a bug with Xorg/MATE/NVIDIA 510, so I don't know how driver 510 behaves on Ubuntu 21.10.

I could not try with an older kernel on Ubuntu 22.04, because DKMS refused to build the driver for older kernels.

The dmesg/kern.log contains a backtrace:

Apr 24 11:26:16 thebat kernel: [ 70.354793] ------------[ cut here ]------------
Apr 24 11:26:16 thebat kernel: [ 70.354795] WARNING: CPU: 2 PID: 4127 at /var/lib/dkms/nvidia/510.60.02/build/nvidia/nv.c:3935 nv_restore_user_channels+0xce/0xe0 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.354974] Modules linked in: vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nvme_fabrics rfcomm ccm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT xt_tcpudp nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp nft_counter nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib bridge stp llc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common intel_tcc_cooling uvcvideo videobuf2_vmalloc videobuf2_memops x86_pkg_temp_thermal binfmt_misc intel_powerclamp snd_hda_codec_hdmi coretemp videobuf2_v4l2 clevo_xsm_wmi(OE) snd_hda_codec_realtek snd_hda_codec_generic videobuf2_common kvm_intel ledtrig_audio kvm videodev nls_iso8859_1 btusb rapl iwlmvm snd_hda_intel mc btrtl intel_cstate btbcm snd_seq_midi snd_intel_dspcfg snd_seq_midi_event btintel mac80211 snd_intel_sdw_acpi snd_rawmidi bluetooth ecdh_generic ecc libarc4 joydev
Apr 24 11:26:16 thebat kernel: [ 70.355006] input_leds snd_hda_codec snd_seq snd_hda_core iwlwifi snd_hwdep serio_raw intel_wmi_thunderbolt efi_pstore mxm_wmi wmi_bmof ee1004 snd_pcm snd_seq_device cfg80211 snd_timer snd mei_me soundcore mei intel_pch_thermal mac_hid acpi_pad nvidia_uvm(POE) sch_fq_codel ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp parport ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crct10dif_pclmul rc_core crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd drm psmouse i2c_i801 nvme i2c_smbus sdhci_pci ahci xhci_pci alx cqhci xhci_pci_renesas nvme_core sdhci mdio libahci wmi video
Apr 24 11:26:16 thebat kernel: [ 70.355040] CPU: 2 PID: 4127 Comm: nvidia-sleep.sh Tainted: P OE 5.15.0-25-generic #25-Ubuntu
Apr 24 11:26:16 thebat kernel: [ 70.355042] Hardware name: Notebook P7xxDM(-G) /P775DM(-G) , BIOS 1.05.09 12/28/2015
Apr 24 11:26:16 thebat kernel: [ 70.355043] RIP: 0010:nv_restore_user_channels+0xce/0xe0 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355158] Code: b2 b3 fa be 01 00 00 00 4c 89 ef e8 ec b3 00 00 48 89 df e8 14 b2 b3 fa ba 02 00 00 00 4c 89 ee 4c 89 e7 e8 04 61 94 00 eb 93 <0f> 0b eb c6 41 be 51 00 00 00 eb 9e 66 0f 1f 44 00 00 0f 1f 44 00
Apr 24 11:26:16 thebat kernel: [ 70.355159] RSP: 0018:ffffb5af83a5bd48 EFLAGS: 00010206
Apr 24 11:26:16 thebat kernel: [ 70.355160] RAX: 0000000000000003 RBX: ffff8c4d1ec00800 RCX: ffffb5af83a5bce0
Apr 24 11:26:16 thebat kernel: [ 70.355161] RDX: 0000000000000087 RSI: 0000000000000246 RDI: ffff8c4d00ee1068
Apr 24 11:26:16 thebat kernel: [ 70.355162] RBP: ffffb5af83a5bd70 R08: 0000000000000000 R09: ffff8c54a65b1040
Apr 24 11:26:16 thebat kernel: [ 70.355163] R10: ffff8c4d03d04660 R11: 0000000000000000 R12: ffff8c4d1146b000
Apr 24 11:26:16 thebat kernel: [ 70.355164] R13: ffff8c4d1ec00800 R14: 0000000000000003 R15: ffff8c4d1ec00d10
Apr 24 11:26:16 thebat kernel: [ 70.355165] FS: 00007f14620a8740(0000) GS:ffff8c54a6480000(0000) knlGS:0000000000000000
Apr 24 11:26:16 thebat kernel: [ 70.355166] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 11:26:16 thebat kernel: [ 70.355167] CR2: 000055bb914470c8 CR3: 000000010ee62003 CR4: 00000000003706e0
Apr 24 11:26:16 thebat kernel: [ 70.355168] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 24 11:26:16 thebat kernel: [ 70.355168] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 24 11:26:16 thebat kernel: [ 70.355169] Call Trace:
Apr 24 11:26:16 thebat kernel: [ 70.355170] <TASK>
Apr 24 11:26:16 thebat kernel: [ 70.355172] nv_set_system_power_state+0x22b/0x3e0 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355286] nv_procfs_write_suspend+0x100/0x180 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355401] proc_reg_write+0x5a/0x90
Apr 24 11:26:16 thebat kernel: [ 70.355404] ? __cond_resched+0x1a/0x50
Apr 24 11:26:16 thebat kernel: [ 70.355407] vfs_write+0xc3/0x260
Apr 24 11:26:16 thebat kernel: [ 70.355410] ksys_write+0x67/0xe0
Apr 24 11:26:16 thebat kernel: [ 70.355411] __x64_sys_write+0x19/0x20
Apr 24 11:26:16 thebat kernel: [ 70.355412] do_syscall_64+0x5c/0xc0
Apr 24 11:26:16 thebat kernel: [ 70.355414] ? do_user_addr_fault+0x1e3/0x670
Apr 24 11:26:16 thebat kernel: [ 70.355417] ? exit_to_user_mode_prepare+0x37/0xb0
Apr 24 11:26:16 thebat kernel: [ 70.355419] ? irqentry_exit_to_user_mode+0x9/0x20
Apr 24 11:26:16 thebat kernel: [ 70.355420] ? irqentry_exit+0x19/0x30
Apr 24 11:26:16 thebat kernel: [ 70.355421] ? exc_page_fault+0x89/0x160
Apr 24 11:26:16 thebat kernel: [ 70.355422] ? asm_exc_page_fault+0x8/0x30
Apr 24 11:26:16 thebat kernel: [ 70.355424] entry_SYSCALL_64_after_hwframe+0x44/0xae
Apr 24 11:26:16 thebat kernel: [ 70.355426] RIP: 0033:0x7f14621bfa37
Apr 24 11:26:16 thebat kernel: [ 70.355427] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Apr 24 11:26:16 thebat kernel: [ 70.355428] RSP: 002b:00007fff93dcc848 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Apr 24 11:26:16 thebat kernel: [ 70.355430] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f14621bfa37
Apr 24 11:26:16 thebat kernel: [ 70.355430] RDX: 0000000000000007 RSI: 0000555dafdaed00 RDI: 0000000000000001
Apr 24 11:26:16 thebat kernel: [ 70.355431] RBP: 0000555dafdaed00 R08: 0000000000000000 R09: 0000555dafdaed00
Apr 24 11:26:16 thebat kernel: [ 70.355432] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000007
Apr 24 11:26:16 thebat kernel: [ 70.355433] R13: 00007f14622c5780 R14: 00007f14622c1600 R15: 00007f14622c0a00
Apr 24 11:26:16 thebat kernel: [ 70.355435] </TASK>
Apr 24 11:26:16 thebat kernel: [ 70.355435] ---[ end trace ce9942c23cb7434d ]---
Apr 24 11:26:16 thebat kernel: [ 70.355466] ------------[ cut here ]------------
Apr 24 11:26:16 thebat kernel: [ 70.355467] WARNING: CPU: 6 PID: 4127 at /var/lib/dkms/nvidia/510.60.02/build/nvidia/nv.c:4152 nv_set_system_power_state+0x2d0/0x3e0 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355593] Modules linked in: vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nvme_fabrics rfcomm ccm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT xt_tcpudp nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp nft_counter nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib bridge stp llc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common intel_tcc_cooling uvcvideo videobuf2_vmalloc videobuf2_memops x86_pkg_temp_thermal binfmt_misc intel_powerclamp snd_hda_codec_hdmi coretemp videobuf2_v4l2 clevo_xsm_wmi(OE) snd_hda_codec_realtek snd_hda_codec_generic videobuf2_common kvm_intel ledtrig_audio kvm videodev nls_iso8859_1 btusb rapl iwlmvm snd_hda_intel mc btrtl intel_cstate btbcm snd_seq_midi snd_intel_dspcfg snd_seq_midi_event btintel mac80211 snd_intel_sdw_acpi snd_rawmidi bluetooth ecdh_generic ecc libarc4 joydev
Apr 24 11:26:16 thebat kernel: [ 70.355620] input_leds snd_hda_codec snd_seq snd_hda_core iwlwifi snd_hwdep serio_raw intel_wmi_thunderbolt efi_pstore mxm_wmi wmi_bmof ee1004 snd_pcm snd_seq_device cfg80211 snd_timer snd mei_me soundcore mei intel_pch_thermal mac_hid acpi_pad nvidia_uvm(POE) sch_fq_codel ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp parport ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crct10dif_pclmul rc_core crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd drm psmouse i2c_i801 nvme i2c_smbus sdhci_pci ahci xhci_pci alx cqhci xhci_pci_renesas nvme_core sdhci mdio libahci wmi video
Apr 24 11:26:16 thebat kernel: [ 70.355646] CPU: 6 PID: 4127 Comm: nvidia-sleep.sh Tainted: P W OE 5.15.0-25-generic #25-Ubuntu
Apr 24 11:26:16 thebat kernel: [ 70.355648] Hardware name: Notebook P7xxDM(-G) /P775DM(-G) , BIOS 1.05.09 12/28/2015
Apr 24 11:26:16 thebat kernel: [ 70.355648] RIP: 0010:nv_set_system_power_state+0x2d0/0x3e0 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355762] Code: ff ff 41 83 fd 02 74 e9 49 8b 84 24 70 02 00 00 ba 02 00 00 00 48 8b 70 78 48 8b 78 60 e8 98 cf ff ff 85 c0 74 cb 0f 0b eb c7 <0f> 0b e9 5c ff ff ff 48 c7 c7 10 ea 95 c2 e8 2d 7f b3 fa e8 08 48
Apr 24 11:26:16 thebat kernel: [ 70.355763] RSP: 0018:ffffb5af83a5bd80 EFLAGS: 00010206
Apr 24 11:26:16 thebat kernel: [ 70.355764] RAX: 0000000000000003 RBX: 0000000000000002 RCX: ffff8c4d00ee1cc0
Apr 24 11:26:16 thebat kernel: [ 70.355764] RDX: 0000000080020002 RSI: ffffffffc05f18e8 RDI: ffff8c4d204e3800
Apr 24 11:26:16 thebat kernel: [ 70.355765] RBP: ffffb5af83a5bdb0 R08: 0000000000000001 R09: 0000000000000000
Apr 24 11:26:16 thebat kernel: [ 70.355766] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8c4d1ec00800
Apr 24 11:26:16 thebat kernel: [ 70.355767] R13: 0000000000000000 R14: 0000555dafdaed00 R15: ffffb5af83a5be58
Apr 24 11:26:16 thebat kernel: [ 70.355767] FS: 00007f14620a8740(0000) GS:ffff8c54a6580000(0000) knlGS:0000000000000000
Apr 24 11:26:16 thebat kernel: [ 70.355768] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 11:26:16 thebat kernel: [ 70.355769] CR2: 0000562ef2ef45e8 CR3: 000000010ee62006 CR4: 00000000003706e0
Apr 24 11:26:16 thebat kernel: [ 70.355770] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 24 11:26:16 thebat kernel: [ 70.355771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 24 11:26:16 thebat kernel: [ 70.355771] Call Trace:
Apr 24 11:26:16 thebat kernel: [ 70.355772] <TASK>
Apr 24 11:26:16 thebat kernel: [ 70.355773] nv_procfs_write_suspend+0x100/0x180 [nvidia]
Apr 24 11:26:16 thebat kernel: [ 70.355888] proc_reg_write+0x5a/0x90
Apr 24 11:26:16 thebat kernel: [ 70.355890] ? __cond_resched+0x1a/0x50
Apr 24 11:26:16 thebat kernel: [ 70.355893] vfs_write+0xc3/0x260
Apr 24 11:26:16 thebat kernel: [ 70.355895] ksys_write+0x67/0xe0
Apr 24 11:26:16 thebat kernel: [ 70.355896] __x64_sys_write+0x19/0x20
Apr 24 11:26:16 thebat kernel: [ 70.355897] do_syscall_64+0x5c/0xc0
Apr 24 11:26:16 thebat kernel: [ 70.355899] ? do_user_addr_fault+0x1e3/0x670
Apr 24 11:26:16 thebat kernel: [ 70.355901] ? exit_to_user_mode_prepare+0x37/0xb0
Apr 24 11:26:16 thebat kernel: [ 70.355902] ? irqentry_exit_to_user_mode+0x9/0x20
Apr 24 11:26:16 thebat kernel: [ 70.355904] ? irqentry_exit+0x19/0x30
Apr 24 11:26:16 thebat kernel: [ 70.355905] ? exc_page_fault+0x89/0x160
Apr 24 11:26:16 thebat kernel: [ 70.355906] ? asm_exc_page_fault+0x8/0x30
Apr 24 11:26:16 thebat kernel: [ 70.355907] entry_SYSCALL_64_after_hwframe+0x44/0xae
Apr 24 11:26:16 thebat kernel: [ 70.355909] RIP: 0033:0x7f14621bfa37
Apr 24 11:26:16 thebat kernel: [ 70.355910] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Apr 24 11:26:16 thebat kernel: [ 70.355911] RSP: 002b:00007fff93dcc848 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Apr 24 11:26:16 thebat kernel: [ 70.355912] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f14621bfa37
Apr 24 11:26:16 thebat kernel: [ 70.355913] RDX: 0000000000000007 RSI: 0000555dafdaed00 RDI: 0000000000000001
Apr 24 11:26:16 thebat kernel: [ 70.355913] RBP: 0000555dafdaed00 R08: 0000000000000000 R09: 0000555dafdaed00
Apr 24 11:26:16 thebat kernel: [ 70.355914] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000007
Apr 24 11:26:16 thebat kernel: [ 70.355915] R13: 00007f14622c5780 R14: 00007f14622c1600 R15: 00007f14622c0a00
Apr 24 11:26:16 thebat kernel: [ 70.355916] </TASK>
Apr 24 11:26:16 thebat kernel: [ 70.355917] ---[ end trace ce9942c23cb7434e ]---
Apr 24 11:26:19 thebat kernel: [ 73.358424] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Apr 24 11:26:21 thebat kernel: [ 75.613255] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000957d:0:0:407
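
To pick the relevant lines out of a long log like the one above, a small filter may help. This is only a sketch: the function name `nvidia_log_summary` is illustrative, and the default path `/var/log/kern.log` is an assumption that may not match every system.

```shell
# Print only the NVIDIA-related WARNING/ERROR lines from a kernel log.
# Assumption: the log lives at /var/log/kern.log unless a path is passed in.
nvidia_log_summary() {
    log_file="${1:-/var/log/kern.log}"
    # First keep lines that mention nvidia, then keep only warnings and errors.
    grep -i 'nvidia' "$log_file" | grep -E 'WARNING|ERROR'
}
```

Calling `nvidia_log_summary /var/log/kern.log` should list the two `WARNING: CPU:` lines and the two `nvidia-modeset` messages while skipping the register dumps and module lists.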

dhenry (tfc-duke) wrote :

A workaround I found is to disable the nvidia-hibernate, nvidia-resume and nvidia-suspend services:

systemctl disable nvidia-hibernate.service nvidia-resume.service nvidia-suspend.service

Suspend/resume works again after that.
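
To see which of these units are currently enabled, before or after applying the workaround, a helper along these lines could be used. It is a rough sketch: the function name and the `SYSTEMCTL` override variable are hypothetical (the override exists only so the helper can be tried without systemd), and the unit names are the three from the command above.

```shell
# Report the enablement state of the three NVIDIA power-management units.
# SYSTEMCTL is a hypothetical override hook; it defaults to the real systemctl.
nvidia_sleep_units_state() {
    for unit in nvidia-hibernate.service nvidia-resume.service nvidia-suspend.service; do
        # is-enabled exits non-zero for disabled units, so ignore its status.
        state=$("${SYSTEMCTL:-systemctl}" is-enabled "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}
```

To undo the workaround, the same three units can be passed to `systemctl enable` instead of `systemctl disable`.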

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-510 (Ubuntu):
status: New → Confirmed
Stephen (belrik) wrote :

Seeing a bug with Nvidia 510 since upgrading to 22.04. Did a clean install to confirm. Will test the suggested fix of disabling nvidia services.

Stephen (belrik) wrote :

Disabling these services caused the attached displays to re-initialise from powersaving mode, but the screen remains black. If I switch to text console I have a flashing cursor but nothing else.

Juno Computers (junocomp) wrote :

I also have the same issue. The only way in is through ssh and restarting gdm. But this logs me out of the previous session.

ssherlock (simon-sherlock) wrote :

Is this still happening for people? Suspend/resume started working for me after a recent software update.

Stephen (belrik) wrote :

The last couple of attempts have worked, but I am hesitant to dismiss the bug because previous NVIDIA suspend/resume bugs have not occurred 100% of the time.

ManOnTheMoon (manonthemoon) wrote :

I am still having problems going into suspend when using the NVIDIA 510 driver under X11. The weird thing is that it was OK a couple of days ago. Now I have to revert to the 390 driver for suspend to work.

dhenry (tfc-duke) wrote :

No fix for me.

I installed all updates, re-enabled the suspend services, rebooted and tried a suspend/resume, but it still failed to resume.

After a hard reboot, I disabled the suspend services, and suspend/resume works again for me.

dhenry (tfc-duke) wrote :

As the author of this bug report, I marked it as a duplicate of bug #1946303, because the symptoms are the same and the same workaround is reported to work there.
