[modeset][nvidia] suspend/resume broken in nvidia-460 : Display engine push buffer channel allocation failed

Bug #1911055 reported by Chris Bainbridge
This bug affects 50 people
Affects                               Status     Importance  Assigned to  Milestone
nvidia-graphics-drivers-460 (Ubuntu)  Confirmed  Undecided   Unassigned
nvidia-graphics-drivers-470 (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

After a recent update to nvidia-460, suspend then resume results in a black screen, then a hang for around 120 seconds, and then the text:

nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed: 0x65 (Call timed out [NV_ERR_TIMEOUT])
nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed: 0x65 (Call timed out [NV_ERR_TIMEOUT])
nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer

dmesg shows:

[ 188.352670] ACPI: Waking up from system sleep state S3
...
[ 309.142164] nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed: 0x65 (Call timed out [NV_ERR_TIMEOUT])
[ 309.142319] nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
[ 313.142165] nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed: 0x65 (Call timed out [NV_ERR_TIMEOUT])
[ 313.142348] nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
[ 313.151885] acpi LNXPOWER:08: Turning OFF
[ 313.151898] acpi LNXPOWER:04: Turning OFF
[ 313.152351] acpi LNXPOWER:03: Turning OFF
[ 313.153064] acpi LNXPOWER:02: Turning OFF
...
[ 325.010192] nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DELL 2209WA (HDMI-0)
[ 325.010577] BUG: kernel NULL pointer dereference, address: 0000000000000050
[ 325.010579] #PF: supervisor read access in kernel mode
[ 325.010580] #PF: error_code(0x0000) - not-present page
[ 325.010580] PGD 0 P4D 0
[ 325.010582] Oops: 0000 [#1] SMP PTI
[ 325.010583] CPU: 9 PID: 2247 Comm: Xorg Tainted: P O 5.4.0-60-generic #67-Ubuntu
[ 325.010584] Hardware name: Razer Blade/DANA_MB, BIOS 01.01 08/31/2018
[ 325.010595] RIP: 0010:_nv000112kms+0xd/0x30 [nvidia_modeset]
[ 325.010596] Code: 16 00 74 06 83 7e 0c 02 77 03 31 c0 c3 c7 46 0c 01 00 00 00 b8 01 00 00 00 c3 0f 1f 00 0f b7 46 04 0f b7 56 08 39 d0 0f 47 c2 <3b> 47 18 77 06 83 7e 10 02 77 08 31 c0 c3 0f 1f 44 00 00 c7 46 10
[ 325.010597] RSP: 0018:ffffb89542e5b8a8 EFLAGS: 00010246
[ 325.010598] RAX: 0000000000000280 RBX: 0000000000000038 RCX: 00000000000003ff
[ 325.010598] RDX: 0000000000000280 RSI: ffffb89542e5bba0 RDI: 0000000000000038
[ 325.010599] RBP: ffffb89542e5b9c8 R08: 0000000000000000 R09: ffffffffc379e440
[ 325.010600] R10: 0000000000000000 R11: 00000000ffffffff R12: ffffb89542e5b948
[ 325.010600] R13: ffffb89542e5b948 R14: ffffb89542e5b930 R15: ffffffffc3a50e60
[ 325.010601] FS: 00007feb4c09ea80(0000) GS:ffff8e46ada40000(0000) knlGS:0000000000000000
[ 325.010602] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 325.010602] CR2: 0000000000000050 CR3: 00000004772e2006 CR4: 00000000003606e0
[ 325.010603] Call Trace:
[ 325.010610] ? _nv002768kms+0x96/0xd0 [nvidia_modeset]
[ 325.010616] ? _nv002328kms+0xb5/0x110 [nvidia_modeset]
[ 325.010624] ? _nv000742kms+0x168/0x370 [nvidia_modeset]
[ 325.010823] ? _nv036002rm+0x62/0x70 [nvidia]
[ 325.010910] ? os_get_current_tick+0x2c/0x50 [nvidia]
[ 325.010921] ? _nv002771kms+0x433/0x600 [nvidia_modeset]
[ 325.010928] ? _nv002771kms+0x3fc/0x600 [nvidia_modeset]
[ 325.010930] ? __schedule+0x2eb/0x740
[ 325.010935] ? _nv000742kms+0x40/0x40 [nvidia_modeset]
[ 325.010939] ? nvkms_alloc+0x6a/0xa0 [nvidia_modeset]
[ 325.010944] ? _nv000742kms+0x40/0x40 [nvidia_modeset]
[ 325.010948] ? _nv000744kms+0x2a/0x40 [nvidia_modeset]
[ 325.010952] ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
[ 325.010957] ? nvkms_ioctl_common+0x42/0x80 [nvidia_modeset]
[ 325.010961] ? nvkms_ioctl+0xc4/0x100 [nvidia_modeset]
[ 325.011016] ? nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
[ 325.011017] ? do_vfs_ioctl+0x407/0x670
[ 325.011019] ? do_user_addr_fault+0x216/0x450
[ 325.011020] ? ksys_ioctl+0x67/0x90
[ 325.011021] ? __x64_sys_ioctl+0x1a/0x20
[ 325.011023] ? do_syscall_64+0x57/0x190
[ 325.011024] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 325.011025] Modules linked in: rfcomm ccm aufs overlay cmac algif_hash algif_skcipher af_alg bnep snd_sof_pci snd_sof_intel_hda_common nvidia_uvm(O) snd_soc_hdac_hda snd_sof_intel_hda snd_sof_intel_byt snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_hda_ext_core snd_soc_acpi_intel_match nvidia_drm(PO) snd_soc_acpi nvidia_modeset(PO) snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine nls_iso8859_1 intel_rapl_msr mei_hdcp intel_rapl_common nvidia(PO) x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_usb_audio snd_hda_intel snd_usbmidi_lib iwlmvm snd_intel_dspcfg snd_hda_codec snd_seq_midi kvm_intel snd_seq_midi_event snd_hda_core snd_hwdep mac80211 kvm snd_rawmidi libarc4 snd_pcm rapl intel_cstate snd_seq mxm_wmi dcdbas intel_wmi_thunderbolt wmi_bmof dell_wmi_descriptor snd_seq_device 8250_dw snd_timer iwlwifi uvcvideo btusb videobuf2_vmalloc btrtl videobuf2_memops btbcm btintel videobuf2_v4l2 cfg80211 bluetooth
[ 325.011042] videobuf2_common joydev videodev input_leds ecdh_generic mei_me snd mc ecc hid_multitouch mei soundcore intel_pch_thermal mac_hid acpi_pad sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_compress dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_microsoft ff_memless usbhid hid_generic i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel drm_kms_helper aesni_intel syscopyarea sysfillrect sysimgblt nvme crypto_simd fb_sys_fops cryptd glue_helper nvme_core thunderbolt drm i2c_i801 intel_lpss_pci intel_lpss r8169 ahci idma64 i2c_hid realtek libahci virt_dma hid wmi video pinctrl_cannonlake pinctrl_intel
[ 325.011059] CR2: 0000000000000050
[ 325.011060] ---[ end trace 95cdb20056e1c4ac ]---
[ 325.011066] RIP: 0010:_nv000112kms+0xd/0x30 [nvidia_modeset]
[ 325.011067] Code: 16 00 74 06 83 7e 0c 02 77 03 31 c0 c3 c7 46 0c 01 00 00 00 b8 01 00 00 00 c3 0f 1f 00 0f b7 46 04 0f b7 56 08 39 d0 0f 47 c2 <3b> 47 18 77 06 83 7e 10 02 77 08 31 c0 c3 0f 1f 44 00 00 c7 46 10
[ 325.011067] RSP: 0018:ffffb89542e5b8a8 EFLAGS: 00010246
[ 325.011068] RAX: 0000000000000280 RBX: 0000000000000038 RCX: 00000000000003ff
[ 325.011069] RDX: 0000000000000280 RSI: ffffb89542e5bba0 RDI: 0000000000000038
[ 325.011069] RBP: ffffb89542e5b9c8 R08: 0000000000000000 R09: ffffffffc379e440
[ 325.011070] R10: 0000000000000000 R11: 00000000ffffffff R12: ffffb89542e5b948
[ 325.011071] R13: ffffb89542e5b948 R14: ffffb89542e5b930 R15: ffffffffc3a50e60
[ 325.011071] FS: 00007feb4c09ea80(0000) GS:ffff8e46ada40000(0000) knlGS:0000000000000000
[ 325.011072] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 325.011073] CR2: 0000000000000050 CR3: 00000004772e2006 CR4: 00000000003606e0
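
To check whether another machine hits the same failure after resume, the kernel log can be filtered for the nvidia-modeset messages quoted in this report. This is a minimal sketch: the function name `nvidia_resume_errors` is made up for illustration, and the pattern only covers the error strings that appear in this thread.

```shell
# Filter kernel log text on stdin for the nvidia-modeset errors seen in this bug.
# Hypothetical helper; the pattern lists only the messages quoted in this report.
nvidia_resume_errors() {
    grep -E 'nvidia-modeset: ERROR: GPU:[0-9]+: (Display engine push buffer channel allocation failed|Failed to allocate display engine core DMA push buffer|Notifier DMA allocation failed)'
}

# Usage on a live system (reading the kernel log may require root):
#   sudo dmesg | nvidia_resume_errors
#   journalctl -k -b | nvidia_resume_errors
```

If the screen stays black after resume, the same commands can be run over ssh.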

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-modules-nvidia-460-5.4.0-60-generic 5.4.0-60.67
ProcVersionSignature: Ubuntu 5.4.0-60.67-generic 5.4.78
Uname: Linux 5.4.0-60-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu27.14
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Mon Jan 11 20:29:31 2021
InstallationDate: Installed on 2020-01-06 (371 days ago)
InstallationMedia: Ubuntu 18.04.3 LTS "Bionic Beaver" - Release amd64 (20190805)
SourcePackage: linux-restricted-modules
UpgradeStatus: Upgraded to focal on 2020-10-09 (94 days ago)

Chris Bainbridge (chris-bainbridge) wrote :
affects: linux-restricted-modules (Ubuntu) → nvidia-graphics-drivers-460 (Ubuntu)
Chris Bainbridge (chris-bainbridge) wrote :

Suspend/resume was previously working in nvidia-455.38

summary: - [nvidia] suspend/resume broken in nvidia-460 : Display engine push
- buffer channel allocation failed
+ [modeset][nvidia] suspend/resume broken in nvidia-460 : Display engine
+ push buffer channel allocation failed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-460 (Ubuntu):
status: New → Confirmed
diablo60 (eldiablo62) wrote :

Hi, I have the same problem: after suspend I get a black screen. There was no problem before updating to nvidia-460.

David Monro (davidm-ub) wrote :

Same issue here on Groovy (20.10), kernel 5.8.0-38-lowlatency, nvidia 460, GTX 980. I get a completely black screen on resume, but I can ssh in and see the same error in dmesg.

Cheers

David

Simon (simonzima-rz) wrote :

Going back to nvidia-450 solved the problem for me. I will upgrade again once this is fixed.
Simon

Amr Motan (amrmotan) wrote :

Hi,

I am also facing the same problem. I run Ubuntu 20.04 Budgie.

Regards
Amr

Alf HP Lund (alf-c) wrote :

I had the same issue with Kubuntu 20.04.1 on HP ZBook 15 G3.

After following this guide to install the latest drivers, the problem seems to be gone: https://askubuntu.com/questions/61396/how-do-i-install-the-nvidia-drivers

I can now both use the nvidia card and suspend / lock screen

Alf HP Lund (alf-c) wrote :

I did these steps:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt upgrade
sudo reboot

My driver version now reports as 460.39
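
One way to confirm which driver version the loaded kernel module is running after such an upgrade is to read `/proc/driver/nvidia/version`. The sketch below assumes the usual layout of that file, in which the line starting with `NVRM version:` carries the version number; the helper name `parse_nvidia_version` is hypothetical.

```shell
# Extract the driver version (e.g. 460.39 or 470.63.01) from the text of
# /proc/driver/nvidia/version supplied on stdin.
# Assumption: the version is the first dotted number on the "NVRM version:" line.
parse_nvidia_version() {
    grep '^NVRM version:' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Usage:
#   parse_nvidia_version < /proc/driver/nvidia/version
# Alternative, if nvidia-smi is installed:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```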

Bert RAM Aerts (bert.ram.aerts) wrote :

@Alf HP Lund: you use Kubuntu, but with Wayland or with Xorg?

Peter R (forums-oygle) wrote :

I'm getting a "kernel taint" message with nvidia-460; I'm not sure whether it is related to this bug. For reference: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1904249/comments/3

Bert RAM Aerts (bert.ram.aerts) wrote :

With 460.56 I get the errors below on Ubuntu 20.10 on a Dell Inspiron 7577 laptop with an NVIDIA GeForce GTX 1060 with Max-Q Design (there is an Intel GPU as well, but the PRIME setting is NVIDIA performance mode):
grep DMA /var/log/syslog
Feb 27 19:54:26 Dell7577Linux kernel: [43459.186526] nvidia-modeset: ERROR: GPU:0: Notifier DMA allocation failed
Feb 27 19:54:26 Dell7577Linux kernel: [43459.186528] nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
After resuming from suspend to RAM I no longer get a desktop, only the above errors on a text display.

jachin.cai (jachin.cai)
Changed in nvidia-graphics-drivers-460 (Ubuntu):
status: Confirmed → Fix Committed
status: Fix Committed → Confirmed
cherepanov (a-cherepanov) wrote :

The same. HP ZBook x360 G5

Bert RAM Aerts (bert.ram.aerts) wrote :

Reporting an NVIDIA issue in an Ubuntu bug report is probably not as efficient as contacting NVIDIA directly. I filed a report in the NVIDIA Linux forum: https://forums.developer.nvidia.com/t/460-56-no-desktop-on-resume-from-suspend-to-ram/170275 but have not received a response yet. Could some of the people affected by this bug add their nvidia-bug-report.log.gz? Thanks!

Edwin Khoo (edwinksl) wrote :

Same issue with Ubuntu 20.04 on ThinkPad P50.

Hugo Ferreira (hmf) wrote :

I had the same issue on a ZBook G5, but interestingly enough I got it working.

Some background: I had used the NVIDIA 460 driver successfully. However, after a small mishap I had to reinstall Ubuntu, and the second time around I got the error reported here. I tried installing the drivers via the "Software & Updates -> Additional Drivers" application and via the packages (never NVIDIA's scripts), but neither worked. The following two steps seem to have got it going:

1. Configure the use of the suspend and resume scripts shown here:
https://download.nvidia.com/XFree86/Linux-x86_64/460.39/README/powermanagement.html

2. Use the "Software & Updates -> Additional Drivers" application, but select the "NVIDIA Server Driver metapackage ... (proprietary)" and *not* the "NVIDIA driver metapackage ... (proprietary, tested)".

Tests show that both NVIDIA profiles now work on resume.

It would be interesting to get feedback from others. If this does in fact work, it would be worth finding out what the difference between the two options is.

HTH
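
Step 1 above points at NVIDIA's power-management README, which describes a kernel module option for preserving video memory across suspend and a set of systemd units that call the driver's suspend and resume script. The following is a minimal sketch of that setup, not a verified fix for this bug: the helper name `nvidia_pm_modprobe_line`, the `/var/tmp` default, and the file name under `/etc/modprobe.d` are assumptions, and the option names should be checked against the README for the installed driver version.

```shell
# Print a modprobe option line enabling video memory preservation.
# Hypothetical helper; the temporary-file path defaults to /var/tmp (an assumption)
# and should point at a location with enough free space.
nvidia_pm_modprobe_line() {
    tmp_path="${1:-/var/tmp}"
    printf 'options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=%s\n' "$tmp_path"
}

# Usage on a real system (requires root; reboot or reload the module afterwards):
#   nvidia_pm_modprobe_line /var/tmp | sudo tee /etc/modprobe.d/nvidia-power-management.conf
#   sudo systemctl enable nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service
```

Other reports in this thread found that disabling the same units helped, so it is worth noting their state before changing anything.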

Álvaro Souza (alvarofernandoms) wrote :

I'm using a Dell 7567 with a GTX 1050 Ti, and even with an untested version (465.27) the issue is still here. 😞

Damien Banuls (damb11) wrote :

Like Hugo Ferreira, switching to the NVIDIA Server Driver instead of the classic NVIDIA driver finally did the job. Everything I tried before didn't work.
So it is finally fixed for me, although I don't know if there are any drawbacks to using the Server driver :)

Mario (mario156090) wrote :

I had the same error, so I did a clean install of Ubuntu and installed the drivers from the command line:

sudo apt install nvidia-driver-455

I think the issue is with the Ubuntu drivers application.

Shawn B (kantlivelong) wrote :

I was also having a suspend/resume issue with 460.56+. Rolling back to 460.39 fixed it.

Older packages can be found at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-460/+publishinghistory

List of packages I have installed:
libnvidia-cfg1-460:amd64
libnvidia-common-460
libnvidia-compute-460:amd64
libnvidia-compute-460:i386
libnvidia-decode-460:amd64
libnvidia-decode-460:i386
libnvidia-encode-460:amd64
libnvidia-encode-460:i386
libnvidia-extra-460:amd64
libnvidia-extra-460:i386
libnvidia-fbc1-460:amd64
libnvidia-fbc1-460:i386
libnvidia-gl-460:amd64
libnvidia-gl-460:i386
libnvidia-ifr1-460:amd64
libnvidia-ifr1-460:i386
nvidia-compute-utils-460
nvidia-dkms-460
nvidia-driver-460
nvidia-kernel-common-460
nvidia-kernel-source-460
nvidia-prime
nvidia-prime-applet
nvidia-settings
nvidia-utils-460
xserver-xorg-video-nvidia-460
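
After a rollback like this, a routine `apt upgrade` would pull the newer 460 build in again unless the packages are held. A minimal sketch, assuming the package names all end in `-460` (optionally followed by an architecture qualifier) as in the list above; `select_460_packages` is a hypothetical helper name.

```shell
# Keep only package names that end in "-460" or "-460:<arch>".
# Hypothetical helper; it deliberately skips names such as nvidia-driver-460-server
# and unversioned packages such as nvidia-prime or nvidia-settings.
select_460_packages() {
    grep -E -- '-460(:[a-z0-9]+)?$'
}

# Usage (requires root; lists packages known to dpkg and holds the matching ones):
#   dpkg-query -W -f='${binary:Package}\n' | select_460_packages | xargs -r sudo apt-mark hold
# To release the hold later:
#   dpkg-query -W -f='${binary:Package}\n' | select_460_packages | xargs -r sudo apt-mark unhold
```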

Jose Arrarte (jarrarte) wrote :

Same issue with KDE Neon (Ubuntu 20.04) on a ThinkPad P50 with a Quadro M2000M.

Geoff (articmistic) wrote :

Same issue on an Alienware with an i7-6700HQ CPU and GTX 965M, running a clean install of 20.04.2 LTS.

Ian Gough (igough57) wrote :

Same issue with a Lenovo ThinkPad P70 with a Quadro M600M.

bram lagerweij (bramlagerweij16) wrote :

Just an update: NVIDIA seems to be aware of this and has confirmed that they are tracking these issues.

> We are tracking issue internally with bug number 3358939 .
> We are currently trying to duplicate issue locally.
> Shall keep everyone updated on it.

Please see the NVIDIA forum for updates:
https://forums.developer.nvidia.com/t/regression-460-series-black-screen-on-boot-nvidia-modeset-error-gpu-failed-to-allocate-display-engine-core-dma-push-buffer/165598/35

Shannon VanWagner (shannon-vanwagner) wrote :

Inspired by #17: what seems to have worked for me is disabling all of the nvidia service scripts.

First, take note of the current settings with:
$ for e in suspend hibernate resume;do sudo systemctl status nvidia-$e.service;done
● nvidia-suspend.service - NVIDIA system suspend actions
     Loaded: loaded (/lib/systemd/system/nvidia-suspend.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
● nvidia-hibernate.service - NVIDIA system hibernate actions
     Loaded: loaded (/lib/systemd/system/nvidia-hibernate.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
● nvidia-resume.service - NVIDIA system resume actions
     Loaded: loaded (/lib/systemd/system/nvidia-resume.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

Then disable the scripts with:
$ for e in suspend hibernate resume;do sudo systemctl disable nvidia-$e.service;done
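
Before disabling anything, it can help to record each unit's enablement state in a compact form so the change can be reverted later. The sketch below uses `systemctl is-enabled` for that; the function name `nvidia_pm_unit_states` and the `SYSTEMCTL` override variable are hypothetical and exist only to make the loop easy to try out.

```shell
# Print "<unit> <state>" for each NVIDIA power-management unit.
# SYSTEMCTL may name a replacement command for testing; it defaults to systemctl.
nvidia_pm_unit_states() {
    for e in suspend hibernate resume; do
        unit="nvidia-$e.service"
        # is-enabled exits non-zero for disabled units, so ignore its exit status.
        state=$("${SYSTEMCTL:-systemctl}" is-enabled "$unit" 2>/dev/null) || true
        printf '%s %s\n' "$unit" "${state:-unknown}"
    done
}

# Usage:
#   nvidia_pm_unit_states > ~/nvidia-pm-units.before
# To revert a unit that was enabled beforehand:
#   sudo systemctl enable nvidia-suspend.service
```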

Michael Scheper (j-ubuntu-h) wrote :

FWIW, I've been experiencing this problem under Mint 21, when it comes back from being suspended, with NVIDIA Driver Version 470.63.01 installed.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-470 (Ubuntu):
status: New → Confirmed
Peter Van der Goten (gotenp) wrote :

Same problem. I use Mint 20.2 with NVIDIA driver 470.63.01.

Dominik Borowski (d0ms3y) wrote :

Identical problem, except that my machine is frozen after the NVIDIA error messages. Running Ubuntu 20.04 LTS / NVIDIA driver 470.57.02 on a Razer Blade 15 Advanced (2018).

Steve Roberts (drgrumpy) wrote :

Same as #31 for me, Mint 20.2 (=Ubuntu 20.04) Nvidia 470.86

Tolga Durak (big.brew) wrote :

I'm having the identical problem with an external monitor, but only when it is connected via HDMI. There is no problem with DisplayPort.

The screen does not wake up from sleep when using a 240 Hz refresh rate. It is fine at 60 Hz.

My graphics card is an RTX 2070 Mobile.

Using Ubuntu 20.04.3 LTS and Nvidia driver 470.86

Shawn B (kantlivelong) wrote :

I recently was able to work around this issue using the nvidia 510 drivers and following this suggestion:
https://forums.developer.nvidia.com/t/fixed-suspend-resume-issues-with-the-driver-version-470/187150/3

Thus far no issues with resuming. Not sure if it helps with 460 but thought I'd share.
