Kernel crash in nvidia_modeset hangs the whole graphic system (page allocation failure)

Bug #1897659 reported by Facundo Batista
This bug affects 5 people
Affects                                Status     Importance  Assigned to  Milestone
nvidia-graphics-drivers-450 (Ubuntu)   Confirmed  Undecided   Unassigned
nvidia-graphics-drivers-455 (Ubuntu)   Confirmed  Undecided   Unassigned

Bug Description

This happens to me around once per week. The first line I found in syslog is:

Xorg: page allocation failure: order:5, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0

Attached is the whole syslog extract for the problem.
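
For readers unfamiliar with the message format, here is a minimal sketch of how the line above can be decoded. An order-N request asks the kernel for 2**N physically contiguous pages, so order:5 means 32 pages. The sketch assumes 4 KiB base pages (the default on x86_64); `parse_failure` and `FAILURE_RE` are illustrative names, not part of any tool mentioned in this report.

```python
import re

# Assumption: 4 KiB base pages, the default on x86_64 Linux.
PAGE_SIZE = 4096

# Matches the header of a kernel "page allocation failure" message.
FAILURE_RE = re.compile(
    r"(?P<task>\S+): page allocation failure: "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)\((?P<flags>[^)]*)\)"
)

def parse_failure(line):
    """Return task, order, contiguous size and GFP flags, or None if no match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    pages = 1 << order  # an order-N request asks for 2**N contiguous pages
    return {
        "task": m.group("task"),
        "order": order,
        "pages": pages,
        "bytes": pages * PAGE_SIZE,
        "flags": m.group("flags").split("|"),
    }

line = ("Xorg: page allocation failure: order:5, "
        "mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0")
info = parse_failure(line)
print(info["task"], info["pages"], "pages =", info["bytes"] // 1024, "KiB", info["flags"])
```

Under that page-size assumption, the failed request was for 128 KiB of contiguous memory, made in the context of the Xorg process.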

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: xorg 1:7.7+19ubuntu14
ProcVersionSignature: Ubuntu 5.4.0-48.52-generic 5.4.60
Uname: Linux 5.4.0-48-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
.proc.driver.nvidia.capabilities.gpu0: Error: [Errno 21] Is a directory: '/proc/driver/nvidia/capabilities/gpu0'
.proc.driver.nvidia.capabilities.mig: Error: [Errno 21] Is a directory: '/proc/driver/nvidia/capabilities/mig'
.proc.driver.nvidia.gpus.0000.07.00.0: Error: [Errno 21] Is a directory: '/proc/driver/nvidia/gpus/0000:07:00.0'
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.suspend: suspend hibernate resume
.proc.driver.nvidia.suspend_depth: default modeset uvm
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.66 Wed Aug 12 19:42:48 UTC 2020
 GCC version: gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)
ApportVersion: 2.20.11-0ubuntu27.9
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CasperMD5CheckResult: skip
CompositorRunning: None
CurrentDesktop: KDE
Date: Mon Sep 28 23:58:11 2020
DistUpgraded: Fresh install
DistroCodename: focal
DistroVariant: ubuntu
DkmsStatus: nvidia, 450.66, 5.4.0-48-generic, x86_64: installed
ExtraDebuggingInterest: No
GraphicsCard:
 NVIDIA Corporation GK107 [GeForce GT 740] [10de:0fc8] (rev a1) (prog-if 00 [VGA controller])
   Subsystem: NVIDIA Corporation GK107 [GeForce GT 740] [10de:1099]
InstallationDate: Installed on 2020-07-11 (79 days ago)
InstallationMedia: Kubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
MachineType: System manufacturer System Product Name
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-48-generic root=UUID=3d184b61-094a-475b-b817-3f588547fea1 ro quiet splash vt.handoff=7
SourcePackage: xorg
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/07/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 4602
dmi.board.asset.tag: Default string
dmi.board.name: PRIME A320M-K
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr4602:bd03/07/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEA320M-K:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.101-2
version.libgl1-mesa-dri: libgl1-mesa-dri 20.0.8-0ubuntu1~20.04.1
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.nvidia-graphics-drivers: nvidia-graphics-drivers-* N/A
version.xserver-xorg-core: xserver-xorg-core 2:1.20.8-2ubuntu2.4
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20200226-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.16-1

Facundo Batista (facundo) wrote :
summary: - Xorg memory issues hangs the whole graphic system
+ Memory issues hangs the whole graphic system
affects: xorg (Ubuntu) → nvidia-graphics-drivers-450 (Ubuntu)
tags: added: nvidia
Facundo Batista (facundo) wrote : Re: Memory issues hangs the whole graphic system

The behaviour is this: I leave my machine and come back a couple of hours later (it's a desktop machine, no sleep or hibernation involved). When I come back it's unresponsive: it sends no signal to the monitor, and even the keyboard's Num Lock light doesn't work.

However, I can ssh into it and work inside just fine. But there's no graphic system, and no way to restart any process to get it back (nothing I tried worked, at least).

I normally end up issuing a `sudo shutdown now` (after sshing in), which runs for a while but does NOT turn the machine off (it looks like whatever is hung prevents that).
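
Since ssh keeps working while the display is hung, one thing that could be checked from that session is how many free memory blocks of order 5 or larger the kernel still has. The sketch below parses the text of `/proc/buddyinfo`, where each row lists the number of free blocks per order, starting at order 0. The function name `free_blocks_at_or_above` is hypothetical, and the result is only a rough indicator: allocation watermarks and per-zone restrictions also decide whether a request succeeds.

```python
def free_blocks_at_or_above(buddyinfo_text, order):
    """Sum, per zone, the free blocks of size 2**order pages or larger."""
    totals = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Expected shape: "Node", "0,", "zone", "<name>", then one count per order
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zone = f"node{parts[1].rstrip(',')}/{parts[3]}"
        counts = [int(c) for c in parts[4:]]
        # A larger free block can be split to serve a smaller request,
        # so every order >= the requested one counts.
        totals[zone] = sum(counts[order:])
    return totals
```

On the affected machine this could be called as `free_blocks_at_or_above(open("/proc/buddyinfo").read(), 5)`. Values at or near zero in every zone would be consistent with the order:5 failure in the syslog, though they would not prove the cause.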

Daniel van Vugt (vanvugt) wrote : Re: Kernel crash in nvidia_modeset hangs the whole graphic system (GeForce GT 740)

We can't debug or fix the Nvidia driver directly because it is closed source. I suggest the quickest solution might be to downgrade to the 440 driver in the 'Additional Drivers' app.

summary: - Memory issues hangs the whole graphic system
+ Kernel crash in nvidia_modeset hangs the whole graphic system
summary: - Kernel crash in nvidia_modeset hangs the whole graphic system
+ Kernel crash in nvidia_modeset hangs the whole graphic system (GeForce GT 740)
Facundo Batista (facundo) wrote :

I've installed nvidia-graphics-drivers-455 from https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa and I'm still getting the same crash.

As Alberto Milone suggested, I'm attaching the nvidia report generated when it crashed.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-450 (Ubuntu):
status: New → Confirmed
Changed in nvidia-graphics-drivers-455 (Ubuntu):
status: New → Confirmed
summary: - Kernel crash in nvidia_modeset hangs the whole graphic system (GeForce GT 740)
+ Kernel crash in nvidia_modeset hangs the whole graphic system (page allocation failure)