nvidia-driver-545 (and version 550) randomly hangs with compositing manager on VT change
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
nvidia-graphics-drivers-545 (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
This issue doesn't happen with Nvidia driver version 525 or 535, but it does happen with 545 and 550.
Steps to reproduce:
- Run a window manager or desktop environment without a compositing manager. For example, in XFCE open Settings – Window Manager Tweaks – Compositor and disable "Enable display compositing". (The problem occurs with the XFCE compositor too, but recovering from the hang is harder with it.)
- The problem seems to be some kind of race condition because it "only" happens about 70% of the time for me. I'm using package "linux-
- Run picom as follows to make sure it has no config of any kind that might avoid the problem:
picom --config /dev/null --show-all-xerrors --log-level=TRACE & sleep 20; killall picom
Note that this will emit a lot of log messages and picom will be killed after 20 seconds, leaving an X session without a compositor, which should be usable if not pretty.
- Switch to different virtual terminals with CTRL+ALT+F1, CTRL+ALT+F2, ..., CTRL+ALT+F7 and wait for the display to refresh on each terminal.
- Assuming your initial virtual terminal was VT 7, as usual for a GUI desktop, you should now see a fully black screen with only your usual mouse cursor. Wait for the sleep timer above (20 seconds) to expire and kill picom, which restores your screen. If you had compositing enabled in the XFCE window manager, you would see the same issue, but recovering is much harder: the XFCE window manager implements compositing internally, so when compositing hangs you have to kill the window manager itself to get rid of it.
I'm assuming this also happens with other software that triggers the racy code path during a VT switch, but the above is the best way I have found to reproduce the issue at will. The problem also happens pretty often when rapidly changing to VT 1 and back to VT 7, which may be a faster way to reproduce it.
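The rapid VT 1 / VT 7 switching described above could be scripted roughly as follows. This is only a sketch and not part of the original report: it assumes the `chvt` utility (from the kbd package) is installed and that `sudo` is available, and the function name `vt_bounce` and the `CHVT` override variable are made up for illustration.

```shell
# Command used to switch VTs (assumption: chvt needs root on this system).
# Override with CHVT="echo chvt" for a dry run that only prints the switches.
CHVT="${CHVT:-sudo chvt}"

# vt_bounce GUI_VT TEXT_VT ROUNDS [PAUSE_SECONDS]
# Switch to TEXT_VT and back to GUI_VT, ROUNDS times, pausing after each switch.
vt_bounce() {
    gui_vt="$1"
    text_vt="$2"
    rounds="$3"
    pause="${4:-2}"
    i=0
    while [ "$i" -lt "$rounds" ]; do
        $CHVT "$text_vt"   # intentionally unquoted so "sudo chvt" splits into words
        sleep "$pause"
        $CHVT "$gui_vt"
        sleep "$pause"
        i=$((i + 1))
    done
}
```

With the picom command from the steps above already running in a terminal, something like `vt_bounce 7 1 5` would switch away and back five times. Since the hang appears to be a race, the pause length may affect how often it triggers.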
When picom hangs, the last line it outputs to stderr is as follows:
[ 2024-03-23 22:24:17.069 draw_callback_impl TRACE ] Render start, frame 416
Normally this should be followed by
[ 2024-03-23 22:24:17.069 draw_callback_impl TRACE ] Render end
within the same millisecond or one later, but when the Nvidia driver hangs on a VT switch, this never happens.
As I wrote above, this bug doesn't occur with Nvidia driver version 535, so it was introduced between versions 535 and 545. I cannot debug this further because I don't have the source code.
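To check for this symptom without watching the terminal, the log could be scanned for a "Render start" that has no later "Render end". The sketch below assumes picom's stderr was redirected to a file (for example by appending `2>picom.log` to the picom command); the function name `render_state` is made up for illustration, and it only matches the two `draw_callback_impl` messages quoted in this report.

```shell
# render_state LOGFILE
# Prints "pending" if the last render message is a "Render start" with no
# "Render end" after it, "done" if the last frame finished, and "none" if
# the log contains no render messages at all.
render_state() {
    awk '
        /draw_callback_impl/ && /Render start/ { state = "pending" }
        /draw_callback_impl/ && /Render end/   { state = "done" }
        END { print (state == "" ? "none" : state) }
    ' "$1"
}
```

A healthy picom will also report "pending" for a brief moment in the middle of a frame, so the result is only meaningful if it stays "pending" across repeated checks a second or so apart.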
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: nvidia-driver-545 (not installed)
ProcVersionSign
Uname: Linux 6.5.0-26-lowlatency x86_64
NonfreeKernelMo
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckR
CurrentDesktop: XFCE
Date: Sat Mar 23 22:08:21 2024
EcryptfsInUse: Yes
InstallationDate: Installed on 2019-01-05 (1904 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
SourcePackage: nvidia-
UpgradeStatus: Upgraded to jammy on 2023-08-11 (225 days ago)
modified.
mtime.conffile.
Here are the driver parameters (should be all defaults):
# grep . /sys/module/nvidia*/parameters/*
/sys/module/nvidia_drm/parameters/fbdev:N
/sys/module/nvidia_drm/parameters/modeset:Y
/sys/module/nvidia_modeset/parameters/config_file:(null)
/sys/module/nvidia_modeset/parameters/disable_hdmi_frl:N
/sys/module/nvidia_modeset/parameters/disable_vrr_memclk_switch:N
/sys/module/nvidia_modeset/parameters/fail_malloc:-1
/sys/module/nvidia_modeset/parameters/hdmi_deepcolor:N
/sys/module/nvidia_modeset/parameters/malloc_verbose:N
/sys/module/nvidia_modeset/parameters/opportunistic_display_sync:Y
/sys/module/nvidia_modeset/parameters/output_rounding_fix:Y
/sys/module/nvidia_modeset/parameters/vblank_sem_control:N
/sys/module/nvidia_uvm/parameters/uvm_ats_mode:1
/sys/module/nvidia_uvm/parameters/uvm_block_cpu_to_cpu_copy_with_ce:0
/sys/module/nvidia_uvm/parameters/uvm_channel_gpfifo_loc:auto
/sys/module/nvidia_uvm/parameters/uvm_channel_gpput_loc:auto
/sys/module/nvidia_uvm/parameters/uvm_channel_num_gpfifo_entries:1024
/sys/module/nvidia_uvm/parameters/uvm_channel_pushbuffer_loc:auto
/sys/module/nvidia_uvm/parameters/uvm_conf_computing_channel_iv_rotation_limit:2147483648
/sys/module/nvidia_uvm/parameters/uvm_cpu_chunk_allocation_sizes:2166784
/sys/module/nvidia_uvm/parameters/uvm_debug_enable_push_acquire_info:0
/sys/module/nvidia_uvm/parameters/uvm_debug_enable_push_desc:0
/sys/module/nvidia_uvm/parameters/uvm_debug_prints:0
/sys/module/nvidia_uvm/parameters/uvm_disable_hmm:N
/sys/module/nvidia_uvm/parameters/uvm_downgrade_force_membar_sys:1
/sys/module/nvidia_uvm/parameters/uvm_enable_builtin_tests:0
/sys/module/nvidia_uvm/parameters/uvm_enable_debug_procfs:0
/sys/module/nvidia_uvm/parameters/uvm_enable_va_space_mm:1
/sys/module/nvidia_uvm/parameters/uvm_exp_gpu_cache_peermem:0
/sys/module/nvidia_uvm/parameters/uvm_exp_gpu_cache_sysmem:0
/sys/module/nvidia_uvm/parameters/uvm_fault_force_sysmem:0
/sys/module/nvidia_uvm/parameters/uvm_force_prefetch_fault_support:0
/sys/module/nvidia_uvm/parameters/uvm_global_oversubscription:1
/sys/module/nvidia_uvm/parameters/uvm_leak_checker:0
/sys/module/nvidia_uvm/parameters/uvm_page_table_location:(null)
/sys/module/nvidia_uvm/parameters/uvm_peer_copy:phys
/sys/module/nvidia_uvm/parameters/uvm_perf_access_counter_batch_count:256
/sys/module/nvidia_uvm/parameters/uvm_perf_access_counter_mimc_migration_enable:-1
/sys/module/nvidia_uvm/parameters/uvm_perf_access_counter_momc_migration_enable:-1
/sys/module/nvidia_uvm/parameters/uvm_perf_access_counter_threshold:256
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_batch_count:256
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_coalesce:1
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_max_batches_per_service:20
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_max_throttle_per_service:5
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_replay_policy:2
/sys/module/nvidia_uvm/parameters/uvm_perf_fault_replay_update_put_ratio:50
/sys/module/nvidia_uvm/parameters/uvm_perf_map_remote_on_eviction:1
/sys/module/nvidia_uvm/parameters/uvm_perf_map_remote_on_native_atomics_fault:0
/sys/module/nvidia_uvm/parameters/uvm_perf_migrate_cpu_preunmap_block_order:2
/sys/module/nvidia_uvm...
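Since the bug appears with 545 but not 535, one way to check whether any parameter default changed between the two drivers would be to save this dump under each version and diff the results. The helper below is a sketch only: the name `dump_params` and the optional root argument are made up for illustration, and it uses the same `grep .` approach as the command above.

```shell
# dump_params [ROOT]
# Prints "path:value" for every nvidia* module parameter under ROOT
# (default /sys/module), sorted so that two dumps can be compared with diff.
# Parameter files that cannot be read are silently skipped.
dump_params() {
    root="${1:-/sys/module}"
    grep . "$root"/nvidia*/parameters/* 2>/dev/null | sort
}
```

For example, run `dump_params > params-535.txt` while driver 535 is loaded, `dump_params > params-545.txt` after switching to 545, and then `diff params-535.txt params-545.txt`. Some parameter files may only be readable as root, so the dump may need to be taken from a root shell.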