[amdgpu] System freezes coming out from suspension or randomly from screen power save (5.15.0 fails but 5.18.14 works)

Bug #1971460 reported by Mechano
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

I have Ubuntu 22.04 with Wayland and a Sapphire Radeon RX 6600, connected to a Xiaomi 34" curved gaming monitor capable of 3440x1440 @ 144.00 Hz.

With a refresh rate of 120 Hz or 144 Hz, the system freezes when resuming from suspend, or randomly when the screen wakes from power save.

At 60 Hz it is more stable.

It seems to be related to a wrong Modeline for this card and this monitor's resolution and refresh rates. The problem occurs with both DP and HDMI cables. With HDMI it supports only 99.99 Hz.

I had no problem with my previous HP RX 460 HanSolo or XFX GTX 1060 3GB with the proprietary NVIDIA drivers.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: xorg 1:7.7+23ubuntu2
ProcVersionSignature: Ubuntu 5.15.0-27.28-generic 5.15.30
Uname: Linux 5.15.0-27-generic x86_64
ApportVersion: 2.20.11-0ubuntu82
Architecture: amd64
BootLog: Error: [Errno 13] Permesso negato: '/var/log/boot.log'
CasperMD5CheckResult: pass
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Tue May 3 18:29:05 2022
DistUpgraded: Fresh install
DistroCodename: jammy
DistroVariant: ubuntu
DkmsStatus:
 virtualbox/6.1.32, 5.15.0-25-generic, x86_64: installed
 virtualbox/6.1.32, 5.15.0-27-generic, x86_64: installed
ExtraDebuggingInterest: Yes, if not too technical
GpuHangFrequency: Several times a day
GpuHangReproducibility: Yes, I can easily reproduce it
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] [1002:73ff] (rev c7) (prog-if 00 [VGA controller])
   Subsystem: Sapphire Technology Limited Navi 23 [Radeon RX 6600/6600 XT/6600M] [1da2:e447]
InstallationDate: Installed on 2022-04-23 (9 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 (20220419)
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=it_IT.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-27-generic root=UUID=be1c4139-f2bc-472b-87c1-099f2be97b1a ro quiet splash vt.handoff=7
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/25/2022
dmi.bios.release: 5.17
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3604
dmi.board.asset.tag: Default string
dmi.board.name: TUF B450M-PRO GAMING
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3604:bd02/25/2022:br5.17:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnTUFB450M-PROGAMING:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:skuSKU:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.110-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 22.0.1-1ubuntu2
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.xserver-xorg-core: xserver-xorg-core 2:21.1.3-2ubuntu2
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-2build3
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20210115-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.17-2build1

Mechano (mr-mechano) wrote :
Daniel van Vugt (vanvugt) wrote : Re: [amdgpu] System freezes coming out from suspension or randomly from screen power save

Thanks for the bug report. Next time a freeze happens please:

1. Wait 10 seconds.

2. Reboot.

3. Run:

   journalctl -b-1 > prevboot.txt

4. Attach the resulting text file here.

5. Also check for crashes by following https://wiki.ubuntu.com/Bugs/Responses#Missing_a_crash_report_or_having_a_.crash_attachment
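Once prevboot.txt exists, a quick tally of which amdgpu functions reported errors can make the file easier to skim before attaching it. The following is only a minimal sketch: the function name tally_amdgpu_errors is hypothetical, and the pattern assumes kernel lines of the form "[drm:<function> [amdgpu]] *ERROR* ...", which is the shape of the messages quoted later in this report.

```python
import re
from collections import Counter

# Captures the reporting function from lines such as
# "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, ..."
# The line shape is an assumption based on the logs quoted in this report.
AMDGPU_ERROR = re.compile(r"\[drm:([\w.]+) \[amdgpu\]\] \*ERROR\*")


def tally_amdgpu_errors(journal_text: str) -> Counter:
    """Count amdgpu *ERROR* lines per reporting kernel function."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = AMDGPU_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, tally_amdgpu_errors(open("prevboot.txt", errors="replace").read()).most_common() lists the noisiest functions first. Lines from other programs, such as gnome-shell or chrome, are ignored.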

tags: added: amdgpu
summary: - Xorg and Wayland freeze
+ [amdgpu] System freezes coming out from suspension or randomly from
+ screen power save
affects: xorg (Ubuntu) → ubuntu
Changed in ubuntu:
status: New → Incomplete
Mechano (mr-mechano) wrote :

At the moment it is not freezing when the monitor powers on from power save.

If it freezes again I'll also capture a log after the monitor returns from power off.

The attached file is a log taken on reboot after a freeze when resuming from suspend with the screen at 3440x1440 @ 144 Hz. At first the desktop is shown but the mouse is frozen. After about 5 seconds the desktop disappears and the screen fills with Z-shaped black and pink horizontal stripes.

There is no problem when the screen is at 3440x1440 @ 60 Hz.

Mechano (mr-mechano) wrote :

There is also no problem at 3440x1440 @ 100 Hz over a DisplayPort connection.

It freezes only at 120 Hz and 144 Hz.

Daniel van Vugt (vanvugt) wrote :

Thanks. The log in comment #3 seems to have enough errors coming from the 'amdgpu' kernel driver that we should assign the bug there.

affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Incomplete → New
tags: added: resume suspend-resume
Daniel van Vugt (vanvugt) wrote :

Bug 1949497 may also be related.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Mechano (mr-mechano) wrote : Re: [amdgpu] System freezes coming out from suspension or randomly from screen power save

Adding a log file. This time the system froze when switching the monitor refresh rate from 100 Hz to 144 Hz.

Mechano (mr-mechano) wrote :

I finally got a freeze during monitor power save with the screen at 3440x1440 @ 144 Hz.
Here is the log file.

Mechano (mr-mechano) wrote :

I also got a freeze when switching from 144 Hz to 100 Hz (still at 3440x1440).

Here is the log file.

Mechano (mr-mechano) wrote :

Here is a temporary workaround until a patch is released.

Disable Wayland by uncommenting the following line in /etc/gdm3/custom.conf, changing

#WaylandEnable=false

to

WaylandEnable=false

Then add the file /etc/X11/xorg.conf.d/10-monitor.conf

with the following lines:

Section "Monitor"
    Identifier "DisplayPort-0"
    Modeline "3440x1440_144.00" 1086.75 3440 3744 4128 4816 1440 1443 1453 1568 -hsync +vsync
    Option "PreferredMode" "3440x1440_144.00"
EndSection

The Modeline was calculated with the cvt tool, using the command:

cvt -v 3440 1440 144

Change the identifier to the port your monitor is connected to.

This is necessary because X11 by default offers a maximum of 120 Hz.
After adding 10-monitor.conf, Settings now shows 143.91 Hz.

Suspend and resume now work without freezing.
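The 143.91 Hz figure follows from the Modeline itself: the refresh rate is the pixel clock divided by the total pixels per frame (horizontal total times vertical total). A minimal sketch of that arithmetic, assuming the standard Modeline field order (name, pixel clock in MHz, four horizontal timings, four vertical timings, flags); the function name refresh_from_modeline is hypothetical:

```python
def refresh_from_modeline(modeline: str) -> float:
    """Return the vertical refresh rate in Hz implied by an X11 Modeline."""
    fields = modeline.split()
    # Drop the leading "Modeline" keyword if it is present.
    if fields[0].lower() == "modeline":
        fields = fields[1:]
    # fields: name, pclk, hdisp, hsync_start, hsync_end, htotal,
    #         vdisp, vsync_start, vsync_end, vtotal, flags...
    pixel_clock_hz = float(fields[1]) * 1e6
    htotal = int(fields[5])
    vtotal = int(fields[9])
    return pixel_clock_hz / (htotal * vtotal)


line = ('Modeline "3440x1440_144.00" 1086.75 3440 3744 4128 4816 '
        '1440 1443 1453 1568 -hsync +vsync')
print(round(refresh_from_modeline(line), 2))  # 143.91
```

With the Modeline above this gives 1086.75 MHz / (4816 x 1568), which rounds to the 143.91 Hz shown in Settings.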

Mechano (mr-mechano) wrote :

After two kernel updates the situation under Wayland is better with my RX 6600, especially when resuming from suspend.

But there are still random freezes when the monitor powers on after screen suspend and syncs back to 144 Hz.

There are still a lot of errors from the amdgpu module.

Log file attached.

Mechano (mr-mechano) wrote :

I now see that my system still hangs if I suspend it just a few minutes after login.
Here is the list of amdgpu errors:

mechano@desktop:~$ sudo journalctl -b-1 | grep ERROR
[sudo] password di mechano:
giu 23 22:08:24 desktop networkd-dispatcher[870]: ERROR:Unknown state for interface NetworkctlListState(idx=1, name='lo', type='loopback', operational='n/a', administrative='unmanaged'): n/a
giu 23 22:08:24 desktop networkd-dispatcher[870]: ERROR:Unknown state for interface NetworkctlListState(idx=2, name='enp5s0', type='ether', operational='n/a', administrative='unmanaged'): n/a
giu 23 22:08:28 desktop gnome-shell[1385]: JS ERROR: TypeError: this._managerProxy is undefined
giu 23 22:08:29 desktop gnome-shell[1385]: JS ERROR: Failed to initialize fprintd service: Gio.IOErrorEnum: GDBus.Error:net.reactivated.Fprint.Error.NoSuchDevice: No devices available
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=571, emitted seq=573
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2326, emitted seq=2328
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process chrome pid 3777 thread chrome:cs0 pid 3790
giu 23 22:11:07 desktop kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
giu 23 22:11:07 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
giu 23 22:11:07 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
giu 23 22:11:07 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
giu 23 22:11:07 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
giu 23 22:11:07 desktop kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
giu 23 22:11:07 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
giu 23 22:11:07 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
giu 23 22:11:07 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
giu 23 22:11:07 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
giu 23 22:11:07 desktop kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
giu 23 22:11:07 desktop kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate tmr
giu 23 22:11:07 desktop kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
giu 23 22:11:07 desktop kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=573, emitted seq=573
giu 23 22:11:07 desktop kernel: [drm:amdgpu_job_timedout [...
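The "ring ... timeout" lines carry two sequence numbers. Reading "signaled" as the last fence the ring completed and "emitted" as the last fence submitted to it, the difference is the number of jobs still outstanding when the timeout fired. A minimal sketch that extracts these from a saved log; the names RingTimeout and ring_timeouts are hypothetical, and the interpretation of the two counters is an assumption based on the message wording:

```python
import re
from typing import List, NamedTuple


class RingTimeout(NamedTuple):
    ring: str
    signaled: int
    emitted: int

    @property
    def pending(self) -> int:
        # Jobs emitted to the ring but not yet signaled (assumed meaning).
        return self.emitted - self.signaled


RING_TIMEOUT = re.compile(
    r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)"
)


def ring_timeouts(log_text: str) -> List[RingTimeout]:
    """Extract every ring timeout reported in the given log text."""
    return [
        RingTimeout(m.group(1), int(m.group(2)), int(m.group(3)))
        for m in RING_TIMEOUT.finditer(log_text)
    ]
```

On the first two timeouts above, this yields sdma1 (571 signaled, 573 emitted) and gfx_0.0.0 (2326 signaled, 2328 emitted), each with two jobs pending.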


Mechano (mr-mechano) wrote :

It still freezes when coming out of monitor power save at 3440x1440 @ 144 Hz.

lug 23 21:04:05 desktop networkd-dispatcher[955]: ERROR:Unknown state for interface NetworkctlListState(idx=1, name='lo', type='loopback', operational='n/a', administrative='unmanaged'): n/a
lug 23 21:04:05 desktop networkd-dispatcher[955]: ERROR:Unknown state for interface NetworkctlListState(idx=2, name='enp5s0', type='ether', operational='n/a', administrative='unmanaged'): n/a
lug 23 21:04:07 desktop gnome-shell[1380]: JS ERROR: TypeError: this._managerProxy is undefined
lug 23 21:04:08 desktop gnome-shell[1380]: JS ERROR: Failed to initialize fprintd service: Gio.IOErrorEnum: GDBus.Error:net.reactivated.Fprint.Error.NoSuchDevice: No devices available
lug 23 21:04:48 desktop google-chrome.desktop[3699]: [3739:3739:0723/210448.306613:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
lug 23 21:04:50 desktop google-chrome.desktop[3699]: [3739:3739:0723/210450.839881:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 2 times!
lug 23 21:04:54 desktop google-chrome.desktop[3699]: [3739:3739:0723/210454.584064:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 3 times!
lug 23 21:05:22 desktop google-chrome.desktop[3699]: [3693:3693:0723/210522.140157:ERROR:interface_endpoint_client.cc(665)] Message 1 rejected by interface blink.mojom.WidgetHost
lug 23 21:15:38 desktop gnome-shell[2259]: JS ERROR: TypeError: menuInstance is null
lug 23 21:15:48 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=1436, emitted seq=1437
lug 23 21:15:48 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
lug 23 21:15:48 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=48857, emitted seq=48859
lug 23 21:15:48 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 2259 thread gnome-shel:cs0 pid 2277
lug 23 21:15:53 desktop kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
lug 23 21:15:53 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
lug 23 21:15:53 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
lug 23 21:15:54 desktop kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
lug 23 21:15:54 desktop kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
lug 23 21:15:54 desktop kernel: [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
lug 23 21:15:59 desktop kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
lug 23 21:16:11 desktop google-chrome.desktop[3699]: [3693:3693:0723/211611.917978:ERROR:gpu_process_host.cc(975)] GPU process exited unexpectedly: exit_code=512
lug 23 21:16:15 desktop kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
lug 23 21:16:18 desktop kernel: [drm:psp_load...
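The negative numbers in these messages (-110, -62, -22, -125) look like negated Linux errno values, which is the usual kernel convention. A minimal sketch that decodes them; the table is hard-coded from the Linux errno numbering so it behaves the same on any platform, the function name decode_error is hypothetical, and the pattern only covers the message shapes quoted in this report:

```python
import re

# Linux errno numbers and their standard descriptions. Only the codes
# that appear in the logs quoted in this report are listed.
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    62: ("ETIME", "Timer expired"),
    110: ("ETIMEDOUT", "Connection timed out"),
    125: ("ECANCELED", "Operation canceled"),
}

# Matches "failed (-110)", "failed -62" and "parser -125!".
ERROR_CODE = re.compile(r"(?:failed|parser) \(?-(\d+)")


def decode_error(line: str):
    """Return (code, name, description) for a log line, or None if no code."""
    match = ERROR_CODE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, text = LINUX_ERRNO.get(code, ("unknown", "not in this table"))
    return code, name, text
```

Read this way, the ring tests and the smu suspend are timing out (ETIMEDOUT, ETIME) and the later command submission is being cancelled (ECANCELED), which fits a GPU that has stopped responding rather than a userspace bug.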


Daniel van Vugt (vanvugt) wrote :

You will need a newer kernel to fix this. Please try:

  https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18.14/amd64/

or

  https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/2022-07-17/amd64/

To test a kernel, download all the debs into the same folder and then run:

   sudo dpkg -i *.deb
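After rebooting into the test kernel, it is worth confirming which kernel is actually running before retesting suspend. A minimal sketch that compares a kernel release string against 5.18.14, the version suggested above; the names KNOWN_GOOD, parse_release and at_least_known_good are hypothetical, and the threshold is only an assumption, since nothing here establishes which exact version first contains the fix:

```python
import re

# Assumed threshold: the mainline build suggested in this comment.
KNOWN_GOOD = (5, 18, 14)


def parse_release(release: str) -> tuple:
    """Turn '5.15.0-27-generic' into (5, 15, 0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def at_least_known_good(release: str) -> bool:
    """True if the release is at or above the assumed threshold."""
    return parse_release(release) >= KNOWN_GOOD
```

Calling at_least_known_good(platform.release()) after import platform checks the running kernel. A False result does not prove the bug is present, because a distribution kernel may carry backported fixes under an older version number.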

Mechano (mr-mechano) wrote :

Is there any hope of seeing the amdgpu driver fix backported to the standard Ubuntu 22.04 kernel?
I ask because the official kernels are signed.

With any other kernel it is mandatory to disable UEFI Secure Boot.

And on a dual-boot system with Windows 10/11 I have to keep changing the Secure Boot setting in the BIOS.

Anyway, I'm now testing 5.18.14-051814-generic and it seems to work. But I needed to change Secure Boot to Other OS.

Yes, I tried to sign the kernel but it didn't work.

summary: [amdgpu] System freezes coming out from suspension or randomly from
- screen power save
+ screen power save (5.15.0 fails but 5.18.14 works)
jason (jsn1987) wrote :

Has this been backported to 5.15 in the meantime? I think I have the same issue using 22.04 with kernel 5.15.0-56-generic also with a rx6000 (using kisak mesa 22.3.0). I'm using 1080p @ 144hz.

After entering sleep mode, my monitor wouldn't turn back on, although the PC itself seemed to be running. I can also reproduce this issue using xrandr:

xrandr --output DisplayPort-0 --primary --mode 1920x1080 --scale-from 1280x960 --rate 144 --pos 0x0

I used to use this command for a game which does not support changing resolution using vulkan. It worked perfectly fine with my old install of ubuntu 20.04, same gpu and driver.

Mechano (mr-mechano) wrote :

Tired of waiting for a backport, I upgraded to 22.10 and the issue is gone now.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released