[amdgpu][psr] Screen flickering/ tearing on 6.1/6.2/6.3 kernel

Bug #2009952 reported by Roemer Claasen
This bug affects 5 people
Affects                  Status       Importance  Assigned to  Milestone
linux (Ubuntu)           In Progress  Undecided   Unassigned
linux-oem-6.1 (Ubuntu)   Invalid      Undecided   Unassigned
  Jammy                  Won't Fix    Undecided   Unassigned

Bug Description

After upgrading from kernel 5.19.0-35-generic to 6.1.0-1007-oem there is occasional screen flicker/ tear. It happens around every minute; it seems connected to window/ pointer movement, but I have no clear way of reproducing. Disruption is minor, but annoying.

I'm running 22.04 LTS on a new Thinkpad T14s with AMD Ryzen 6850u. The system is fully functional. No crashes, nothing breaks.

Background on the use of 6.1.0-1007-oem: 5.19.0-35-generic breaks suspend/ resume on my laptop, see https://bugs.launchpad.net/ubuntu/+source/linux-hwe-5.19/+bug/2007718.

In a way, running 6.1 is a 'flight forward', since 5.15.0-67-generic (previous 22.04.1 LTS kernel) works well, with working suspend/ resume, and without screen flicker. But that's an 'old' kernel now, missing some nice improvements for the newer Ryzen APUs.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-oem-22.04c 6.1.0.1007.7
ProcVersionSignature: Ubuntu 6.1.0-1007.7-oem 6.1.6
Uname: Linux 6.1.0-1007-oem x86_64
ApportVersion: 2.20.11-0ubuntu82.3
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Fri Mar 10 13:09:20 2023
InstallationDate: Installed on 2023-02-06 (31 days ago)
InstallationMedia: Ubuntu 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809.1)
SourcePackage: linux-meta-oem-6.1
UpgradeStatus: No upgrade log present (probably fresh install)

Daniel van Vugt (vanvugt) wrote :

I'm not familiar with toggling PSR on amdgpu systems but does it let you write 0 to '/sys/kernel/debug/dri/0/eDP-1/psr_state' ?

tags: added: flicker
Roemer Claasen (rclaasen) wrote :

Daniel, thanks for the suggestion!

This is quite possibly a duplicate of https://gitlab.freedesktop.org/drm/amd/-/issues/2352.

I'm evaluating the suggestion to add "amdgpu.dcdebugmask=0x10" to the kernel boot parameters. That parameter seems to disable PSR, see below.

➜ ~ sudo cat /sys/kernel/debug/dri/0/eDP-1/psr_state
0
➜ ~ sudo cat /sys/kernel/debug/dri/0/eDP-1/psr_capability
Sink support: yes [0x03]
Driver support: no [0xffffffff]
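
For repeated checks, the "Driver support" value can be pulled out of that output with a small helper. This is only a sketch: `psr_driver_support` is a hypothetical name, and it assumes the `psr_capability` text keeps the two-line format shown above, which may differ between kernel versions.

```shell
# psr_driver_support (hypothetical helper)
# Reads psr_capability text on stdin and prints the yes/no value
# found on the "Driver support:" line.
psr_driver_support() {
    awk -F': *' '/^Driver support:/ { split($2, f, " "); print f[1] }'
}
```

Usage would be something like `sudo cat /sys/kernel/debug/dri/0/eDP-1/psr_capability | psr_driver_support`, which should print `no` while the PSR-disabling parameter is active.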

Kernel package: Ubuntu linux-oem-22.04c, current version 6.1.0-1008-oem.

I'll report back in a few days to see if this indeed fixes the flickering.

Roemer Claasen (rclaasen) wrote :

Adding the kernel boot parameter fixes the flickering screen.

Short HOWTO:

 - open `/etc/default/grub`
 - add "amdgpu.dcdebugmask=0x10" as kernel boot parameter to `GRUB_CMDLINE_LINUX_DEFAULT` (on my system this now reads "quiet splash amdgpu.dcdebugmask=0x10")
 - call `sudo update-grub`
 - reboot
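
The edit in the steps above could also be scripted. The following is a minimal sketch, not a tested tool: `add_kernel_param` is a hypothetical helper, it assumes GNU `sed` and that the `GRUB_CMDLINE_LINUX_DEFAULT` value is wrapped in double quotes, and it takes the file path as an argument so it can be tried on a copy first.

```shell
# add_kernel_param FILE PARAM (hypothetical helper)
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT line in FILE,
# unless that line already contains it.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit if the parameter is already present on the line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}
```

For example, `cp /etc/default/grub /tmp/grub.test && add_kernel_param /tmp/grub.test amdgpu.dcdebugmask=0x10` lets you inspect the result before touching the real file (which needs root). `sudo update-grub` and a reboot are still required afterwards.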

Roemer Claasen (rclaasen) wrote :

@Daniel: so yes, this was indeed PSR. Disabling it fixes this issue. Thanks for the support!
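
For anyone who already boots with a different `amdgpu.dcdebugmask` value: as the name suggests, the parameter is a bitmask, and as far as I know 0x10 is the bit that disables PSR. Under that assumption, the sketch below checks whether a mask has the bit set and shows how to add it to an existing value. Both function names are hypothetical.

```shell
# psr_disabled_by_mask MASK (hypothetical helper)
# Succeeds if bit 0x10 is set in MASK (decimal or 0x-prefixed hex).
psr_disabled_by_mask() {
    [ $(( $1 & 0x10 )) -ne 0 ]
}

# with_psr_disabled MASK (hypothetical helper)
# Prints MASK with bit 0x10 added, in hex, ready for the kernel command line.
with_psr_disabled() {
    printf '0x%x\n' "$(( $1 | 0x10 ))"
}
```

So a system already using, say, `amdgpu.dcdebugmask=0x2` would presumably want the value printed by `with_psr_disabled 0x2` rather than replacing the mask outright.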

summary: - Screen flickering/ tearing on 6.1 kernel
+ [amdgpu][psr] Screen flickering/ tearing on 6.1 kernel
tags: added: psr
Timo Aaltonen (tjaalton) wrote : Re: [amdgpu][psr] Screen flickering/ tearing on 6.1 kernel

do you see this with a mainline 6.2 or 6.3rc build?

https://kernel.ubuntu.com/~kernel-ppa/mainline/

no longer affects: linux-meta-oem-6.1 (Ubuntu)
Changed in linux-oem-6.1 (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Jammy):
status: New → Invalid
Changed in linux (Ubuntu):
status: New → Incomplete
Timo Aaltonen (tjaalton) wrote :

ah sorry, missed that this was already filed upstream and still unresolved.. so the mainline builds probably won't help you

Roemer Claasen (rclaasen) wrote :

@Timo: I tried 6.1.21 recently, and an older 6.2 mainline kernel (6.2.2?) from the kernel-ppa. The screen flicker/ tearing was still present on both.

As a whole the system appeared somewhat unstable (occasional system halt), but I did not investigate, just went back to linux-oem-6.1.

Should it be fixed in the mainline and on linux-oem-6.1 without the kernel parameter? I can retry if that would help?!

Daniel van Vugt (vanvugt) wrote :

We probably don't need the confusing Invalid status on linux-oem-6.1 when that only exists for jammy.

Mario Limonciello (superm1) wrote :

I do think a double check against 6.3-rc4 would be worthwhile with no parameters set.

The upstream issue is a bit confusing because it has both PSR and S/G issues conflated.

Roemer Claasen (rclaasen) wrote :

I've tried 6.3rc5 (6.3.0-060300rc5-generic) without the parameter, and the flickering is still there at times. It appears to be a lot less than on 6.1/6.2 (without parameter set), though that's a very subjective observation.

@Mario: Is there anything I can do to analyze further?

Mario Limonciello (superm1) wrote :

There are potentially two bugs at play here. One with scatter/gather and one for PSR. They both manifest really similarly, so let's try with both disabled.

There is a parameter that is introduced in 6.3 for amdgpu.sg_display. Can you please explicitly set that to "0" along with the PSR disable parameter and see if everything is cleared up?

If it is, please allow PSR again and see if SG disabled alone is enough to fix it.
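
Because `amdgpu.sg_display` only exists from 6.3 onwards, it may help to confirm that the running driver actually exposes it before drawing conclusions from a test. A rough sketch, assuming the parameter is readable under `/sys/module/amdgpu/parameters` (the usual sysfs location for module parameters); `amdgpu_param` is a hypothetical helper name.

```shell
# amdgpu_param NAME [DIR] (hypothetical helper)
# Prints the current value of amdgpu module parameter NAME, or fails if the
# running driver does not expose it. DIR defaults to the usual sysfs path.
amdgpu_param() {
    name=$1
    dir=${2:-/sys/module/amdgpu/parameters}
    if [ -r "$dir/$name" ]; then
        cat "$dir/$name"
    else
        echo "amdgpu has no readable parameter '$name'" >&2
        return 1
    fi
}
```

`amdgpu_param sg_display` and `amdgpu_param dcdebugmask` would then show what the driver really picked up from the command line. The values are probably printed in decimal, so 0x10 would appear as 16.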

Roemer Claasen (rclaasen) wrote :

Hi Mario, thanks for your support!

I will test the following kernel parameter settings and report back:

Test 1: amdgpu.dcdebugmask=0x10 amdgpu.sg_display=0 (running now)
Test 2: amdgpu.sg_display=0
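
To make sure each test boot really carries the intended parameters, something like the following could be run after rebooting. It is a sketch only; `cmdline_has` is a hypothetical helper that does an exact whole-word match against `/proc/cmdline`, or against a string passed as the second argument.

```shell
# cmdline_has PARAM [CMDLINE] (hypothetical helper)
# Succeeds if PARAM appears as a whole word in CMDLINE
# (default: the contents of /proc/cmdline).
cmdline_has() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For Test 1, both `cmdline_has amdgpu.dcdebugmask=0x10` and `cmdline_has amdgpu.sg_display=0` should succeed; for Test 2 only the second one should. Because the match is exact, a misspelled parameter name shows up as a failure.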

Cheers,

Roemer

Roemer Claasen (rclaasen) wrote :

Test 1: amdgpu.dcdebugmask=0x10 amdgpu.sg_display=0
All well, no observations during a full workday.

Test 2: amdgpu.sg_display=0
Flicker/tear is back, and a full screen freeze within 10 minutes.

About the screen freeze: contrary to yesterday the laptop was now connected to a second screen. I was using both the internal laptop screen (PSR capable, theoretically), and an older external HDMI monitor (non-PSR capable presumably, if that even matters).

The laptop screen froze, while the external monitor was still functioning (the mouse pointer was fixed mid-screen on the laptop, while I could move it on the external monitor).

kern.log (grep amdgpu):

Apr 6 09:22:59 rct14s kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.3.0-060300rc5-generic root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash amdgpu.sg_display=0 vt.handoff=7
Apr 6 09:22:59 rct14s kernel: [ 0.004000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.3.0-060300rc5-generic root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash amdgpu.sg_display=0 vt.handoff=7
Apr 6 09:22:59 rct14s kernel: [ 3.971093] [drm] amdgpu kernel modesetting enabled.
Apr 6 09:22:59 rct14s kernel: [ 3.980870] amdgpu: Virtual CRAT table created for CPU
Apr 6 09:22:59 rct14s kernel: [ 3.980887] amdgpu: Topology: Add CPU node
Apr 6 09:22:59 rct14s kernel: [ 3.981020] amdgpu 0000:33:00.0: enabling device (0006 -> 0007)
Apr 6 09:22:59 rct14s kernel: [ 3.982752] amdgpu 0000:33:00.0: amdgpu: Fetched VBIOS from VFCT
Apr 6 09:22:59 rct14s kernel: [ 3.982755] amdgpu: ATOM BIOS: 113-REMBRANDT-X37
Apr 6 09:22:59 rct14s kernel: [ 3.987559] amdgpu 0000:33:00.0: vgaarb: deactivate vga console
Apr 6 09:22:59 rct14s kernel: [ 3.987562] amdgpu 0000:33:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
Apr 6 09:22:59 rct14s kernel: [ 3.987566] amdgpu 0000:33:00.0: amdgpu: PCIE atomic ops is not supported
Apr 6 09:22:59 rct14s kernel: [ 3.987613] amdgpu 0000:33:00.0: amdgpu: VRAM: 1024M 0x000000F400000000 - 0x000000F43FFFFFFF (1024M used)
Apr 6 09:22:59 rct14s kernel: [ 3.987616] amdgpu 0000:33:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
Apr 6 09:22:59 rct14s kernel: [ 3.987617] amdgpu 0000:33:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
Apr 6 09:22:59 rct14s kernel: [ 3.988145] [drm] amdgpu: 1024M of VRAM memory ready
Apr 6 09:22:59 rct14s kernel: [ 3.988148] [drm] amdgpu: 15421M of GTT memory ready.
Apr 6 09:22:59 rct14s kernel: [ 3.989361] amdgpu 0000:33:00.0: amdgpu: Will use PSP to load VCN firmware
Apr 6 09:22:59 rct14s kernel: [ 4.158660] amdgpu 0000:33:00.0: amdgpu: RAS: optional ras ta ucode is not available
Apr 6 09:22:59 rct14s kernel: [ 4.169798] amdgpu 0000:33:00.0: amdgpu: RAP: optional rap ta ucode is not available
Apr 6 09:22:59 rct14s kernel: [ 4.169805] amdgpu 0000:33:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Apr 6 09:22:59 rct14s kernel: [ 4.171302] amdgpu 0000:33:00.0: amdgpu: SMU is initialized successfully!
Apr 6 09:22:59 rct14s kernel: [ 4.194388] snd_hda_intel 0000:33:00.1: bound ...

Roemer Claasen (rclaasen) wrote :

Test 2, no external monitor connected

Another system halt with the following error:

Apr 6 12:51:59 rct14s kernel: [11686.738468] amdgpu 0000:33:00.0: [drm] *ERROR* [CRTC:72:crtc-0] flip_done timed out
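
Lines like the one above can be filtered out of a long kern.log so that only the amdgpu errors and warnings remain. A minimal sketch; `amdgpu_problems` is a hypothetical helper, and it assumes the `*ERROR*` and `WARNING:` markers seen in the logs in this report.

```shell
# amdgpu_problems LOGFILE (hypothetical helper)
# Prints the lines of LOGFILE that mention amdgpu and carry
# an *ERROR* or WARNING: marker.
amdgpu_problems() {
    grep 'amdgpu' "$1" | grep -E '\*ERROR\*|WARNING:'
}
```

For example, `amdgpu_problems /var/log/kern.log` keeps the flip_done timeouts and drops the routine driver initialisation messages.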

I am now re-enabling "amdgpu.dcdebugmask=0x10" since that seemed to be running quite well.

kern.log | grep amdgpu:

Apr 6 09:37:17 rct14s kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.3.0-060300rc5-generic root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash amdgpu.sg_display=0 vt.handoff=7
Apr 6 09:37:17 rct14s kernel: [ 0.004000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.3.0-060300rc5-generic root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash amdgpu.sg_display=0 vt.handoff=7
Apr 6 09:37:17 rct14s kernel: [ 5.089998] [drm] amdgpu kernel modesetting enabled.
Apr 6 09:37:17 rct14s kernel: [ 5.105793] amdgpu: Virtual CRAT table created for CPU
Apr 6 09:37:17 rct14s kernel: [ 5.105810] amdgpu: Topology: Add CPU node
Apr 6 09:37:17 rct14s kernel: [ 5.105975] amdgpu 0000:33:00.0: enabling device (0006 -> 0007)
Apr 6 09:37:17 rct14s kernel: [ 5.108157] amdgpu 0000:33:00.0: amdgpu: Fetched VBIOS from VFCT
Apr 6 09:37:17 rct14s kernel: [ 5.108161] amdgpu: ATOM BIOS: 113-REMBRANDT-X37
Apr 6 09:37:17 rct14s kernel: [ 5.112655] amdgpu 0000:33:00.0: vgaarb: deactivate vga console
Apr 6 09:37:17 rct14s kernel: [ 5.112658] amdgpu 0000:33:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
Apr 6 09:37:17 rct14s kernel: [ 5.112662] amdgpu 0000:33:00.0: amdgpu: PCIE atomic ops is not supported
Apr 6 09:37:17 rct14s kernel: [ 5.112720] amdgpu 0000:33:00.0: amdgpu: VRAM: 1024M 0x000000F400000000 - 0x000000F43FFFFFFF (1024M used)
Apr 6 09:37:17 rct14s kernel: [ 5.112723] amdgpu 0000:33:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
Apr 6 09:37:17 rct14s kernel: [ 5.112725] amdgpu 0000:33:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
Apr 6 09:37:17 rct14s kernel: [ 5.113338] [drm] amdgpu: 1024M of VRAM memory ready
Apr 6 09:37:17 rct14s kernel: [ 5.113343] [drm] amdgpu: 15421M of GTT memory ready.
Apr 6 09:37:17 rct14s kernel: [ 5.115154] amdgpu 0000:33:00.0: amdgpu: Will use PSP to load VCN firmware
Apr 6 09:37:17 rct14s kernel: [ 5.284417] amdgpu 0000:33:00.0: amdgpu: RAS: optional ras ta ucode is not available
Apr 6 09:37:17 rct14s kernel: [ 5.296530] amdgpu 0000:33:00.0: amdgpu: RAP: optional rap ta ucode is not available
Apr 6 09:37:17 rct14s kernel: [ 5.296533] amdgpu 0000:33:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Apr 6 09:37:17 rct14s kernel: [ 5.298687] amdgpu 0000:33:00.0: amdgpu: SMU is initialized successfully!
Apr 6 09:37:17 rct14s kernel: [ 5.326714] snd_hda_intel 0000:33:00.1: bound 0000:33:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Apr 6 09:37:17 rct14s kernel: [ 5.441665] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Apr 6 09:37:17 rct14s kernel: [ 5.441734] amdgpu: sdma_bitmap: 3
Apr 6 09:37:18 rct14s kernel: [ 5.481933] amdgpu: HMM registered 1024MB device memory
Apr 6 09:37:18 rct...

Roemer Claasen (rclaasen) wrote :

Short, massive flickering while running with parameters "amdgpu.dcdebugmask=0x10 amdgpu.sg_display=0", see video: https://www.youtube.com/watch?v=7WPtMy1nrT8

I suspect this is a separate issue, as I have not seen flickering on this scale before; it used to be a very quick, hardly noticeable tear. It appeared to affect only the application with focus: first PyCharm, then Chrome.

At the same time I now notice applications being replicated across workspaces, see pictures. Seems related to the massive flickering in the video (suspicion only), not to the AMD PSR kernel parameter.

Mario Limonciello (superm1) wrote :

If it's trending to be caused by PSR (which it sounds like it is), try to revert this patch:
https://github.com/torvalds/linux/commit/751281c55579f0cb0e56c9797d4663f689909681

> Short, massive flickering while running with parameters "amdgpu.dcdebugmask=0x10 amdgpu.sg_display=0",

Have a try with this patch: https://gitlab.freedesktop.org/drm/amd/uploads/ebd02a1dc605110a3f28b9c4eb62c313/0001-drm-amd-display-fix-flickering-caused-by-S-G-mode.patch (and drop amdgpu.sg_display=0)

summary: - [amdgpu][psr] Screen flickering/ tearing on 6.1 kernel
+ [amdgpu][psr] Screen flickering/ tearing on 6.1/6.2/6.3 kernel
Mario Limonciello (superm1) wrote :

OK, here's the patch being submitted upstream that should help this if it's S/G caused:

https://patchwork.freedesktop.org/patch/532323/

no longer affects: linux (Ubuntu Jammy)
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: New → Confirmed
Mario Limonciello (superm1) wrote :

This commit from 6.4-rc1 fixes the S/G mode issue.

https://github.com/torvalds/linux/commit/08da182175db4c7f80850354849d95f2670e8cd9

It's CC'd to stable.

Roemer Claasen (rclaasen) wrote :

Testing 6.4-rc2 now, it seems the S/G mode issue is indeed resolved. It's been 2 days only, but the new kernel seems really stable.

I'm now running without the "amdgpu.sg_display=0" kernel param.

It seems the tearing issue is still there though, the "amdgpu.dcdebugmask=0x10" is still needed.

UPDATE: the 6.4 release candidates seem to work really well for me. Running with the following cmdline:

BOOT_IMAGE=/boot/vmlinuz-6.4.0-060400rc3-generic root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash amdgpu.dcdebugmask=0x10 vt.handoff=7

Roemer Claasen (rclaasen) wrote :

UPDATE 2 after testing 6.4rc5.

Still a PSR-related problem when running without "amdgpu.dcdebugmask=0x10" (see kern.log below), apart from that very happy with 6.4-rc5. The flicker/ tearing issue seems to be solved (yeah!).

Without the kernel parameter I do get the following error though, apparently related to PSR:

Jun 7 14:46:12 rct14s kernel: [ 1057.362873] WARNING: CPU: 3 PID: 226 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_psr.c:226 dmub_psr_enable+0x115/0x120 [amdgpu]
Jun 7 14:46:12 rct14s kernel: [ 1057.363546] Modules linked in: ccm michael_mic xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink rfcomm snd_seq_dummy snd_hrtimer nvme_fabrics cmac algif_hash algif_skcipher af_alg bnep qrtr_mhi amdgpu snd_soc_dmic snd_acp6x_pdm_dma snd_soc_acp6x_mach snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp joydev snd_sof intel_rapl_msr intel_rapl_common snd_sof_utils edac_mce_amd snd_soc_core iommu_v2 drm_buddy snd_compress ac97_bus snd_ctl_led gpu_sched snd_pcm_dmaengine qrtr kvm_amd drm_suballoc_helper snd_hda_codec_realtek drm_ttm_helper snd_pci_ps ath11k_pci snd_hda_codec_generic binfmt_misc snd_hda_codec_hdmi ttm snd_rpl_pci_acp6x thinkpad_acpi snd_acp_pci kvm ath11k uvcvideo drm_display_helper nvram snd_hda_intel irqbypass videobuf2_vmalloc ledtrig_audio cec qmi_helpers snd_intel_dspcfg platform_profile crct10dif_pclmul uvc snd_pci_acp6x polyval_clmulni
Jun 7 14:46:12 rct14s kernel: [ 1057.363616] snd_intel_sdw_acpi rc_core videobuf2_memops polyval_generic snd_hda_codec snd_pci_acp5x snd_seq_midi ghash_clmulni_intel input_leds videobuf2_v4l2 drm_kms_helper sha512_ssse3 snd_seq_midi_event mac80211 snd_rn_pci_acp3x aesni_intel i2c_algo_bit snd_hda_core videodev crypto_simd snd_rawmidi snd_acp_config syscopyarea snd_hwdep cryptd sysfillrect videobuf2_common snd_soc_acpi snd_pcm sysimgblt nls_iso8859_1 ccp snd_pci_acp3x rapl mc serio_raw snd_seq hid_multitouch cfg80211 btusb snd_seq_device think_lmi firmware_attributes_class wmi_bmof btrtl snd_timer btbcm libarc4 btintel snd ucsi_acpi btmtk mhi typec_ucsi k10temp typec soundcore mac_hid amd_pmc acpi_tad bluetooth ecdh_generic ecc sch_fq_codel overlay iptable_filter ip6table_filter ip6_tables br_netfilter bridge stp llc arp_tables msr parport_pc ppdev lp parport drm ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 nvme nvme_core video hid_generic crc32_pclmul psmouse thunderbolt xhci_pci i2c_piix4 xhci_pci_renesas
Jun 7 14:46:12 rct14s kernel: [ 1057.363702] nvme_common i2c_hid_acpi i2c_hid wmi hid
Jun 7 14:46:12 rct14s kernel: [ 1057.363708] CPU: 3 PID: 226 Comm: kworker/3:1H Not tainted 6.4.0-060400rc5-generic #202306041930
Jun 7 14:46:12 rct14s kernel: [ 1057.363712] Hardware name: LENOVO 21CQCTO1WW/21CQCTO1WW, BIOS R22ET60W (1.30 ) 02/09/2023
Jun 7 14:46:12 rct14s kernel: [ 1057.363714] Workqueue: events_highpri dm_irq_work_func [amdgpu]
Jun 7 14:46:12 rct14s kernel: [ 1057.364201] RIP: 0010:dmub_psr_enable+0x115/0x120 [amdgpu]
Jun 7 14:46:12 rct14s kernel: [ 1057.364777] Code: 45 d0 65...

Mario Limonciello (superm1) wrote :

That's good news. Just today there was a new patch posted that might help the remaining PSR issue. Can you please apply this on top of 6.4-rc5?

https://patchwork.freedesktop.org/patch/541535/

Roemer Claasen (rclaasen) wrote :

Hi Mario,

I have to apologize, I was a little bit too fast sending that update. Immediately afterwards I had the screen flickering again, with some blackouts as well. I was able to make a video, see https://www.youtube.com/watch?v=yjL9OuXhdeg. No errors in the log though.

I will have to see if I can compile with the patch, not sure yet if I can find the time.

Thank you for your support!

Roemer

Mario Limonciello (superm1) wrote :

OK let's look for results with that patch when you have time.

Roemer Claasen (rclaasen) wrote :

Hi Mario,

I've recompiled kernel 6.4-rc5 with the patch applied, and get the same error (see below). No screen tearing/ flicker - yet. I'll report back after a day of work to see if the patch had any effect.

Thanks again for your support, best,

Roemer

$ cat /proc/cmdline:

BOOT_IMAGE=/boot/vmlinuz-6.4.0-rc5patch_541535-dirty root=UUID=41d8f993-282a-48a5-b355-6f537a3a17ab ro quiet splash vt.handoff=7

$ cat /var/log/kern.log:

Jun 7 21:30:28 rct14s kernel: [ 1211.044086] WARNING: CPU: 3 PID: 7362 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_psr.c:226 dmub_psr_enable+0x10b/0x120 [amdgpu]
Jun 7 21:30:28 rct14s kernel: [ 1211.044741] Modules linked in: tls ccm michael_mic xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink rfcomm snd_seq_dummy snd_hrtimer nvme_fabrics cmac algif_hash algif_skcipher af_alg bnep qrtr_mhi amdgpu qrtr ath11k_pci snd_soc_dmic snd_acp6x_pdm_dma snd_soc_acp6x_mach snd_sof_amd_rembrandt snd_sof_amd_renoir joydev ath11k snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof intel_rapl_msr intel_rapl_common btusb btrtl edac_mce_amd snd_ctl_led btbcm iommu_v2 snd_hda_codec_realtek qmi_helpers btintel drm_buddy gpu_sched btmtk kvm_amd snd_hda_codec_generic snd_sof_utils mac80211 snd_hda_codec_hdmi drm_suballoc_helper bluetooth uvcvideo drm_ttm_helper kvm snd_hda_intel snd_soc_core ttm binfmt_misc snd_intel_dspcfg videobuf2_vmalloc snd_intel_sdw_acpi uvc snd_hda_codec videobuf2_memops drm_display_helper videobuf2_v4l2 snd_compress videodev snd_hda_core ac97_bus irqbypass cec crct10dif_pclmul
Jun 7 21:30:28 rct14s kernel: [ 1211.044873] snd_pcm_dmaengine polyval_clmulni snd_hwdep snd_pci_ps rc_core polyval_generic ghash_clmulni_intel snd_rpl_pci_acp6x sha512_ssse3 snd_seq_midi thinkpad_acpi snd_acp_pci aesni_intel snd_seq_midi_event crypto_simd drm_kms_helper snd_rawmidi videobuf2_common nvram ecdh_generic snd_pci_acp6x input_leds cryptd cfg80211 snd_pcm ecc think_lmi i2c_algo_bit ledtrig_audio nls_iso8859_1 rapl mc serio_raw snd_seq hid_multitouch firmware_attributes_class platform_profile wmi_bmof snd_pci_acp5x syscopyarea ucsi_acpi snd_seq_device snd_rn_pci_acp3x sysfillrect typec_ucsi ccp k10temp snd_timer snd_acp_config libarc4 sysimgblt snd_soc_acpi mhi typec snd snd_pci_acp3x sch_fq_codel soundcore mac_hid amd_pmc acpi_tad overlay iptable_filter ip6table_filter ip6_tables br_netfilter bridge stp llc arp_tables drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 nvme hid_generic thunderbolt psmouse crc32_pclmul nvme_core xhci_pci i2c_hid_acpi i2c_piix4
Jun 7 21:30:28 rct14s kernel: [ 1211.045018] xhci_pci_renesas i2c_hid nvme_common video hid wmi
Jun 7 21:30:28 rct14s kernel: [ 1211.045030] CPU: 3 PID: 7362 Comm: kworker/3:2H Not tainted 6.4.0-rc5patch_541535-dirty #1
Jun 7 21:30:28 rct14s kernel: [ 1211.045035] Hardware name: LENOVO 21CQCTO1WW/21CQCTO1WW, BIOS R22ET60W (1.30 ) 02/09/2023
Jun 7 21:30:28 rct14s kernel: [ 1211.045039] Workqueue: events_highpri dm_irq_work_func [amdgpu]
Jun 7 21:30:28 rct14s kernel: [ 1211.045...

Roemer Claasen (rclaasen) wrote :

No luck running with the patch and without "amdgpu.dcdebugmask=0x10", sadly.

There is still occasional screen flickering, and sometimes (once every 15 minutes or so) a blank screen.

Then after a few hours of use, the display froze completely, and I could only reclaim control through 1) suspend and 2) TTY1. See kern.log below.

$ cat /var/log/kern.log:

Jun 7 23:32:11 rct14s kernel: [ 8513.360433] amdgpu 0000:33:00.0: [drm] *ERROR* [CRTC:72:crtc-0] flip_done timed out
Jun 7 23:32:56 rct14s kernel: [ 8558.928374] amdgpu 0000:33:00.0: [drm] *ERROR* flip_done timed out
Jun 7 23:32:56 rct14s kernel: [ 8558.928380] amdgpu 0000:33:00.0: [drm] *ERROR* [CRTC:72:crtc-0] commit wait timed out
Jun 7 23:33:07 rct14s kernel: [ 8569.168721] amdgpu 0000:33:00.0: [drm] *ERROR* flip_done timed out
Jun 7 23:33:07 rct14s kernel: [ 8569.168728] amdgpu 0000:33:00.0: [drm] *ERROR* [PLANE:55:plane-3] commit wait timed out
Jun 7 23:33:07 rct14s kernel: [ 8569.168816] ------------[ cut here ]------------
Jun 7 23:33:07 rct14s kernel: [ 8569.168817] WARNING: CPU: 4 PID: 1109 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8145 amdgpu_dm_atomic_commit_tail+0x3b30/0x41f0 [amdgpu]
Jun 7 23:33:07 rct14s kernel: [ 8569.169357] Modules linked in: tls ccm michael_mic xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink rfcomm snd_seq_dummy snd_hrtimer nvme_fabrics cmac algif_hash algif_skcipher af_alg bnep qrtr_mhi amdgpu qrtr ath11k_pci snd_soc_dmic snd_acp6x_pdm_dma snd_soc_acp6x_mach snd_sof_amd_rembrandt snd_sof_amd_renoir joydev ath11k snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof intel_rapl_msr intel_rapl_common btusb btrtl edac_mce_amd snd_ctl_led btbcm iommu_v2 snd_hda_codec_realtek qmi_helpers btintel drm_buddy gpu_sched btmtk kvm_amd snd_hda_codec_generic snd_sof_utils mac80211 snd_hda_codec_hdmi drm_suballoc_helper bluetooth uvcvideo drm_ttm_helper kvm snd_hda_intel snd_soc_core ttm binfmt_misc snd_intel_dspcfg videobuf2_vmalloc snd_intel_sdw_acpi uvc snd_hda_codec videobuf2_memops drm_display_helper videobuf2_v4l2 snd_compress videodev snd_hda_core ac97_bus irqbypass cec crct10dif_pclmul
Jun 7 23:33:07 rct14s kernel: [ 8569.169428] snd_pcm_dmaengine polyval_clmulni snd_hwdep snd_pci_ps rc_core polyval_generic ghash_clmulni_intel snd_rpl_pci_acp6x sha512_ssse3 snd_seq_midi thinkpad_acpi snd_acp_pci aesni_intel snd_seq_midi_event crypto_simd drm_kms_helper snd_rawmidi videobuf2_common nvram ecdh_generic snd_pci_acp6x input_leds cryptd cfg80211 snd_pcm ecc think_lmi i2c_algo_bit ledtrig_audio nls_iso8859_1 rapl mc serio_raw snd_seq hid_multitouch firmware_attributes_class platform_profile wmi_bmof snd_pci_acp5x syscopyarea ucsi_acpi snd_seq_device snd_rn_pci_acp3x sysfillrect typec_ucsi ccp k10temp snd_timer snd_acp_config libarc4 sysimgblt snd_soc_acpi mhi typec snd snd_pci_acp3x sch_fq_codel soundcore mac_hid amd_pmc acpi_tad overlay iptable_filter ip6table_filter ip6_tables br_netfilter bridge stp llc arp_tables drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tabl...

Gert Karpelin (gkarpelin) wrote :

I have a similar problem with Xubuntu and kernel version linux-image-6.5.0-9-generic. Adding "amdgpu.dcdebugmask=0x10" to /etc/default/grub fixes the flickering screen.

OS: Xubuntu 23.10 x86_64
21F8003HMX ThinkPad T14s Gen 4
CPU: AMD Ryzen 7 PRO 7840U w/ Radeon

Mario Limonciello (superm1) wrote :

Can you please share your kernel log from a boot that had the failure, and without that parameter in place share the output of this script?
https://gitlab.freedesktop.org/drm/amd/-/blob/master/scripts/psr.py

Gert Karpelin (gkarpelin) wrote :

DRI device 0 DMCUB F/W version: 0x08002300
○ PSR 2 with Y coordinates (eDP 1.4a) [3]
○ Sink OUI: Parade
○ resv_40f: 01
○ ID String: 08-03
○ PSR Status: 00-00-02

Gert Karpelin (gkarpelin) wrote :

kernel log from boot

Mario Limonciello (superm1) wrote :

Sorry for the delay in looking at this. I might not have been clear. That script needs to be run after the error has occurred. I don't see the error in this kernel log.

Philip (phillau) wrote :

Hi,
I have had the same problem for the whole of last week, across different distros.
I've read that the kernel parameter solves it?
Maybe I'll try Ubuntu again tomorrow with this kernel parameter.
I was using Arch and had the black screen problem. Maybe this kernel parameter fixes the problems on Arch as well.

Philip (phillau) wrote :

I added the parameter now and will later tell how it is going.

For now it feels buttery smooth!

I couldn‘t use „sudo update-grub“.
I needed to use „sudo grub-mkconfig -o /boot/grub/grub.cfg“ to update grub.
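
Since `update-grub` is a Debian/Ubuntu convenience wrapper and is missing on some other distros, the regeneration step could pick whichever command is available. A small sketch, assuming the config lives at /boot/grub/grub.cfg as in the command above; `grub_regen_command` is a hypothetical helper name.

```shell
# grub_regen_command (hypothetical helper)
# Prints the command that regenerates the GRUB configuration:
# update-grub where it exists, otherwise a direct grub-mkconfig call.
grub_regen_command() {
    if command -v update-grub >/dev/null 2>&1; then
        echo "update-grub"
    else
        echo "grub-mkconfig -o /boot/grub/grub.cfg"
    fi
}
```

Running `sudo $(grub_regen_command)` would then do the right thing on either kind of system, provided GRUB is the bootloader in use.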

Gert Karpelin (gkarpelin) wrote :

I removed the parameter "amdgpu.dcdebugmask=0x10" from /etc/default/grub yesterday.
There is still occasional screen flickering, but the blank screen (previously once every 15 minutes or so) no longer occurs.

I will continue testing

Gert Karpelin (gkarpelin) wrote :

To clarify, I had this blank screen problem before as well.

zhilong hwang (vzhilong) wrote :

I hit this bug too; only 'amdgpu.dcdebugmask=0x10' solves it.

I have tested the latest kernel.

Timo Aaltonen (tjaalton) wrote :

oem-6.1 is about to be EOL

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: Confirmed → Won't Fix