DisplayPort monitors hang on blanking; must REISUB my 20.04.5 system

Bug #1999189 reported by Scott P
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

My 2 DP monitors (and the system) fail to recover from blanking, possibly 100% of the time. If I disable blanking, the issue never occurs. Otherwise, I am forced to REISUB the system to regain access.
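For anyone trying to reproduce this, the "disable blanking" workaround can be written down as explicit commands. This is only a minimal sketch, assuming an X11 GNOME session where `xset` and `gsettings` are available. The helper name `blanking_commands` and the 300-second value used when re-enabling are illustrative and not taken from this report. The script prints the commands and does not run them.

```python
import shlex


def blanking_commands(disable=True):
    """Return the commands that turn screen blanking off (or back on).

    Assumes an X11 GNOME session. The 300-second idle delay used when
    re-enabling is an illustrative value, not something from the report.
    """
    if disable:
        return [
            ["xset", "s", "off"],   # stop the X screen saver from blanking
            ["xset", "-dpms"],      # disable DPMS monitor power saving
            ["gsettings", "set", "org.gnome.desktop.session", "idle-delay", "0"],
        ]
    return [
        ["xset", "s", "on"],
        ["xset", "+dpms"],
        ["gsettings", "set", "org.gnome.desktop.session", "idle-delay", "300"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being pasted into a terminal.
    for cmd in blanking_commands(disable=True):
        print(shlex.join(cmd))
```

Calling `blanking_commands(disable=False)` gives the reverse set, which is useful when switching back to reproduce the hang.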

As I'm using amdgpu from the HWE kernel, my best guess is that's where the issue lies.

When the issue occurs I am completely unable to access the system due to a softlock, with Xorg maxing out the CPU. SSH barely works and the system doesn't respond to requests to perform dumps like Alt-PrintScreen-1.
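One possible reason dump requests go unanswered is that the `kernel.sysrq` bitmask does not allow them. The sketch below decodes such a value using the bit meanings from the kernel's sysrq documentation. The example value 176 is what I believe Ubuntu ships by default; that is an assumption and has not been checked on the reporter's machine. The function names are made up for illustration.

```python
# Bit values as described in the kernel's sysrq documentation.
SYSRQ_BITS = {
    2: "console log level",
    4: "keyboard control (SAK, unraw)",
    8: "process debugging dumps",
    16: "sync",
    32: "remount read-only",
    64: "signal processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nice all RT tasks",
}

# The bit each REISUB key depends on.
REISUB_BITS = {"R": 4, "E": 64, "I": 64, "S": 16, "U": 32, "B": 128}


def allowed_functions(value):
    """List the sysrq function groups enabled by a kernel.sysrq value."""
    if value == 1:  # 1 means "everything enabled"
        return list(SYSRQ_BITS.values())
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def reisub_keys_allowed(value):
    """Return the subset of R-E-I-S-U-B that a kernel.sysrq value permits."""
    if value == 1:
        return "REISUB"
    return "".join(key for key, bit in REISUB_BITS.items() if value & bit)


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current value on a running Linux system."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    example = 176  # assumed Ubuntu default; use read_sysrq() on a real system
    print(allowed_functions(example))
    print(reisub_keys_allowed(example))
```

If the decoded value lacks "process debugging dumps", the dump keys would be ignored even on a healthy system, so that is worth ruling out before treating the missing dumps as part of the hang.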

The only relevant output I have seen (as a user) is in syslog, and possibly dmesg.

In addition, submitting the report with apport-bug has failed multiple times. The error ID shown while waiting for the Launchpad page to process my data was OOPS-108b77945d65b4a0f23eba000f0d8a8e.

Finally, as I'm more of a user, I can likely only manage to test things if given decent instructions. This issue has been going on for a few weeks, and due to a recent hardware change I cannot confirm whether there was a time when my current system did not have the issue.

OS: Description: Ubuntu 20.04.5 LTS
Release: 20.04

System: Asus PN51-E1 (minipc), 32 GB RAM, AMD Ryzen 7 5700U with Radeon Graphics
2 DP monitors
DP to DP cable
USB-C to DP cable (tested without this connected and it makes no difference)
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.24
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: test 20733 F.... pulseaudio
 /dev/snd/controlC0: test 20733 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2022-02-17 (294 days ago)
InstallationMedia: Ubuntu 20.04.3 LTS "Focal Fossa" - Release amd64 (20210819)
MachineType: ASUSTeK COMPUTER INC. MINIPC PN51-E1
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.15.0-56-generic root=/dev/mapper/vgubuntu-root ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.15.0-56.62~20.04.1-generic 5.15.64
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-56-generic N/A
 linux-backports-modules-5.15.0-56-generic N/A
 linux-firmware 1.187.35
Tags: focal
Uname: Linux 5.15.0-56-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
WifiSyslog:

_MarkForUpload: True
dmi.bios.date: 06/21/2021
dmi.bios.release: 5.3
dmi.bios.vendor: ASUSTeK COMPUTER INC.
dmi.bios.version: 0503
dmi.board.asset.tag: Default string
dmi.board.name: PN51-E1
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 35
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnASUSTeKCOMPUTERINC.:bvr0503:bd06/21/2021:br5.3:svnASUSTeKCOMPUTERINC.:pnMINIPCPN51-E1:pvr0503:rvnASUSTeKCOMPUTERINC.:rnPN51-E1:rvrTobefilledbyO.E.M.:cvnDefaultstring:ct35:cvrDefaultstring:sku:
dmi.product.family: Vivo PC
dmi.product.name: MINIPC PN51-E1
dmi.product.version: 0503
dmi.sys.vendor: ASUSTeK COMPUTER INC.

Scott P (spause) wrote :

lspci-vnvn.log

Scott P (spause) wrote :

version.log

Scott P (spause)
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1999189

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Scott P (spause) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected focal
description: updated
Scott P (spause) wrote : CRDA.txt

apport information

Scott P (spause) wrote : CurrentDmesg.txt

apport information

Scott P (spause) wrote : IwConfig.txt

apport information

Scott P (spause) wrote : Lspci.txt

apport information

Scott P (spause) wrote : Lspci-vt.txt

apport information

Scott P (spause) wrote : Lsusb.txt

apport information

Scott P (spause) wrote : Lsusb-t.txt

apport information

Scott P (spause) wrote : Lsusb-v.txt

apport information

Scott P (spause) wrote : ProcCpuinfo.txt

apport information

Scott P (spause) wrote : ProcCpuinfoMinimal.txt

apport information

Scott P (spause) wrote : ProcInterrupts.txt

apport information

Scott P (spause) wrote : ProcModules.txt

apport information

Scott P (spause) wrote : PulseList.txt

apport information

Scott P (spause) wrote : RfKill.txt

apport information

Scott P (spause) wrote : UdevDb.txt

apport information

Scott P (spause) wrote : acpidump.txt

apport information

Scott P (spause)
description: updated
Scott P (spause)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Scott P (spause)
description: updated
description: updated
Scott P (spause) wrote :

Here is an example error/crash from journalctl:

Dec 14 17:51:18 hostname kernel: ------------[ cut here ]------------
Dec 14 17:51:18 hostname kernel: WARNING: CPU: 3 PID: 2408 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:2807 decide_link_settings+0x11a/0x200 [amdgpu]
Dec 14 17:51:18 hostname kernel: Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hd>
Dec 14 17:51:18 hostname kernel: lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 dm_crypt hid_generic usbhid rtsx_usb_sdmmc rtsx_usb amdgpu iommu_v2 gpu_>
Dec 14 17:51:18 hostname kernel: CPU: 3 PID: 2408 Comm: Xorg Not tainted 5.15.0-56-generic #62~20.04.1-Ubuntu
Dec 14 17:51:18 hostname kernel: Hardware name: ASUSTeK COMPUTER INC. MINIPC PN51-E1/PN51-E1, BIOS 0503 06/21/2021
Dec 14 17:51:18 hostname kernel: RIP: 0010:decide_link_settings+0x11a/0x200 [amdgpu]
Dec 14 17:51:18 hostname kernel: Code: 76 4c b9 f6 0a 00 00 48 c7 c2 a0 72 ca c0 bf 02 00 00 00 48 c7 c6 72 a0 d3 c0 e8 01 ba 87 ff 8b 43 64 85 c0 0f 85 67 ff ff ff <0f> 0b e9 60 ff ff ff ba 02 00 00>
Dec 14 17:51:18 hostname kernel: RSP: 0018:ffffb226836ab5a0 EFLAGS: 00010246
Dec 14 17:51:18 hostname kernel: RAX: 0000000000000000 RBX: ffff95d05a482800 RCX: 0000000000000af6
Dec 14 17:51:18 hostname kernel: RDX: ffffffffc0ca72a0 RSI: ffffffffc0d3a072 RDI: 0000000000000002
Dec 14 17:51:18 hostname kernel: RBP: ffffb226836ab5d8 R08: ffff95d051a1d000 R09: ffffb226836ab5b8
Dec 14 17:51:18 hostname kernel: R10: 0000000000800000 R11: 0000000000000000 R12: ffff95d051a1d000
Dec 14 17:51:18 hostname kernel: R13: ffffb226836ab5f0 R14: 0000000000093828 R15: ffff95d13df206a0
Dec 14 17:51:18 hostname kernel: FS: 00007f57c4414a40(0000) GS:ffff95d556ec0000(0000) knlGS:0000000000000000
Dec 14 17:51:18 hostname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 14 17:51:18 hostname kernel: CR2: 000055f5c0d69070 CR3: 0000000118f2c000 CR4: 0000000000350ee0
Dec 14 17:51:18 hostname kernel: Call Trace:
Dec 14 17:51:18 hostname kernel: <TASK>
Dec 14 17:51:18 hostname kernel: enable_link_dp+0xa0/0x250 [amdgpu]
Dec 14 17:51:18 hostname kernel: enable_link+0x295/0x3c0 [amdgpu]
Dec 14 17:51:18 hostname kernel: core_link_enable_stream+0x32a/0x5a0 [amdgpu]
Dec 14 17:51:18 hostname kernel: dce110_apply_ctx_to_hw+0x749/0x7d0 [amdgpu]
Dec 14 17:51:18 hostname kernel: dc_commit_state_no_check+0x2fb/0xf50 [amdgpu]
Dec 14 17:51:18 hostname kernel: dc_commit_state+0xd4/0x150 [amdgpu]
Dec 14 17:51:18 hostname kernel: amdgpu_dm_atomic_commit_tail+0x56e/0x1940 [amdgpu]
Dec 14 17:51:18 hostname kernel: ? dml_rq_dlg_get_rq_params+0x3b0/0x3b0 [amdgpu]
Dec 14 17:51:18 hostname kernel: ? dcn21_validate_bandwidth_fp+0x147/0x360 [amdgpu]
Dec 14 17:51:18 hostname kernel: ? dcn21_validate_bandwidth_fp+0x147/0x360 [amdgpu]
Dec 14 17:51:18 hostname kernel: ? save_fpregs_to_fpstate+0x3f/0xa0
Dec 14 17:51:18 hostname kernel: ? dc_fpu_end+0x7e/0x90 [amdgpu]
Dec 14 17:51:18 hostname kernel: ? ttm_bo_mem_compat+0x30/0x90 [ttm]
Dec 14 17:51:18 hostname kernel: ? ttm_bo_validate+0x50/0x110 [t...
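To make traces like the one above easier to compare between boots, the warning site and the reliable call-trace frames can be pulled out of the journal text. This is a rough sketch that assumes the `journalctl` line layout shown in this comment. The function name `summarize` and the regular expressions are mine, and frames prefixed with `?` are skipped because the kernel marks those as unreliable.

```python
import re

# "WARNING: CPU: 3 PID: 2408 at <file>:<line> <function>+0x..."
WARNING_RE = re.compile(r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+) (\w+)\+")
# "kernel: [? ]<function>+0x<offset>/0x<size>"
FRAME_RE = re.compile(r"kernel:\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize(lines):
    """Return (warning_site, frames) extracted from kernel journal lines."""
    site = None
    frames = []
    in_trace = False
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            site = {
                "file": match.group(3),
                "line": int(match.group(4)),
                "function": match.group(5),
            }
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(line)
            # Skip "?" frames: the kernel flags them as unreliable guesses.
            if match and not match.group(1):
                frames.append(match.group(2))
    return site, frames


# A few lines copied from the trace in this comment.
SAMPLE = [
    "Dec 14 17:51:18 hostname kernel: WARNING: CPU: 3 PID: 2408 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:2807 decide_link_settings+0x11a/0x200 [amdgpu]",
    "Dec 14 17:51:18 hostname kernel: RIP: 0010:decide_link_settings+0x11a/0x200 [amdgpu]",
    "Dec 14 17:51:18 hostname kernel: Call Trace:",
    "Dec 14 17:51:18 hostname kernel: <TASK>",
    "Dec 14 17:51:18 hostname kernel: enable_link_dp+0xa0/0x250 [amdgpu]",
    "Dec 14 17:51:18 hostname kernel: enable_link+0x295/0x3c0 [amdgpu]",
    "Dec 14 17:51:18 hostname kernel: ? dc_fpu_end+0x7e/0x90 [amdgpu]",
]

if __name__ == "__main__":
    site, frames = summarize(SAMPLE)
    print(site)
    print(" <- ".join(frames))
```

Feeding it the output of `journalctl -k` from a boot where the hang occurred would show whether the warning always fires from the same place in `dc_link_dp.c`.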


Scott P (spause)
description: updated
Scott P (spause) wrote :

The issue does not occur when booting the live ubuntu-22.04.1-desktop-amd64 ISO from USB.

Scott P (spause) wrote :

I have tested the live 20.04.5 ISO from USB, and the issue occurs in the Live config.

If I choose the Safe Graphics config, the issue does NOT happen.

The notable difference I see in Safe Graphics mode is that my "main monitor" never fully powers off when blanking kicks in, so there is always a display for the OS/driver to refer to.

In my normal/default mode, the main monitor actually turns off (both monitors turn off) and I am immediately unable to wake them. I suspect there is some major issue when the count of displays present/connected becomes zero. That's my best guess.
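The "connected display count drops to zero" guess could be checked by logging what the kernel reports for each connector before and after blanking. This is only a sketch, assuming the usual DRM sysfs layout (`/sys/class/drm/card*-*/status` and `dpms`). The function names are illustrative, and it cannot confirm the hypothesis on its own.

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Map each DRM connector name to its reported status and DPMS state."""
    states = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        connector = status_file.parent
        dpms_file = connector / "dpms"
        states[connector.name] = {
            # "connected" or "disconnected" as seen by the kernel
            "status": status_file.read_text().strip(),
            # "On"/"Off" etc.; not every connector directory has this file
            "dpms": dpms_file.read_text().strip() if dpms_file.exists() else "unknown",
        }
    return states


def connected_count(states):
    """Count connectors the kernel currently reports as connected."""
    return sum(1 for state in states.values() if state["status"] == "connected")


if __name__ == "__main__":
    current = connector_states()
    for name, state in current.items():
        print(name, state["status"], state["dpms"])
    print("connected:", connected_count(current))
```

Running it over SSH once while the desktop is visible and again right after the monitors blank would show whether both DP connectors flip to `disconnected` at the moment of the hang.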

Scott P (spause) wrote :

If I leave my system at the login screen, the monitors go to sleep and I cannot wake them with the mouse or keyboard.

Can someone please look at this report?
