Freeze on system-resume caused by kwin and amdgpu driver

Bug #1849084 reported by Richard Baka on 2019-10-21
This bug affects 2 people
Affects                              Importance  Assigned to
kwin (Ubuntu)                        Undecided   Unassigned
plasma-desktop (Ubuntu)              Undecided   Unassigned
xserver-xorg-video-amdgpu (Ubuntu)   Undecided   Unassigned

Bug Description

I have investigated this bug thoroughly.

Problem: a black-screen freeze occurs during the suspend/resume cycle
Ubuntu version: Kubuntu 19.10 with the default Plasma, or with the backported 5.17
Kernel version: default 5.3.10 or the mainline 5.4.0rc3
xserver-xorg-video-amdgpu: 19.* or the newest from git; version 18.0.1-1 works well

The freeze can be reproduced only if the OpenGL 2.0 or 3.1 compositor is enabled in the Plasma settings. XRender works, but I don't like the screen tearing it causes.

On the Ubuntu 19.10 GNOME desktop (X-based) this doesn't occur, even with GNOME's compositor enabled. (I don't know which compositor is the default there, but I haven't disabled it.)

Because of this difference, I think this is a KWin / amdgpu driver bug. It could be fixed in the KWin OpenGL compositor, not just in the amdgpu driver or the kernel.

One possible solution: disable the OpenGL compositor on suspend and re-enable it after login. For example, use XRender before login, i.e. until the system has been restored from suspend.

For Ubuntu's maintainers: couldn't this problem be solved downstream? A proper script run on suspend and on resume might be enough. Many bugs like this are reported in the freedesktop bug tracker, and the developers don't have enough time to fix them quickly.
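
The suspend/resume script proposed above could look roughly like the sketch below. It rests on assumptions that are not confirmed anywhere in this report: that KWin on Plasma 5 (X11) offers `suspend` and `resume` methods on the `/Compositor` D-Bus object of the `org.kde.KWin` service, and that `qdbus` is installed. It suspends compositing altogether rather than switching to XRender. The function names and the "pre"/"post" phase names (borrowed from the systemd sleep-hook convention) are illustrative only.

```python
import subprocess
from typing import List

# Assumption: KWin exposes suspend/resume on this D-Bus object (Plasma 5, X11).
KWIN_COMPOSITOR_CALL = ["qdbus", "org.kde.KWin", "/Compositor"]


def compositor_command(phase: str) -> List[str]:
    """Map a sleep phase to the qdbus command that toggles compositing.

    "pre"  -> run before the machine suspends (compositing off)
    "post" -> run after the machine resumes (compositing back on)
    """
    if phase == "pre":
        return KWIN_COMPOSITOR_CALL + ["suspend"]
    if phase == "post":
        return KWIN_COMPOSITOR_CALL + ["resume"]
    raise ValueError(f"unknown sleep phase: {phase!r}")


def toggle_compositor(phase: str) -> int:
    """Run the command in the current user session and return its exit code."""
    return subprocess.run(compositor_command(phase), check=False).returncode
```

The call has to reach the session bus of the logged-in user, so hooking it into the system-wide suspend path would need extra plumbing that is left out here.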

Syslog:

Oct 21 10:57:26 pc kernel: [ 9475.308852] Code: 85 78 ff ff ff e9 9f f8 ff ff 8b b0 98 04 00 00 48 c7 c7 ef 5f a5 c0 e8 49 2d 9d ff 44 0f b6 45 a3 49 8b 4d 08 e9 bf fa ff ff <0f> 0b e9 ca fb ff ff 0f 0b e8 7d 36 84 c1 66 66 2e 0f 1f 84 00 00
Oct 21 10:57:26 pc kernel: [ 9475.308853] RSP: 0018:ffffb7b54274b7b0 EFLAGS: 00010002
Oct 21 10:57:26 pc kernel: [ 9475.308855] RAX: 0000000000000202 RBX: 0000000000000202 RCX: 000000000000046a
Oct 21 10:57:26 pc kernel: [ 9475.308856] RDX: 0000000000000001 RSI: 0000000000000202 RDI: 0000000000000002
Oct 21 10:57:26 pc kernel: [ 9475.308857] RBP: ffffb7b54274b870 R08: 0000000000000000 R09: ffff94da76b2d170
Oct 21 10:57:26 pc kernel: [ 9475.308858] R10: ffffb7b54274b708 R11: ffffb7b54274b70c R12: ffff94da76b2d000
Oct 21 10:57:26 pc kernel: [ 9475.308859] R13: ffff94d970f2c300 R14: ffff94da758d25d0 R15: ffff94da270d1400
Oct 21 10:57:26 pc kernel: [ 9475.308861] FS: 00007f2a1b9bea80(0000) GS:ffff94da87c40000(0000) knlGS:0000000000000000
Oct 21 10:57:26 pc kernel: [ 9475.308863] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 21 10:57:26 pc kernel: [ 9475.308864] CR2: 00007fb083e9b4a5 CR3: 000000012c110000 CR4: 00000000003406e0
Oct 21 10:57:26 pc kernel: [ 9475.308865] Call Trace:
Oct 21 10:57:26 pc kernel: [ 9475.308977] amdgpu_dm_atomic_commit_tail+0x96f/0x1030 [amdgpu]
Oct 21 10:57:26 pc kernel: [ 9475.308991] commit_tail+0x50/0xc0 [drm_kms_helper]
Oct 21 10:57:26 pc kernel: [ 9475.309000] ? commit_tail+0x50/0xc0 [drm_kms_helper]
Oct 21 10:57:26 pc kernel: [ 9475.309009] drm_atomic_helper_commit+0x118/0x120 [drm_kms_helper]
Oct 21 10:57:26 pc kernel: [ 9475.309115] amdgpu_dm_atomic_commit+0x95/0xa0 [amdgpu]
Oct 21 10:57:26 pc kernel: [ 9475.309135] drm_atomic_commit+0x4a/0x50 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309144] drm_atomic_helper_set_config+0x89/0xa0 [drm_kms_helper]
Oct 21 10:57:26 pc kernel: [ 9475.309159] drm_mode_setcrtc+0x1cd/0x7a0 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309234] ? amdgpu_cs_wait_ioctl+0xd6/0x150 [amdgpu]
Oct 21 10:57:26 pc kernel: [ 9475.309249] ? drm_mode_getcrtc+0x190/0x190 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309262] drm_ioctl_kernel+0xae/0xf0 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309276] drm_ioctl+0x234/0x3d0 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309290] ? drm_mode_getcrtc+0x190/0x190 [drm]
Oct 21 10:57:26 pc kernel: [ 9475.309370] amdgpu_drm_ioctl+0x4e/0x80 [amdgpu]
Oct 21 10:57:26 pc kernel: [ 9475.309376] do_vfs_ioctl+0x407/0x670
Oct 21 10:57:26 pc kernel: [ 9475.309379] ? do_futex+0x10f/0x1e0
Oct 21 10:57:26 pc kernel: [ 9475.309382] ksys_ioctl+0x67/0x90
Oct 21 10:57:26 pc kernel: [ 9475.309384] __x64_sys_ioctl+0x1a/0x20
Oct 21 10:57:26 pc kernel: [ 9475.309388] do_syscall_64+0x57/0x190
Oct 21 10:57:26 pc kernel: [ 9475.309392] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 21 10:57:26 pc kernel: [ 9475.309394] RIP: 0033:0x7f2a1bd0c67b
Oct 21 10:57:26 pc kernel: [ 9475.309397] Code: 0f 1e fa 48 8b 05 15 28 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e5 27 0d 00 f7 d8 64 89 01 48
Oct 21 10:57:26 pc kernel: [ 9475.309398] RSP: 002b:00007ffe0e27c518 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Oct 21 10:57:26 pc kernel: [ 9475.309400] RAX: ffffffffffffffda RBX: 00007ffe0e27c550 RCX: 00007f2a1bd0c67b
Oct 21 10:57:26 pc kernel: [ 9475.309401] RDX: 00007ffe0e27c550 RSI: 00000000c06864a2 RDI: 000000000000000d
Oct 21 10:57:26 pc kernel: [ 9475.309402] RBP: 00000000c06864a2 R08: 0000000000000000 R09: 0000562bcc215600
Oct 21 10:57:26 pc kernel: [ 9475.309403] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Oct 21 10:57:26 pc kernel: [ 9475.309404] R13: 000000000000000d R14: 0000562bcb275db0 R15: 0000000000000000
Oct 21 10:57:26 pc kernel: [ 9475.309407] ---[ end trace cae28d1e69119104 ]---
Oct 21 10:57:26 pc kernel: [ 9475.309432] ------------[ cut here ]------------
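
To see at a glance which modules the trace passes through, the frames can be pulled out of the syslog lines with a short script. This is only a sketch: `Frame` and `parse_call_trace` are made-up names, and the regular expression assumes the `symbol+0xoffset/0xsize [module]` layout seen in the lines above. Frames prefixed with `?` are ones the kernel's unwinder could not verify, so they are flagged as unreliable.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str             # function name, e.g. "drm_ioctl"
    module: Optional[str]   # kernel module in brackets, None for built-in code
    reliable: bool          # False when the frame is prefixed with "?"


_FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def parse_call_trace(lines: Iterable[str]) -> List[Frame]:
    """Extract stack frames from kernel log lines, skipping everything else."""
    frames = []
    for line in lines:
        m = _FRAME_RE.search(line)
        if m:
            frames.append(Frame(m.group(2), m.group(3), m.group(1) is None))
    return frames


SAMPLE = [
    "Oct 21 10:57:26 pc kernel: [ 9475.308865] Call Trace:",
    "Oct 21 10:57:26 pc kernel: [ 9475.308977] amdgpu_dm_atomic_commit_tail+0x96f/0x1030 [amdgpu]",
    "Oct 21 10:57:26 pc kernel: [ 9475.309000] ? commit_tail+0x50/0xc0 [drm_kms_helper]",
    "Oct 21 10:57:26 pc kernel: [ 9475.309376] do_vfs_ioctl+0x407/0x670",
]

for frame in parse_call_trace(SAMPLE):
    print(frame.symbol, frame.module or "(built-in)", "" if frame.reliable else "(unverified)")
```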

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in kwin (Ubuntu):
status: New → Confirmed
Changed in plasma-desktop (Ubuntu):
status: New → Confirmed
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Confirmed
Mike Lykov (combr) wrote :

Yes, this bug also affects me.
I upgraded from Kubuntu 19.04 with the 5.0 kernel to 19.10 with the 5.3 kernel and got the described behaviour:

1. On 5.0 there are no problems (if I boot it on 19.10).
2. On 5.3 with default settings there is a black screen when trying to resume.
3. If I choose XRender in the Plasma settings with 5.3, there are no problems.

Current kernel: 5.3.0-19-generic #20-Ubuntu SMP Fri Oct 18 09:04:39
There is no trace of an oops in the log; journalctl -b -1 -k shows only the line
kernel: PM: suspend entry (deep)
before the resume/hang at the black screen.
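
The check described above can be scripted. The following is a sketch under assumptions: the helper names are invented here, and it simply filters the kernel messages of the previous boot for lines containing "PM:". If the last such line is the suspend entry, the machine hung before the kernel logged anything on resume.

```python
import subprocess
from typing import Iterable, List


def pm_lines(kernel_log: Iterable[str]) -> List[str]:
    """Keep only the power-management messages from a kernel log."""
    return [line.rstrip("\n") for line in kernel_log if "PM:" in line]


def previous_boot_pm_lines() -> List[str]:
    """Run the journalctl command quoted in the comment and filter its output."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "-k", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return pm_lines(result.stdout.splitlines())
```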

Current driver: xserver-xorg-video-amdgpu 19.0.1-1ubuntu1 amd64
Xorg log:
[ 10.353] (II) AMDGPU(0): glamor X acceleration enabled on AMD RAVEN (DRM 3.33.0, 5.3.0-19-generic, LLVM 9.0.0)
[ 10.353] (II) AMDGPU(0): glamor detected, initialising EGL layer.
[ 10.353] (==) AMDGPU(0): TearFree property default: auto
[ 10.353] (==) AMDGPU(0): VariableRefresh: disabled
[ 10.353] (II) AMDGPU(0): KMS Pageflipping: enabled

Notebook with an attached monitor:
[ 10.396] (II) AMDGPU(0): EDID for output eDP
[ 10.396] (II) AMDGPU(0): Manufacturer: BOE Model: 6a5 Serial#: 0
[ 10.396] (II) AMDGPU(0): Year: 2015 Week: 1
[ 10.396] (II) AMDGPU(0): EDID Version: 1.4
[ 10.396] (II) AMDGPU(0): Digital Display Input
[ 10.396] (II) AMDGPU(0): 6 bits per channel
[ 10.396] (II) AMDGPU(0): Digital interface is DisplayPort
[ 10.397] (II) AMDGPU(0): EDID for output HDMI-A-0
[ 10.397] (II) AMDGPU(0): Manufacturer: BNQ Model: 801b Serial#: 21573
[ 10.397] (II) AMDGPU(0): Year: 2017 Week: 15
[ 10.397] (II) AMDGPU(0): EDID Version: 1.3
[ 10.397] (II) AMDGPU(0): Digital Display Input
[ 10.398] (II) AMDGPU(0): Output eDP connected
[ 10.398] (II) AMDGPU(0): Output HDMI-A-0 connected
[ 10.398] (II) AMDGPU(0): Using spanning desktop for initial modes
[ 10.398] (II) AMDGPU(0): Output eDP using initial mode 1366x768 +0+0
[ 10.398] (II) AMDGPU(0): Output HDMI-A-0 using initial mode 2560x1440 +1366+0
[ 10.399] (II) AMDGPU(0): [DRI2] Setup complete
[ 10.399] (II) AMDGPU(0): [DRI2] DRI driver: radeonsi
[ 10.399] (II) AMDGPU(0): [DRI2] VDPAU driver: radeonsi
[ 10.533] (II) AMDGPU(0): Front buffer pitch: 15872 bytes
[ 10.534] (II) AMDGPU(0): SYNC extension fences enabled
[ 10.534] (II) AMDGPU(0): Present extension enabled
[ 10.534] (==) AMDGPU(0): DRI3 enabled
[ 10.534] (==) AMDGPU(0): Backing store enabled
[ 10.534] (II) AMDGPU(0): Direct rendering enabled
[ 10.541] (II) AMDGPU(0): Use GLAMOR acceleration.
[ 10.541] (II) AMDGPU(0): Acceleration enabled
[ 10.541] (==) AMDGPU(0): DPMS enabled
[ 10.541] (==) AMDGPU(0): Silken mouse enabled
[ 10.541] (II) AMDGPU(0): Set up textured video (glamor)
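
For comparing setups between reporters, the output layout can be read out of an Xorg log like the one above. This is a minimal sketch; `initial_modes` is a made-up helper, and the pattern assumes the "Output NAME using initial mode WxH +X+Y" wording shown in this log.

```python
import re
from typing import Dict, Iterable, Tuple

_MODE_RE = re.compile(r"Output (\S+) using initial mode (\d+)x(\d+) \+(\d+)\+(\d+)")


def initial_modes(xorg_log: Iterable[str]) -> Dict[str, Tuple[int, int, int, int]]:
    """Return {output name: (width, height, x offset, y offset)}."""
    modes = {}
    for line in xorg_log:
        m = _MODE_RE.search(line)
        if m:
            width, height, x, y = (int(g) for g in m.groups()[1:])
            modes[m.group(1)] = (width, height, x, y)
    return modes


SAMPLE_LOG = [
    "[ 10.398] (II) AMDGPU(0): Using spanning desktop for initial modes",
    "[ 10.398] (II) AMDGPU(0): Output eDP using initial mode 1366x768 +0+0",
    "[ 10.398] (II) AMDGPU(0): Output HDMI-A-0 using initial mode 2560x1440 +1366+0",
]

print(initial_modes(SAMPLE_LOG))
```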

Mike Lykov (combr) wrote :

I see a miracle :)
Following https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues

I tried two options:
1. initcall_debug
2. echo 0 > /sys/power/pm_async

Before, I could reproduce the bug every time I pressed the power button for suspend/resume (after resume I got a black screen 100% of the time when the conditions described above were satisfied).

Then I disabled pm_async, and after suspend/resume the bug did not occur. BUT after restoring pm_async to 1 via a reboot, I cannot reproduce the bug anymore. It no longer happens in the same situation as before.
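
For anyone repeating this test, the pm_async switch can be read and toggled from a small script instead of a shell redirect. This is a sketch with invented helper names; writing to the real sysfs file needs root, and the `path` parameter exists only so the functions can be tried against an ordinary file.

```python
from pathlib import Path

PM_ASYNC = Path("/sys/power/pm_async")


def get_pm_async(path: Path = PM_ASYNC) -> bool:
    """Return True when asynchronous suspend/resume of devices is enabled."""
    return path.read_text().strip() == "1"


def set_pm_async(enabled: bool, path: Path = PM_ASYNC) -> None:
    """Equivalent of `echo 0 > /sys/power/pm_async` (or `echo 1`)."""
    path.write_text("1" if enabled else "0")
```

The setting does not survive a reboot, which matches the observation above that the value was back at 1 afterwards.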

ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu8.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: combr 3311 F.... pulseaudio
 /dev/snd/controlC0: combr 3311 F.... pulseaudio
CurrentDesktop: KDE
DistroRelease: Ubuntu 19.10
InstallationDate: Installed on 2018-11-22 (346 days ago)
InstallationMedia: Kubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.2)
MachineType: HP HP Laptop 15-db0xxx
Package: linux-image-5.3.0-19-generic 5.3.0-19.20
PackageArchitecture: amd64
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-19-generic root=UUID=ff84e034-43ff-4c6f-914b-42dcdd53cf31 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.3.0-19.20-generic 5.3.1
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-19-generic N/A
 linux-backports-modules-5.3.0-19-generic N/A
 linux-firmware 1.183.1
Tags: eoan
Uname: Linux 5.3.0-19-generic x86_64
UpgradeStatus: Upgraded to eoan on 2019-11-02 (1 days ago)
UserGroups: adm cdrom dip disk docker libvirt lpadmin plugdev sambashare sudo wireshark
_MarkForUpload: True
dmi.bios.date: 04/06/2018
dmi.bios.vendor: Insyde
dmi.bios.version: F.02
dmi.board.asset.tag: Type2 - Board Asset Tag
dmi.board.name: 84AE
dmi.board.vendor: HP
dmi.board.version: 86.19
dmi.chassis.asset.tag: Chassis Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnInsyde:bvrF.02:bd04/06/2018:svnHP:pnHPLaptop15-db0xxx:pvrType1ProductConfigId:rvnHP:rn84AE:rvr86.19:cvnHP:ct10:cvrChassisVersion:
dmi.product.family: 103C_5335KV HP Notebook
dmi.product.name: HP Laptop 15-db0xxx
dmi.product.sku: 4MK59EA#ACB
dmi.product.version: Type1ProductConfigId
dmi.sys.vendor: HP

tags: added: apport-collected eoan

apport information

Mike Lykov (combr) wrote : CRDA.txt

apport information

Mike Lykov (combr) wrote : Lspci.txt

apport information

Mike Lykov (combr) wrote : Lsusb.txt

apport information

Mike Lykov (combr) wrote :

Fixed by a BIOS upgrade.

-kernel: DMI: HP HP Laptop 15-db0xxx/84AE, BIOS F.02 04/06/2018
+kernel: DMI: HP HP Laptop 15-db0xxx/84AE, BIOS F.20 05/15/2019

Also interesting:
-kernel: amdgpu 0000:04:00.0: VRAM: 256M 0x000000F400000000 - 0x000000F40FFFFFFF (256M used)
+kernel: amdgpu 0000:04:00.0: VRAM: 1024M 0x000000F400000000 - 0x000000F43FFFFFFF (1024M used)

-kernel: AMD-Vi: [Firmware Bug]: : IOAPIC[4] not in IVRS table
-kernel: AMD-Vi: [Firmware Bug]: : IOAPIC[5] not in IVRS table
-kernel: AMD-Vi: [Firmware Bug]: : No southbridge IOAPIC found
-kernel: AMD-Vi: Disabling interrupt remapping

There are other changes as well.
With the previous BIOS, kernel 5.4 did not boot (black screen at boot).
With the current BIOS it boots now:
Linux 5.4.0-050400rc5-generic #201910271430
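
The VRAM sizes in the diff above follow directly from the address ranges the kernel prints, which is a quick way to confirm that the new BIOS really reserves four times as much video memory. A short worked check (the helper name is made up):

```python
def vram_mib(first: int, last: int) -> int:
    """Size in MiB of an inclusive address range [first, last]."""
    return (last - first + 1) // (1024 * 1024)


# Ranges copied from the two amdgpu log lines (BIOS F.02 vs. F.20).
old_bios = vram_mib(0x000000F400000000, 0x000000F40FFFFFFF)
new_bios = vram_mib(0x000000F400000000, 0x000000F43FFFFFFF)

print(old_bios, new_bios)
```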
