[amdgpu] kernel crash when resuming from automatic suspension to RAM (started in 5.7.0, not in 5.6.19)

Bug #1917674 reported by Douglas Silva
This bug affects 2 people
Affects                  Status        Importance  Assigned to  Milestone
linux (Ubuntu)           Fix Released  Undecided   Unassigned
linux-hwe-5.8 (Ubuntu)   Won't Fix     Undecided   Unassigned

Bug Description

Automatic suspension causes the system to freeze on resume. This doesn't happen if I suspend manually by opening the system menu and pressing the suspend button; manual suspension works flawlessly.

When the system resumes from automatic suspension, the lock screen is visible but does not respond. It's frozen. A notification says something like "system will suspend shortly due to inactivity". It never goes away.

The keyboard responds during this (at least for a while). The CAPS LOCK LED turns on and off as I press it (with no delay). Sometimes I can even switch to another TTY and kill the X server, which is why I'm reporting it against gnome-shell. When switching TTY is impossible, I use the "REISUB" combination to shut down. Either way, I always lose the running session.

This is reproducible about 60% of the time. I eventually disabled automatic suspension and now rely only on manual suspension.
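
For readers who want to apply the same workaround from a terminal, the following is a minimal sketch. It assumes a stock GNOME session in which inactivity suspend is controlled by the `org.gnome.settings-daemon.plugins.power` schema; the function names are made up for illustration and are not part of the original report.

```shell
# Show what GNOME does after inactivity on AC power
# (for example 'suspend' or 'nothing') and after how many seconds.
show_auto_suspend() {
    gsettings get org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type
    gsettings get org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout
}

# Workaround sketch: turn off automatic suspension on AC and battery.
# Manual suspend from the system menu is not affected by these keys.
disable_auto_suspend() {
    gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type 'nothing'
    gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-battery-type 'nothing'
}
```

Lowering `sleep-inactive-ac-timeout` instead of disabling the feature may make the freeze quicker to reproduce while testing kernels.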

# Additional information:
1) Ubuntu 20.04.2 LTS / Release: 20.04
2) gnome-shell 3.36.4-1ubuntu1~20.04.2

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: gnome-shell 3.36.4-1ubuntu1~20.04.2
ProcVersionSignature: Ubuntu 5.8.0-44.50~20.04.1-generic 5.8.18
Uname: Linux 5.8.0-44-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.16
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Wed Mar 3 16:56:01 2021
DisplayManager: gdm3
InstallationDate: Installed on 2021-01-31 (31 days ago)
InstallationMedia: Ubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
RelatedPackageVersions: mutter-common 3.36.7+git20201123-0.20.04.1
SourcePackage: gnome-shell
UpgradeStatus: No upgrade log present (probably fresh install)
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.16
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: dsilva 1884 F.... pulseaudio
 /dev/snd/controlC1: dsilva 1884 F.... pulseaudio
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-01-31 (49 days ago)
InstallationMedia: Ubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 002: ID 05e3:0608 Genesys Logic, Inc. Hub
 Bus 001 Device 004: ID 0951:16de Kingston Technology HyperX Pulsefire Core
 Bus 001 Device 003: ID 0951:16d2 Kingston Technology HyperX Alloy FPS Pro Mechanical Gaming Keyboard
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Gigabyte Technology Co., Ltd. B460MDS3H
Package: linux-hwe-5.8
ProcEnviron:
 LANGUAGE=pt_BR:pt:en
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=pt_BR.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.8.0-45-generic root=/dev/mapper/vgubuntu-root ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.8.0-45.51~20.04.1-generic 5.8.18
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-45-generic N/A
 linux-backports-modules-5.8.0-45-generic N/A
 linux-firmware 1.187.10
RfKill:

Tags: focal
Uname: Linux 5.8.0-45-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 05/27/2020
dmi.bios.release: 5.17
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F3
dmi.board.asset.tag: Default string
dmi.board.name: B460M DS3H
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF3:bd05/27/2020:br5.17:svnGigabyteTechnologyCo.,Ltd.:pnB460MDS3H:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnB460MDS3H:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: B460 MB
dmi.product.name: B460MDS3H
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Douglas Silva (o-alquimista) wrote :
description: updated
Daniel van Vugt (vanvugt) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It sounds like some part of the system has crashed. To help us find the cause of the crash please follow these steps:

1. Look in /var/crash for crash files and if found run:
    ubuntu-bug YOURFILE.crash
Then tell us the ID of the newly-created bug.

2. If step 1 failed then look at https://errors.ubuntu.com/user/ID where ID is the content of file /var/lib/whoopsie/whoopsie-id on the machine. Do you find any links to recent problems on that page? If so then please send the links to us.

3. If step 2 also failed then apply the workaround from bug 994921, reboot, reproduce the crash, and retry step 1.

Please take care to avoid attaching .crash files to bugs as we are unable to process them as file attachments. It would also be a security risk for yourself.
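
The checks in steps 1 and 2 can be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and reading the whoopsie ID file may require root on some installations.

```shell
# Step 1: list any crash reports Apport has written.
# The directory defaults to /var/crash and can be overridden for testing.
list_crash_files() {
    dir="${1:-/var/crash}"
    find "$dir" -maxdepth 1 -name '*.crash' 2>/dev/null
}

# Step 2: build the errors.ubuntu.com URL from the machine's whoopsie ID.
# Returns non-zero if the ID file cannot be read (try again with sudo).
errors_url() {
    id_file="${1:-/var/lib/whoopsie/whoopsie-id}"
    [ -r "$id_file" ] || return 1
    printf 'https://errors.ubuntu.com/user/%s\n' "$(cat "$id_file")"
}
```

Each path printed by `list_crash_files` can then be passed to `ubuntu-bug` as described in step 1.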

Changed in gnome-shell (Ubuntu):
status: New → Incomplete
Douglas Silva (o-alquimista) wrote :

I can't see any files in /var/crash/, even after applying the workaround and reproducing the freeze again.

This time the freeze was more severe. There was no response from the keyboard and a hard reset was the only way to recover.

Daniel van Vugt (vanvugt) wrote :

OK then. Next time the freeze happens please:

1. Wait 10 seconds.

2. Reboot.

3. Open a Terminal and run:

   journalctl -b-1 > prevboot.txt

4. Attach the resulting text file here.

5. Follow all the steps in comment #2 again.
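
Before attaching the file, it can help to skim it for driver errors. The helper below is a hypothetical convenience, not something requested in this thread; it assumes the log was saved as `prevboot.txt` in step 3. The full file should still be attached unfiltered.

```shell
# Print only the lines of a saved journal that mention amdgpu
# (case-insensitive). The file name defaults to prevboot.txt.
filter_amdgpu() {
    grep -i 'amdgpu' "${1:-prevboot.txt}"
}

# To capture only kernel messages from the previous boot instead of the
# whole journal, journalctl's -k option can be combined with -b -1:
#   journalctl -k -b -1 > prevboot-kernel.txt
```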

Douglas Silva (o-alquimista) wrote :

I'm attaching the requested logs.

Near the end of this file you'll see the last action I performed, which was the CTRL+ALT+PRINT+REISUB combination to reboot. You'll also see that there is a long crash log related to AMDGPU.

Oh, and this time when it crashed I attempted to switch TTY, and then the screen changed to a display of trillions of tiny colorful artifacts, like a TV with bad reception.

Daniel van Vugt (vanvugt) wrote : Re: [amdgpu] kernel crash when resuming from automatic suspension to RAM

Yes, the problem appears to be crashes in the amdgpu kernel driver, so please try some other kernel versions and tell us which ones do or don't have the same bug:

  https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=M;O=D

summary: - Freeze when resuming from automatic suspension to RAM
+ [amdgpu] kernel crash when resuming from automatic suspension to RAM
affects: gnome-shell (Ubuntu) → linux (Ubuntu)
tags: added: amdgpu
tags: added: resume suspend-resume
Douglas Silva (o-alquimista) wrote :

I was able to reproduce this with kernel 5.0.0 too. Isn't it possible that this is not caused by the kernel? I don't think I had this issue back then - it's something more recent.

Douglas Silva (o-alquimista) wrote :

Never mind, I was running 5.8, not 5.0.0. I forgot to select 5.0.0 in the boot menu.

Douglas Silva (o-alquimista) wrote :

I've finished the tests, here are the results:

5.0.0: OK
5.4.0: OK
5.6.0: OK
5.7.0: BUG
5.8.0: BUG
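
Results in this "version: status" form can be reduced to a regression window mechanically. The sketch below is illustrative only: `regression_window` is a made-up name, and it relies on `sort -V` (version sort), which is available in GNU coreutils.

```shell
# Read "VERSION: OK|BUG" lines on stdin, order them by version, and report
# the last version that worked before the first one that showed the bug.
regression_window() {
    sort -V | awk -F': *' '
        found       { next }                  # ignore everything after the first BUG
        $2 == "OK"  { good = $1 }
        $2 == "BUG" { bad = $1; found = 1 }
        END { printf "last good: %s, first bad: %s\n", good, bad }'
}

regression_window <<'EOF'
5.0.0: OK
5.4.0: OK
5.6.0: OK
5.7.0: BUG
5.8.0: BUG
EOF
# prints: last good: 5.6.0, first bad: 5.7.0
```

Testing further point releases between the two reported versions narrows the window in the same way.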

Daniel van Vugt (vanvugt) wrote :

Excellent, thanks. Are you able to find out which minor release 5.6.N the problem started in? Once we know the number it is sometimes easy to identify the code change that caused it.

Please also test some current kernels like 5.11 and 5.12 to be sure it's not something that's already been fixed.

Changed in linux-hwe-5.8 (Ubuntu):
status: New → Incomplete
Douglas Silva (o-alquimista) wrote :

Kernel 5.6.19: OK

If I'm not mistaken, this was the last version before 5.7, so the bug must have been introduced in 5.7.0.

tags: added: regression-release
summary: [amdgpu] kernel crash when resuming from automatic suspension to RAM
+ (started in 5.7.0, not in 5.6.19)
Changed in linux (Ubuntu):
status: Incomplete → New
Changed in linux-hwe-5.8 (Ubuntu):
status: Incomplete → New
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1917674

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Douglas Silva (o-alquimista) wrote : apport information

Attached files: AlsaInfo.txt, CRDA.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lspci-vt.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt, acpidump.txt

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux-hwe-5.8 (Ubuntu):
status: New → Confirmed
Daniel van Vugt (vanvugt) wrote :

See also bug 1937321.

Kai-Heng Feng (kaihengfeng) wrote :

Please test the latest drm-tip kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/

Headers are not needed.

Douglas Silva (o-alquimista) wrote :

I'm now using Ubuntu 21.04 hirsute, and this bug can no longer be reproduced :) Kernel 5.11.0-25-generic.

I've done 3 tests. No crashes so far.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux-hwe-5.8 (Ubuntu):
status: Confirmed → Won't Fix