AMD: Suspend not working when some cores are disabled through cpufreq

Bug #1954930 reported by You-Sheng Yang
This bug affects 1 person
Affects                  Status        Importance  Assigned to     Milestone
HWE Next                 Fix Released  Undecided   Unassigned
linux (Ubuntu)           Fix Released  High        You-Sheng Yang
  Focal                  Invalid       Undecided   Unassigned
  Impish                 Fix Released  High        You-Sheng Yang
  Jammy                  Fix Released  High        You-Sheng Yang
linux-oem-5.14 (Ubuntu)  Invalid       Undecided   Unassigned
  Focal                  Fix Released  High        You-Sheng Yang
  Impish                 Invalid       Undecided   Unassigned
  Jammy                  Invalid       Undecided   Unassigned

Bug Description

[SRU Justification]

[Impact]

As detailed in https://gitlab.freedesktop.org/drm/amd/-/issues/1708, taking
some CPU cores offline, either with a cpufreq gadget or via sysfs, may hang
the system when it subsequently suspends.

[Fix]

Upstream commit d6b88ce2eb9d ("ACPI: processor idle: Allow playing
dead in C3 state"), merged in v5.16-rc1, fixes this issue.

[Test Case]

As stated in the aforementioned bug URL, set up the cpufreq extension to
take down a few CPU cores, then trigger system suspend. There is roughly a
50% chance that networking, input and so on will hang, leaving the user
able to reboot only with the sysrq keys.
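
The impact section notes that cores can also be taken offline via sysfs. As a minimal sketch of that route, assuming the standard /sys/devices/system/cpu/cpuN/online interface and a 16-thread machine like the one in the original report, the helpers below offline all but the first few CPUs. The function names are hypothetical and not part of any tool mentioned in this bug; writing to the real sysfs files requires root.

```python
from pathlib import Path
from typing import List

# Standard sysfs location of the per-CPU hotplug controls.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def cpus_to_offline(total: int, keep: int) -> List[int]:
    """Return the CPU indices to offline so that `keep` CPUs stay online.

    The lowest-numbered CPUs are kept, so cpu0 is never touched
    (on x86 it usually cannot be taken offline anyway).
    """
    if not 1 <= keep <= total:
        raise ValueError("keep must be between 1 and total")
    return list(range(keep, total))


def set_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Write "1" or "0" to cpuN/online under `root`."""
    (root / f"cpu{cpu}" / "online").write_text("1" if online else "0")


def offline_cores(total: int, keep: int, root: Path = SYSFS_CPU_ROOT) -> List[int]:
    """Offline every CPU except the first `keep`; return the CPUs written to."""
    targets = cpus_to_offline(total, keep)
    for cpu in targets:
        set_online(cpu, False, root)
    return targets
```

On an affected machine one might call offline_cores(16, 3) as root and then trigger suspend, for example by closing the lid. The root parameter exists only so the logic can be exercised against a scratch directory instead of the live sysfs tree.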

[Where problems could occur]

According to the patch discussion thread at
https://<email address hidden>/,
the old restriction that allowed enter_dead only for states up to
ACPI_STATE_C2 may have had no practical rationale; C2 was simply the
deepest state supported at the time.

[Other Info]

The fix is currently available only in v5.16-rc1, and the issue affects AMD
Cezanne/Barcelo, so oem-5.14, impish and jammy are nominated.

========== original bug report ==========

https://gitlab.freedesktop.org/drm/amd/-/issues/1708

Reproduce steps:
1. Install the cpufreq GNOME extension (https://extensions.gnome.org/extension/1082/cpufreq/)
2. Click on the cpufreq extension in the top bar
3. Slide "cores online" from 16 to 3
4. Close the lid of the laptop

Expected result: the laptop goes into suspend
Actual result: the laptop stays on, but the screen is now always black and keyboard input is ignored

Fix committed to v5.16-rc1: https://github.com/torvalds/linux/commit/d6b88ce2eb9d2698eb24451eb92c0a1649b17bb1
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: ubuntu 1188 F.... pulseaudio
 /dev/snd/controlC2: ubuntu 1188 F.... pulseaudio
 /dev/snd/controlC0: ubuntu 1188 F.... pulseaudio
CasperMD5CheckResult: skip
Dependencies:

DistributionChannelDescriptor:
 # This is the distribution channel descriptor for the OEM CDs
 # For more information see http://wiki.ubuntu.com/DistributionChannelDescriptor
 canonical-oem-somerville-focal-amd64-20200502-85+fossa-edge-staging+X152
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-09-09 (97 days ago)
InstallationMedia: Ubuntu 20.04 "Focal" - Build amd64 LIVE Binary 20200502-05:58
IwConfig:
 lo no wireless extensions.

 enp1s0f0 no wireless extensions.
Lsusb:
 Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 003 Device 002: ID 062a:4c01 MosArt Semiconductor Corp. 2.4G INPUT DEVICE
 Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: AMD Celadon-CZN
Package: linux-firmware 1.187.23+staging.38 [origin: LP-PPA-canonical-hwe-team-linux-firmware-staging]
PackageArchitecture: all
ProcFB: 0 amdgpu
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.14.0-9011-oem root=UUID=668f30b7-78ec-472e-9916-c9b1cbdbbbc6 ro automatic-oem-config no_console_suspend
ProcVersionSignature: Ubuntu 5.14.0-9011.11+staging.37-oem 5.14.20
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.14.0-9011-oem N/A
 linux-backports-modules-5.14.0-9011-oem N/A
 linux-firmware 1.187.23+staging.38
RfKill:

Tags: third-party-packages focal
Uname: Linux 5.14.0-9011-oem x86_64
UnreportableReason: This is not an official Ubuntu package. Please remove any third party package and try again.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 06/30/2021
dmi.bios.release: 19.1
dmi.bios.vendor: INSYDE Corp.
dmi.bios.version: RLD1005B_AB
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Celadon-CZN
dmi.board.vendor: AMD
dmi.board.version: Base Board Version
dmi.chassis.asset.tag: Chassis Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDECorp.:bvrRLD1005B_AB:bd06/30/2021:br19.1:svnAMD:pnCeladon-CZN:pvr1:rvnAMD:rnCeladon-CZN:rvrBaseBoardVersion:cvnChassisManufacturer:ct10:cvrChassisVersion:sku123456789:
dmi.product.family: Renoir
dmi.product.name: Celadon-CZN
dmi.product.sku: 123456789
dmi.product.version: 1
dmi.sys.vendor: AMD

You-Sheng Yang (vicamo)
tags: added: amd oem-priority originate-from-1954322
description: updated
Changed in linux (Ubuntu Focal):
status: New → Invalid
Changed in linux-oem-5.14 (Ubuntu Impish):
status: New → Invalid
Changed in linux-oem-5.14 (Ubuntu Jammy):
status: New → Invalid
You-Sheng Yang (vicamo)
Changed in linux (Ubuntu Impish):
status: New → In Progress
importance: Undecided → High
assignee: nobody → You-Sheng Yang (vicamo)
Changed in linux (Ubuntu Jammy):
status: New → In Progress
importance: Undecided → High
assignee: nobody → You-Sheng Yang (vicamo)
Changed in linux-oem-5.14 (Ubuntu Focal):
status: New → In Progress
importance: Undecided → High
assignee: nobody → You-Sheng Yang (vicamo)
You-Sheng Yang (vicamo) wrote :

[ 123.964355] r8169 0000:01:00.0 enp1s0f0: Link is Down
[ 124.353255] PM: suspend entry (s2idle)
[ 124.366348] Filesystems sync: 0.013 seconds
[ 124.983434] rfkill: input handler enabled
[ 125.062413] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 125.064287] OOM killer disabled.
[ 125.064288] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 125.390929] ACPI: EC: interrupt blocked
[ 125.469200] ACPI: EC: interrupt unblocked
[ 125.510332] pci 0000:00:00.2: can't derive routing for PCI INT A
[ 125.510335] pci 0000:00:00.2: PCI INT A: no GSI
[ 125.511064] [drm] PCIE GART of 1024M enabled.
[ 125.511068] [drm] PTB located at 0x000000F400900000
[ 125.511083] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[ 125.512588] amdgpu 0000:03:00.0: amdgpu: dpm has been disabled
[ 125.513573] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[ 125.522694] nvme nvme0: Shutdown timeout set to 10 seconds
[ 125.526041] nvme nvme0: 16/0/0 default/read/poll queues
[ 125.691534] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
[ 125.691698] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
[ 125.691806] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
[ 125.691808] PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
[ 125.691820] amdgpu 0000:03:00.0: PM: failed to resume async: error -110
[ 125.693966] OOM killer enabled.
[ 125.693967] Restarting tasks ... done.
[ 125.702960] PM: suspend exit
[ 201.335694] sysrq: This sysrq operation is disabled.
[ 201.543676] sysrq: This sysrq operation is disabled.
[ 201.719682] sysrq: This sysrq operation is disabled.
[ 202.231681] sysrq: Emergency Sync
[ 202.240025] Emergency Sync complete
[ 203.031695] sysrq: Emergency Remount R/O
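
To check other kernel logs for the same signature, a small filter can help. The following is a sketch only: it assumes dmesg-style lines with a bracketed timestamp, and the helper name is hypothetical. It collects messages logged between "PM: suspend entry" and "PM: suspend exit" that contain "*ERROR*" or "failed"; in the log above these all carry error -110, which is ETIMEDOUT on Linux.

```python
import re
from typing import List

# Matches a dmesg-style prefix such as "[ 125.691820] " and captures the message.
TIMESTAMPED = re.compile(r"^\[\s*\d+\.\d+\]\s*(.*)$")


def failed_resume_messages(log: str) -> List[str]:
    """Return error-looking messages logged during a suspend/resume cycle."""
    inside_cycle = False
    hits = []
    for line in log.splitlines():
        match = TIMESTAMPED.match(line.strip())
        if not match:
            continue  # skip lines that are not timestamped kernel messages
        message = match.group(1)
        if message.startswith("PM: suspend entry"):
            inside_cycle = True
        elif message.startswith("PM: suspend exit"):
            inside_cycle = False
        elif inside_cycle and ("*ERROR*" in message or "failed" in message):
            hits.append(message)
    return hits
```

Feeding it the log above would pick out the sdma0 ring test failure, the sdma_v4_0 resume failure and the "failed to resume async" line, while ignoring the later sysrq messages that fall outside the suspend cycle.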

tags: added: apport-collected focal third-party-packages
description: updated
You-Sheng Yang (vicamo) attached apport information:
 AlsaInfo.txt
 CRDA.txt
 CurrentDmesg.txt
 Lspci.txt
 Lspci-vt.txt
 Lsusb-t.txt
 Lsusb-v.txt
 ProcCpuinfo.txt
 ProcCpuinfoMinimal.txt
 ProcEnviron.txt
 ProcInterrupts.txt
 ProcModules.txt
 UdevDb.txt
 WifiSyslog.txt
 acpidump.txt
You-Sheng Yang (vicamo) wrote :
description: updated
Timo Aaltonen (tjaalton)
Changed in linux-oem-5.14 (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-oem-5.14/5.14.0-1012.12 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
You-Sheng Yang (vicamo) wrote :

verified linux-oem-5.14/focal-proposed version 5.14.0-1012.12.

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem-5.14 - 5.14.0-1013.13

---------------
linux-oem-5.14 (5.14.0-1013.13) focal; urgency=medium

  * focal/linux-oem-5.14: 5.14.0-1013.13 -proposed tracker (LP: #1955464)

  * devices on thunderbolt dock are not recognized on adl-p platform
    (LP: #1955016)
    - SAUCE: thunderbolt: Runtime PM activate both ends of the device link
    - SAUCE: thunderbolt: Tear down existing tunnels when resuming from hibernate
    - SAUCE: thunderbolt: Runtime resume USB4 port when retimers are scanned
    - SAUCE: thunderbolt: Do not allow subtracting more NFC credits than
      configured
    - SAUCE: thunderbolt: Do not program path HopIDs for USB4 routers
    - SAUCE: thunderbolt: Add debug logging of DisplayPort resource allocation

 -- Chia-Lin Kao (AceLan) <email address hidden> Tue, 21 Dec 2021 16:59:25 +0800

Changed in linux-oem-5.14 (Ubuntu Focal):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.13.0-24.24 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
You-Sheng Yang (vicamo) wrote :

verified linux/impish version 5.13.0-26.27.

tags: added: verification-done-impish
removed: verification-needed-impish
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-17.17

---------------
linux (5.15.0-17.17) jammy; urgency=medium

  * jammy/linux: 5.15.0-17.17 -proposed tracker (LP: #1957809)

 -- Andrea Righi <email address hidden> Thu, 13 Jan 2022 17:11:21 +0100

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-28.31

---------------
linux (5.13.0-28.31) impish; urgency=medium

  * amd_sfh: Null pointer dereference on early device init causes early panic
    and fails to boot (LP: #1956519)
    - HID: amd_sfh: Fix potential NULL pointer dereference

  * impish: ddebs build take too long and times out (LP: #1957810)
    - [Packaging] enforce xz compression for ddebs

  * audio mute/ mic mute are not working on a HP machine (LP: #1955691)
    - ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook

  * rtw88_8821ce causes freeze (LP: #1927808)
    - rtw88: Disable PCIe ASPM while doing NAPI poll on 8821CE

  * alsa/sdw: fix the audio sdw codec parsing logic in the acpi table
    (LP: #1955686)
    - ALSA: hda: intel-sdw-acpi: harden detection of controller
    - ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2

  * icmp_redirect from selftests fails on F/kvm (unary operator expected)
    (LP: #1938964)
    - selftests: icmp_redirect: pass xfail=0 to log_test()

  * Impish update: upstream stable patchset 2021-12-17 (LP: #1955180)
    - arm64: zynqmp: Do not duplicate flash partition label property
    - arm64: zynqmp: Fix serial compatible string
    - ARM: dts: sunxi: Fix OPPs node name
    - arm64: dts: allwinner: h5: Fix GPU thermal zone node name
    - arm64: dts: allwinner: a100: Fix thermal zone node name
    - staging: wfx: ensure IRQ is ready before enabling it
    - ARM: dts: NSP: Fix mpcore, mmc node names
    - scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
    - arm64: dts: rockchip: Disable CDN DP on Pinebook Pro
    - arm64: dts: hisilicon: fix arm,sp805 compatible string
    - RDMA/bnxt_re: Check if the vlan is valid before reporting
    - bus: ti-sysc: Add quirk handling for reinit on context lost
    - bus: ti-sysc: Use context lost quirk for otg
    - usb: musb: tusb6010: check return value after calling
      platform_get_resource()
    - usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
    - ARM: dts: ux500: Skomer regulator fixes
    - staging: rtl8723bs: remove possible deadlock when disconnect (v2)
    - ARM: BCM53016: Specify switch ports for Meraki MR32
    - arm64: dts: qcom: msm8998: Fix CPU/L2 idle state latency and residency
    - arm64: dts: qcom: ipq6018: Fix qcom,controlled-remotely property
    - arm64: dts: freescale: fix arm,sp805 compatible string
    - ASoC: SOF: Intel: hda-dai: fix potential locking issue
    - clk: imx: imx6ul: Move csi_sel mux to correct base register
    - ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
    - scsi: advansys: Fix kernel pointer leak
    - ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on ES8336
      codec
    - firmware_loader: fix pre-allocated buf built-in firmware use
    - ARM: dts: omap: fix gpmc,mux-add-data type
    - usb: host: ohci-tmio: check return value after calling
      platform_get_resource()
    - ARM: dts: ls1021a: move thermal-zones node out of soc/
    - ARM: dts: ls1021a-tsn: use generic "jedec,spi-nor" compatible for flash
    - ALSA: ISA: not for M68K
    - tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
    - MIPS: sni:...

Changed in linux (Ubuntu Impish):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton)
Changed in hwe-next:
status: New → Fix Released