AMD

Yellow Carp S0i3 stability fix

Bug #1945348 reported by Mario Limonciello
Affects                   Status         Importance   Assigned to   Milestone
AMD                       Fix Released   Undecided    Alex Hung
linux-oem-5.14 (Ubuntu)   Invalid        Undecided    Unassigned
  Focal                   Fix Released   Undecided    Unassigned

Bug Description

[Impact]

  The errors below are sometimes reported on S0i3 resume on Yellow Carp under stress testing with kernels 5.14.0 through 5.14.8.

  [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
  amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
  PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
  amdgpu 0000:04:00.0: PM: failed to resume async: error -110
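
  When going through stress-test logs, the failure signature above can be picked out mechanically. The following is only a minimal sketch: the function name `find_resume_failures` and the regular expression are illustrative assumptions and not part of any AMD or Canonical tooling. It matches the "resume of IP block" line and returns the IP block name and error code (-110 is -ETIMEDOUT on Linux).

```python
import re

# Pattern for the failing line quoted in this report.
# Hypothetical helper; the regex is derived only from the log text above.
RESUME_FAILURE = re.compile(
    r"resume of IP block <(?P<block>[^>]+)> failed (?P<err>-\d+)"
)


def find_resume_failures(log_text):
    """Return (ip_block, error_code) pairs for each failed IP block resume."""
    return [
        (m.group("block"), int(m.group("err")))
        for m in RESUME_FAILURE.finditer(log_text)
    ]


# Two of the lines from this report, used as sample input.
sample = """\
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
"""

print(find_resume_failures(sample))  # [('sdma_v5_2', -110)]
```

  In practice the input would be the captured kernel log from a suspend/resume stress run rather than an inline string.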

[Fix]

  The patch fixes this by forcing an exit from GFXOFF on SDMA resume.

  The patch is in 5.15-rc4 (https://github.com/torvalds/linux/commit/26db706a6d77b9e184feb11725e97e53b7a89519)

[Test]

  This is requested by AMD.

[Where problems could occur]

  Low risk. This only affects AMD platforms with s0ix support. The change repeats what is (or should be) done in firmware.

===== original descriptions =====

A problem has been identified with S0i3 on Yellow Carp: under stress testing with 5.14.0 through 5.14.8, resume sometimes fails.

It will manifest as:

[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110

The fix will be going into 5.15-rcX and also into 5.14.y. This report is a pointer for the Canonical team so that, if this failure comes up, they know where to expect the fix for the OEM kernel.
The patch going into 5.15-rcX and 5.14.y is: https://lists.freedesktop.org/archives/amd-gfx/2021-September/069451.html

Mario Limonciello (superm1) wrote :

It's in 5.15-rc4, can you please pick it up for OEM kernel?
https://github.com/torvalds/linux/commit/26db706a6d77b9e184feb11725e97e53b7a89519

Alex Hung (alexhung)
description: updated
information type: Proprietary → Public
Changed in amd:
assignee: nobody → Alex Hung (alexhung)
Changed in amd:
status: New → Fix Released
status: Fix Released → Fix Committed
Alex Hung (alexhung)
Changed in linux-oem-5.14 (Ubuntu):
status: New → Fix Committed
Alex Hung (alexhung)
Changed in linux-oem-5.14 (Ubuntu):
status: Fix Committed → Invalid
Changed in linux-oem-5.14 (Ubuntu Focal):
status: New → Fix Committed
Timo Aaltonen (tjaalton)
Changed in linux-oem-5.14 (Ubuntu Focal):
status: Fix Committed → Fix Released
Alex Hung (alexhung)
Changed in amd:
status: Fix Committed → Fix Released