AMD

Yellow Carp S0i3 stability fix

Bug #1945348 reported by Mario Limonciello
This bug affects 1 person
Affects                   Status         Importance   Assigned to   Milestone
AMD                       Fix Released   Undecided    Alex Hung
linux-oem-5.14 (Ubuntu)   Invalid        Undecided    Unassigned
Focal                     Fix Released   Undecided    Unassigned

Bug Description

[Impact]

  The errors below are reported with S0i3 on Yellow Carp under stress testing with kernels 5.14.0 through 5.14.8.

  [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
  amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
  PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
  amdgpu 0000:04:00.0: PM: failed to resume async: error -110
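  The signature above can be searched for in a saved kernel log. The following is a minimal sketch, not part of the original report: the function name `find_resume_failures` and the regular expression are illustrative assumptions, built only from the log lines quoted here. The value -110 corresponds to -ETIMEDOUT on Linux.

```python
import re

# Failure signature taken from the first log line quoted in this report.
RESUME_FAILURE = re.compile(
    r"resume of IP block <(?P<block>[^>]+)> failed (?P<code>-\d+)"
)

# 110 is ETIMEDOUT on Linux; hard-coded so the sketch does not depend on
# the errno table of the machine doing the analysis.
ERRNO_NAMES = {110: "ETIMEDOUT"}


def find_resume_failures(log_text):
    """Return (ip_block, error_code, errno_name) for each failed IP block resume."""
    failures = []
    for line in log_text.splitlines():
        match = RESUME_FAILURE.search(line)
        if match:
            code = int(match.group("code"))
            failures.append(
                (match.group("block"), code, ERRNO_NAMES.get(-code, "unknown"))
            )
    return failures
```

  The function takes the log as a string, for example the contents of a file holding captured `dmesg` output.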

[Fix]

  The patch fixes this by forcing an exit from gfxoff on sdma resume.

  The patch is in 5.15-rc4 (https://github.com/torvalds/linux/commit/26db706a6d77b9e184feb11725e97e53b7a89519)

[Test]

  This is requested by AMD.
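  No test procedure is recorded in this report. One possible way to exercise repeated suspend/resume cycles is sketched below, under several assumptions: `rtcwake` from util-linux is installed and run as root, `rtcwake -m freeze` (suspend-to-idle) actually reaches S0i3 on the platform, and the 10-second wake timer is arbitrary. The function names are hypothetical and this is not the stress test AMD used.

```python
import subprocess

# Substring of the failure line quoted in this report.
FAILURE_MARKER = "resume of IP block <sdma_v5_2> failed"


def suspend_once(seconds=10):
    """Enter suspend-to-idle and wake via RTC alarm (assumes rtcwake and root)."""
    subprocess.run(["rtcwake", "-m", "freeze", "-s", str(seconds)], check=True)


def read_kernel_log():
    """Return the kernel ring buffer as text (assumes dmesg is readable)."""
    result = subprocess.run(
        ["dmesg"], check=True, capture_output=True, text=True
    )
    return result.stdout


def stress_s0i3(cycles, suspend=suspend_once, read_log=read_kernel_log):
    """Run suspend/resume cycles; return the 1-based cycle of the first failure, or None.

    The whole log is checked each time, so a failure already present before
    the run starts is reported at cycle 1.
    """
    for cycle in range(1, cycles + 1):
        suspend()
        if FAILURE_MARKER in read_log():
            return cycle
    return None
```

  The `suspend` and `read_log` parameters are injectable so that the loop can be dry-run without suspending the machine.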

[Where problems could occur]

  Low risk. This only affects AMD platforms with s0ix support. The changes repeat what is (or should be) done in firmware.

===== original descriptions =====

A problem has been identified with S0i3 on Yellow Carp: under stress testing with kernels 5.14.0 through 5.14.8, resume sometimes fails.

It will manifest as:

[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110

The fix for this will go into 5.15-rcX and also into 5.14.y. This report is a pointer for the Canonical team, in case the issue comes up, on where to expect the fix for the OEM kernel.
The patch going into 5.15-rcX and 5.14.y is: https://lists.freedesktop.org/archives/amd-gfx/2021-September/069451.html

Mario Limonciello (superm1) wrote :

It's in 5.15-rc4, can you please pick it up for OEM kernel?
https://github.com/torvalds/linux/commit/26db706a6d77b9e184feb11725e97e53b7a89519

Alex Hung (alexhung)
description: updated
information type: Proprietary → Public
Changed in amd:
assignee: nobody → Alex Hung (alexhung)
Alex Hung (alexhung) wrote :
Alex Hung (alexhung) wrote :
Changed in amd:
status: New → Fix Released
status: Fix Released → Fix Committed
Alex Hung (alexhung)
Changed in linux-oem-5.14 (Ubuntu):
status: New → Fix Committed
Alex Hung (alexhung)
Changed in linux-oem-5.14 (Ubuntu):
status: Fix Committed → Invalid
Changed in linux-oem-5.14 (Ubuntu Focal):
status: New → Fix Committed
Timo Aaltonen (tjaalton)
Changed in linux-oem-5.14 (Ubuntu Focal):
status: Fix Committed → Fix Released
Alex Hung (alexhung)
Changed in amd:
status: Fix Committed → Fix Released