AMD

Activity log for bug #1945348

Date Who What changed Old value New value Message
2021-09-28 16:50:40 Mario Limonciello bug added bug
2021-10-15 19:57:41 Alex Hung description

Old value:
A problem is identified with S0i3 on Yellow carp where under stress testing with 5.14.0 through 5.14.8 sometimes there will be failures. It will manifest as:
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110
The fix for this will be going into 5.15-rcX and also to 5.14.y. Just want to give a pointer to Canonical team that if this comes up where to expect the fix for OEM kernel.
The patch going into 5.15-rcX and 5.14.y is: https://lists.freedesktop.org/archives/amd-gfx/2021-September/069451.html

New value:
[Impact]
Below errors are reported with S0i3 on Yellow carp where under stress testing with 5.14.0 through 5.14.8.
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110
[Fix]
The patch fixes this by forcing exit gfxoff for sdma resume. The patch is in 5.15-rc4 (https://github.com/torvalds/linux/commit/26db706a6d77b9e184feb11725e97e53b7a89519)
[Test]
This is requested by AMD.
[Where problems could occur]
Low risk. This only affects AMD platforms with s0ix supports. The changes repeat what is (should be) done in firmware.
===== original descriptions =====
A problem is identified with S0i3 on Yellow carp where under stress testing with 5.14.0 through 5.14.8 sometimes there will be failures. It will manifest as:
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110
The fix for this will be going into 5.15-rcX and also to 5.14.y. Just want to give a pointer to Canonical team that if this comes up where to expect the fix for OEM kernel.
The patch going into 5.15-rcX and 5.14.y is: https://lists.freedesktop.org/archives/amd-gfx/2021-September/069451.html
2021-10-15 19:57:45 Alex Hung information type Proprietary Public
2021-10-15 19:58:14 Alex Hung amd: assignee Alex Hung (alexhung)
2021-10-25 17:33:49 Alex Hung amd: status New Fix Released
2021-10-25 17:33:54 Alex Hung amd: status Fix Released Fix Committed
2021-10-29 23:48:34 Alex Hung bug task added linux-oem-5.14 (Ubuntu)
2021-10-29 23:48:50 Alex Hung linux-oem-5.14 (Ubuntu): status New Fix Committed
2021-10-29 23:57:21 Alex Hung nominated for series Ubuntu Focal
2021-10-29 23:57:21 Alex Hung bug task added linux-oem-5.14 (Ubuntu Focal)
2021-10-29 23:57:29 Alex Hung linux-oem-5.14 (Ubuntu): status Fix Committed Invalid
2021-10-29 23:57:33 Alex Hung linux-oem-5.14 (Ubuntu Focal): status New Fix Committed
2021-12-13 12:35:17 Timo Aaltonen linux-oem-5.14 (Ubuntu Focal): status Fix Committed Fix Released
2022-01-03 18:39:23 Alex Hung amd: status Fix Committed Fix Released
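
The bug description identifies the failure by four kernel log messages. When checking a stress-test log for this signature, a small scan along the following lines may help. This is only a sketch: the function name find_resume_failures and the SAMPLE_LOG text are hypothetical, and the patterns are taken directly from the messages quoted in the description, so they may not match kernels that word these messages differently.

```python
import re

# Patterns derived from the error messages quoted in the bug description.
# The symbol offset after pci_pm_resume is matched loosely (assumption:
# it can differ between kernel builds).
FAILURE_PATTERNS = [
    re.compile(r"resume of IP block <sdma_v5_2> failed -110"),
    re.compile(r"amdgpu_device_ip_resume failed \(-110\)"),
    re.compile(r"pci_pm_resume\S* returns -110"),
    re.compile(r"PM: failed to resume async: error -110"),
]


def find_resume_failures(log_text):
    """Return the log lines that match any known failure message."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]


# Hypothetical sample: the four lines from the report plus one unrelated line.
SAMPLE_LOG = """\
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v5_2> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
amdgpu 0000:04:00.0: PM: failed to resume async: error -110
usb 1-1: new high-speed USB device number 2 using xhci_hcd
"""

matches = find_resume_failures(SAMPLE_LOG)
for line in matches:
    print(line)
```

In practice the text passed to find_resume_failures would come from a saved kernel log captured after each suspend/resume cycle; an empty result only means these exact messages were not found.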