Activity log for bug #2051636

Date Who What changed Old value New value Message
2024-01-30 06:20:58 You-Sheng Yang bug added bug
2024-01-30 06:21:08 You-Sheng Yang nominated for series Ubuntu Mantic
2024-01-30 06:21:08 You-Sheng Yang bug task added linux-firmware (Ubuntu Mantic)
2024-01-30 06:21:08 You-Sheng Yang nominated for series Ubuntu Jammy
2024-01-30 06:21:08 You-Sheng Yang bug task added linux-firmware (Ubuntu Jammy)
2024-01-30 06:21:08 You-Sheng Yang nominated for series Ubuntu Noble
2024-01-30 06:21:08 You-Sheng Yang bug task added linux-firmware (Ubuntu Noble)
2024-01-30 07:33:31 You-Sheng Yang summary AMD phenix/phenix2 platforms facing amdgpu(PHX) hangs during stress loading AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading
2024-01-30 07:53:31 You-Sheng Yang linux-firmware (Ubuntu Jammy): status New In Progress
2024-01-30 07:53:32 You-Sheng Yang linux-firmware (Ubuntu Mantic): status New In Progress
2024-01-30 07:53:39 You-Sheng Yang linux-firmware (Ubuntu Noble): status New Incomplete
2024-01-30 07:53:47 You-Sheng Yang linux-firmware (Ubuntu Noble): status Incomplete Triaged
2024-01-30 07:53:51 You-Sheng Yang linux-firmware (Ubuntu Jammy): importance Undecided High
2024-01-30 07:53:53 You-Sheng Yang linux-firmware (Ubuntu Mantic): importance Undecided High
2024-01-30 07:53:55 You-Sheng Yang linux-firmware (Ubuntu Jammy): assignee You-Sheng Yang (vicamo)
2024-01-30 07:53:57 You-Sheng Yang linux-firmware (Ubuntu Mantic): assignee You-Sheng Yang (vicamo)
2024-01-30 07:53:59 You-Sheng Yang linux-firmware (Ubuntu Noble): assignee You-Sheng Yang (vicamo)
2024-01-30 08:58:08 You-Sheng Yang description
Old value:
With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs within a few minutes or sometimes even quicker.
Also using mantic + v6.7 hit the hang, so need to update new FWs to fix this issue.
PHX series
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85

[ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
[ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
[ 415.783004] amdgpu 0000:0d:00.0: amdgpu: GPU reset begin!
[ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.996743] amdgpu 0000:0d:00.0: amdgpu: MODE2 reset
[ 417.026498] amdgpu 0000:0d:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 417.026909] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[ 417.027149] amdgpu 0000:0d:00.0: amdgpu: SMU is resuming...
[ 417.029520] amdgpu 0000:0d:00.0: amdgpu: SMU is resumed successfully!
[ 417.032154] [drm] DMUB hardware initialized: version=0x08003000
[ 417.190837] [drm] kiq ring mec 3 pipe 1 q 0
[ 417.192870] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 417.193037] amdgpu 0000:0d:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 417.193447] amdgpu 0000:0d:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 417.193449] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 417.193451] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 417.193452] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 417.193453] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 417.193454] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 417.193455] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 417.193456] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 417.193458] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 417.193459] amdgpu 0000:0d:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 417.193460] amdgpu 0000:0d:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 417.193461] amdgpu 0000:0d:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 417.193462] amdgpu 0000:0d:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 417.195893] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow start
[ 417.195894] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow done
[ 417.195904] amdgpu 0000:0d:00.0: amdgpu: GPU reset(2) succeeded!
[ 417.197048] [drm] Skip scheduling IBs!
[ 417.197057] [drm] Skip scheduling IBs!
[ 417.197063] [drm] Skip scheduling IBs!
[ 443.578688] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

New value:
[SRU Justification]

[Impact]
With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs within a few minutes or sometimes even quicker

[Fix]
Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and other prerequisites:
* amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 firmware")
* amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 firmware")
* amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 firmware")
* amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 firmware")
* amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 firmware")

[Test Case]
Run stress tool like 3DMark or GravityMark.

[Where problems could occur]
Binary firmware update recommended by chip vendor. No known issue so far.

[Other Info]
Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy is also nominated for fix.

========== original bug report ==========
With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs within a few minutes or sometimes even quicker.
Also using mantic + v6.7 hit the hang, so need to update new FWs to fix this issue.
PHX series
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85

[ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
[ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
[ 415.783004] amdgpu 0000:0d:00.0: amdgpu: GPU reset begin!
[ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 416.996743] amdgpu 0000:0d:00.0: amdgpu: MODE2 reset
[ 417.026498] amdgpu 0000:0d:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 417.026909] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[ 417.027149] amdgpu 0000:0d:00.0: amdgpu: SMU is resuming...
[ 417.029520] amdgpu 0000:0d:00.0: amdgpu: SMU is resumed successfully!
[ 417.032154] [drm] DMUB hardware initialized: version=0x08003000
[ 417.190837] [drm] kiq ring mec 3 pipe 1 q 0
[ 417.192870] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 417.193037] amdgpu 0000:0d:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 417.193447] amdgpu 0000:0d:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 417.193449] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 417.193451] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 417.193452] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 417.193453] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 417.193454] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 417.193455] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 417.193456] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 417.193458] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 417.193459] amdgpu 0000:0d:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 417.193460] amdgpu 0000:0d:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 417.193461] amdgpu 0000:0d:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 417.193462] amdgpu 0000:0d:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 417.195893] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow start
[ 417.195894] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow done
[ 417.195904] amdgpu 0000:0d:00.0: amdgpu: GPU reset(2) succeeded!
[ 417.197048] [drm] Skip scheduling IBs!
[ 417.197057] [drm] Skip scheduling IBs!
[ 417.197063] [drm] Skip scheduling IBs!
[ 443.578688] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
2024-01-30 09:00:04 You-Sheng Yang bug added subscriber Canonical Hardware Enablement
2024-01-30 09:00:12 You-Sheng Yang tags amd oem-priority originate-from-2051539
2024-01-31 06:32:18 Juerg Haefliger tags amd oem-priority originate-from-2051539 amd kern-9038 oem-priority originate-from-2051539
2024-01-31 06:32:25 Juerg Haefliger bug added subscriber Juerg Haefliger
2024-02-08 14:07:01 Timo Aaltonen linux-firmware (Ubuntu Noble): status Triaged Fix Released
2024-02-09 16:10:57 Timo Aaltonen linux-firmware (Ubuntu Mantic): status In Progress Fix Committed
2024-02-09 16:10:59 Timo Aaltonen bug added subscriber Ubuntu Stable Release Updates Team
2024-02-09 16:11:02 Timo Aaltonen bug added subscriber SRU Verification
2024-02-09 16:13:23 Timo Aaltonen linux-firmware (Ubuntu Jammy): status In Progress Fix Committed
2024-02-15 15:48:25 Mario Limonciello tags amd kern-9038 oem-priority originate-from-2051539 amd kern-9038 oem-priority originate-from-2051539 verification-done-jammy
2024-02-29 07:43:07 Timo Aaltonen tags amd kern-9038 oem-priority originate-from-2051539 verification-done-jammy amd kern-9038 oem-priority originate-from-2051539 verification-done-jammy verification-needed-mantic
2024-03-04 08:05:17 You-Sheng Yang tags amd kern-9038 oem-priority originate-from-2051539 verification-done-jammy verification-needed-mantic amd kern-9038 oem-priority originate-from-2051539 verification-done-jammy verification-done-mantic
2024-03-08 04:01:47 Timo Aaltonen removed subscriber Ubuntu Stable Release Updates Team
2024-03-08 04:28:12 Launchpad Janitor linux-firmware (Ubuntu Jammy): status Fix Committed Fix Released
2024-03-11 15:38:29 Launchpad Janitor linux-firmware (Ubuntu Mantic): status Fix Committed Fix Released