[SRU] Fix amdgpu loading errors on Navi3x systems
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux-firmware (Ubuntu) | Status tracked in Resolute | | | |
| Noble | Fix Released | Undecided | Unassigned | |
| Plucky | Fix Released | Undecided | Unassigned | |
| Questing | Fix Released | Undecided | Unassigned | |
| Resolute | Fix Released | Undecided | Leo Lin | |
Bug Description
[ Impact ]
On Navi3x systems running the 6.17-oem kernel, the amdgpu driver fails to load because PSP firmware loading fails, producing the following error messages:
[ 627.871752] amdgpu 0000:03:00.0: amdgpu: PSP load kdb failed!
[ 628.056253] [drm:psp_
[ 628.056543] amdgpu 0000:03:00.0: amdgpu: PSP firmware loading failed
[ 628.056546] [drm:amdgpu_
[ 628.056777] amdgpu 0000:03:00.0: amdgpu: amdgpu_
[ 628.056779] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
[ 628.056781] amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
[ 628.056866] ------------[ cut here ]------------
[ 628.056867] WARNING: CPU: 8 PID: 2133 at drivers/
[ 628.057093] Modules linked in: amdgpu(+) amdxcp gpu_sched drm_panel_
[ 628.057124] snd_pcm_dmaengine coretemp snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm_intel snd_usb_audio snd_hda_codec snd_hda_core snd_usbmidi_lib snd_hwdep snd_seq_midi kvm snd_ump snd_seq_midi_event snd_rawmidi snd_pcm irqbypass polyval_clmulni mc cmdlinepart polyval_generic snd_seq ghash_clmulni_intel sha256_ssse3 spi_nor sha1_ssse3 snd_seq_device aesni_intel snd_timer mei_hdcp spd5118 mtd mei_pxp crypto_simd cryptd mfd_aaeon rapl asus_nb_wmi eeepc_wmi asus_wmi snd i2c_i801 sparse_keymap intel_cstate mei_me platform_profile wmi_bmof i2c_smbus spi_intel_pci i2c_mux soundcore spi_intel mei intel_pmc_core pmt_telemetry pmt_class intel_vsec acpi_pad acpi_tad mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid nvme nvme_core igc nvme_auth ahci intel_lpss_pci intel_lpss libahci idma64 vmd ucsi_acpi typec_ucsi typec video pinctrl_alderlake wmi
[ Fix ]
Update the following firmware to the latest:
- DMCUB:
amdgpu/
- GC:
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
amdgpu/
- PSP:
amdgpu/
amdgpu/
amdgpu/
amdgpu/
- SDMA:
amdgpu/
- SMU:
amdgpu/
amdgpu/
- VCN:
amdgpu/
[ Test ]
On a Navi3x system, boot into the graphical environment, SSH into the DUT, and load and unload the amdgpu module repeatedly (about 20 times):
1. sudo modprobe amdgpu
2. sudo modprobe -r amdgpu
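The two steps above can be wrapped in a small loop. This is only a sketch, not the script used for the SRU verification: the function names `reload_cycle` and `has_psp_errors` and the `MODPROBE` override are illustrative assumptions, and the log patterns are the two PSP messages quoted in the Impact section.

```shell
# Sketch of the load/unload test loop. MODPROBE is an illustrative override
# (not part of the original report) so the loop can be dry-run without root.

# reload_cycle [COUNT]: load and unload amdgpu COUNT times (default 20),
# stopping at the first failure.
reload_cycle() {
    count="${1:-20}"
    modprobe_cmd="${MODPROBE:-sudo modprobe}"
    i=1
    while [ "$i" -le "$count" ]; do
        $modprobe_cmd amdgpu || { echo "load failed on iteration $i" >&2; return 1; }
        $modprobe_cmd -r amdgpu || { echo "unload failed on iteration $i" >&2; return 1; }
        i=$((i + 1))
    done
    echo "completed $count load/unload cycles"
}

# has_psp_errors: read kernel log text on stdin and succeed if it contains
# either PSP failure message quoted in the Impact section.
has_psp_errors() {
    grep -q -E 'PSP (load kdb|firmware loading) failed'
}
```

On the DUT this could be run as `reload_cycle 20`, followed by `sudo dmesg | has_psp_errors && echo "PSP errors found"` to check whether the failure reappeared in the kernel log.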
[ Where the problem could occur ]
This should affect only GPUs that use the IP block versions covered by the firmware updated above.
[ Other Information ]
Relevant upstream commits:
- DMCUB
https:/
- GC
https:/
https:/
- PSP
https:/
https:/
https:/
- SDMA
https:/
- SMU
https:/
https:/
- VCN
https:/
| Changed in linux-firmware (Ubuntu): | |
| assignee: | nobody → Leo Lin (0xff07) |
| tags: | added: originate-from-2122662 |
| tags: | added: kernel-daily-bug |
| description: | updated |
| description: | updated |
| Changed in linux-firmware (Ubuntu Resolute): | |
| status: | New → Fix Released |
| Changed in linux-firmware (Ubuntu Plucky): | |
| status: | New → Fix Released |
| status: | Fix Released → Fix Committed |
| Changed in linux-firmware (Ubuntu Noble): | |
| status: | New → Fix Committed |
| Changed in linux-firmware (Ubuntu Questing): | |
| status: | New → Fix Committed |

We also need this for Q. Leo, can you please review: https://kernel.ubuntu.com/forgejo/kernel/linux-firmware/pulls/253