UVD firmware for AMD Southern Islands (GCN 1) GPUs is missing

Bug #1953249 reported by Ivo Cavalcante
This bug affects 11 people
Affects                    Status         Importance   Assigned to       Milestone
linux-firmware (Ubuntu)    Fix Released   Undecided    Unassigned
  Focal                    Fix Released   Undecided    Juerg Haefliger

Bug Description

[ Impact ]

AMD GPU not functional on Focal with HWE kernel.

[ Test Case ]

See original description below.

[ Fix ]

Cherry-pick relevant commit from upstream linux-firmware.

[ Where Problems Could Occur ]

Broken graphics with AMD GPUs.

[ Original Description ]

Release: up-to-date Focal LTS (20.04.3)
Package-version: linux-firmware 1.187.20
Hardware model: [AMD/ATI] Chelsea LP [Radeon HD 7730M]

With the latest kernel upgrade (5.4 --> 5.11, if I recall correctly), my laptop's discrete graphics stopped working. Looking at the logs, I found these messages:

-- snippet --
kernel: [ 1.492908] [drm] amdgpu: dpm initialized
kernel: [ 1.492932] [drm] AMDGPU Display Connectors
kernel: [ 1.492951] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/verde_uvd.bin failed with error -2
kernel: [ 1.492954] amdgpu 0000:01:00.0: amdgpu: amdgpu_uvd: Can't load firmware "amdgpu/verde_uvd.bin"
kernel: [ 1.492957] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -2
kernel: [ 1.493196] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: [ 1.493198] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
kernel: [ 1.493200] amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
-- snippet --
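
For anyone triaging a similar failure, the names of the missing files can be pulled out of the kernel log mechanically. This is only a minimal sketch and not part of the original report: the function name missing_firmware is made up for illustration, and it matches only the "Direct firmware load for ... failed" wording seen in the log above.

```shell
# Hypothetical helper: read kernel log lines on stdin and print each
# firmware file whose direct load failed, one per line, without duplicates.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error.*/\1/p' | sort -u
}

# Possible usage on an affected machine (not run here):
#   sudo dmesg | missing_firmware
#   journalctl -k -b | missing_firmware
```

The printed names are relative to /lib/firmware, so amdgpu/verde_uvd.bin corresponds to /lib/firmware/amdgpu/verde_uvd.bin.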

Indeed, the file '/lib/firmware/amdgpu/verde_uvd.bin' is missing. The output of 'dpkg -L linux-firmware | sort' lists the other verde files but no verde_uvd.bin:

-- snippet --
/lib/firmware/amdgpu/vegam_uvd.bin
/lib/firmware/amdgpu/vegam_vce.bin
/lib/firmware/amdgpu/verde_ce.bin
/lib/firmware/amdgpu/verde_k_smc.bin
/lib/firmware/amdgpu/verde_mc.bin
/lib/firmware/amdgpu/verde_me.bin
/lib/firmware/amdgpu/verde_pfp.bin
/lib/firmware/amdgpu/verde_rlc.bin
/lib/firmware/amdgpu/verde_smc.bin
/lib/firmware/amdgpu/yellow_carp_asd.bin
/lib/firmware/amdgpu/yellow_carp_ce.bin
-- snippet --
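
The same check can be done without reading the listing by eye. The sketch below is an illustration only: package_ships_firmware is a made-up name, and it assumes the package installs its files under /lib/firmware, as in the listing above.

```shell
# Hypothetical helper: succeed if the file list on stdin (for example the
# output of `dpkg -L linux-firmware`) contains the given firmware file.
# $1 is a path relative to /lib/firmware, e.g. amdgpu/verde_uvd.bin.
package_ships_firmware() {
    grep -Fxq "/lib/firmware/$1"
}

# Possible usage (not run here):
#   dpkg -L linux-firmware | package_ships_firmware amdgpu/verde_uvd.bin \
#       || echo "verde_uvd.bin is not shipped by the installed package"
```

Matching the whole line as a fixed string (grep -Fx) avoids false positives from files that merely share a prefix, such as verde_smc.bin and verde_k_smc.bin.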

Copying the file from upstream (https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu/verde_uvd.bin) didn't work on my system, probably because I use UEFI and the module wasn't signed (error below):

-- snippet --
kernel: [ 502.174932] amdgpu 0000:01:00.0: amdgpu: amdgpu_uvd: Can't validate firmware "amdgpu/verde_uvd.bin"
kernel: [ 502.174992] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -22
kernel: [ 502.175285] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: [ 502.175289] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
kernel: [ 502.175293] amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
-- snippet --

I can confirm that reverting the kernel driver to 'radeon' makes the device usable again, but at the expense of the Vulkan API, which is not supported with that driver.

Could you, please, fix this?

description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-firmware (Ubuntu):
status: New → Confirmed
Jean-Pierre van Riel (jpvr) wrote :

By coincidence, I worked on this same bug today, as I'm hoping to make use of the amdgpu-pro legacy OpenCL drivers, which need amdgpu as a base.

The issue is that Ubuntu provides newer HWE kernel stacks with amdgpu driver modules but fails to keep the related linux-firmware package up to date and in lockstep. E.g. 20.04.3 LTS is based on kernel 5.11 from hirsute (21.04), but linux-firmware is left at 1.187.20, which apparently came with kernel 5.4.

https://packages.ubuntu.com/hirsute-updates/linux-firmware shows version 1.197.3, so I'd expect an HWE stack on 20.04.3 to provide the same level of firmware for kernel 5.11.

> I use UEFI and the module wasn't signed

Perhaps installing the official deb from the Ubuntu archive, instead of copying the file from upstream, has a better shot at the firmware that gets added to the initramfs being signed properly (one would hope!). I managed the following workaround, but my system still uses legacy BIOS.

curl -OL http://archive.ubuntu.com/ubuntu/pool/main/l/linux-firmware/linux-firmware_1.197.3_all.deb
sudo dpkg -i linux-firmware_1.197.3_all.deb

Extra hint: consider downloading the deb from your closest mirror, because this package is almost 200 MB.

P.S. While most firmware is hopefully decoupled and backward compatible with older kernel modules/drivers, one never knows if a bug might show up when using very new firmware with much older modules, since that combination is an untested path. Hence I tried to match the Ubuntu release, kernel version, and tested-at-the-time firmware more closely rather than just jumping to the latest upstream firmware.

It's similar to, but a kind of upside-down version of, the proprietary graphics driver blob hell that plagued Linux over the previous decade. With firmware it is the other way round: the in-tree amdgpu driver needs the firmware to provide a stable interface. I can't recall if Linux ever solved the lack of a stable, versioned in-kernel ABI between drivers and binary driver blobs. That's why the proprietary AMD FGLRX driver blobs were always hell: AMD failed to keep up with ABI changes in the kernel, so the blobs often didn't work with newer kernels. Hence amdgpu became the new middle layer/foundation to bridge this ABI gap and create a stable, presumably versioned ABI for AMD that the kernel developers never bothered to provide. Now the AMD-pro software and extended driver blobs can rely on the open-source amdgpu parts being less of a moving target, because amdgpu is in-tree: ABI changes in the kernel core have to include the required changes to amdgpu, and amdgpu should be tested in lockstep with other kernel changes.

Juerg Haefliger (juergh)
Changed in linux-firmware (Ubuntu Focal):
status: New → Triaged
assignee: nobody → Juerg Haefliger (juergh)
Juerg Haefliger (juergh)
description: updated
Changed in linux-firmware (Ubuntu Focal):
status: Triaged → In Progress
Juerg Haefliger (juergh)
Changed in linux-firmware (Ubuntu Focal):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Ivo, or anyone else affected,

Accepted linux-firmware into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/1.187.24 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Mathew Hodson (mhodson)
Changed in linux-firmware (Ubuntu):
status: Confirmed → Fix Released
Ivo Cavalcante (ivo-cavalcante) wrote :

Sorry guys, I've been busy. I tested the package from -proposed and it works just fine. Thanks for fixing!

Yuan-Chen Cheng (ycheng-twn) wrote :

Given comment #4, updating the status tag.

tags: added: verification-done verification-done-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 1.187.24

---------------
linux-firmware (1.187.24) focal; urgency=medium

  * amdgpu: add UVD firmware for SI asics (LP: #1953249)
  * linux-firmware: update frimware for mediatek bluetooth chip (MT7921) (LP: #1954300)

 -- Juerg Haefliger <email address hidden> Thu, 16 Dec 2021 16:01:58 +0100
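
To see whether a given Focal system already has the fixed package, the installed version can be compared against 1.187.24. The following is a sketch under assumptions, not an official check: version_at_least is a made-up helper, and its fallback branch uses GNU sort -V, which only approximates Debian version ordering and is adequate for plain dotted versions like the ones in this report.

```shell
# Hypothetical helper: version_at_least HAVE WANT succeeds if HAVE >= WANT.
version_at_least() {
    if command -v dpkg >/dev/null 2>&1; then
        # Preferred: dpkg implements the real Debian version ordering.
        dpkg --compare-versions "$1" ge "$2"
    else
        # Fallback approximation: the smaller of the two must be WANT.
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
    fi
}

# Possible usage on Focal (not run here):
#   installed=$(dpkg-query -W -f='${Version}' linux-firmware)
#   version_at_least "$installed" 1.187.24 && echo "fix is installed"
```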

Changed in linux-firmware (Ubuntu Focal):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton) wrote : Update Released

The verification of the Stable Release Update for linux-firmware has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Carlos Vieira (cavvieira) wrote :

Sorry for the earlier edits; I don't know how to indicate that this also affects 22.04 with the 6.5 security-fix kernel. Here's what I'm getting:

Setting up linux-image-oem-22.04d (6.5.0.1004.4) ...
Processing triggers for linux-image-6.5.0-1004-oem (6.5.0-1004.4) ...
/etc/kernel/postinst.d/dkms:
 * dkms: running auto installation service for kernel 6.5.0-1004-oem
   ...done.
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-6.5.0-1004-oem
W: Possible missing firmware /lib/firmware/amdgpu/ip_discovery.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/vega10_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/navi12_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/psp_13_0_6_ta.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/psp_13_0_6_sos.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/psp_13_0_10_ta.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/psp_13_0_10_sos.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/aldebaran_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/aldebaran_sjt_mec2.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/aldebaran_sjt_mec.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_9_4_3_rlc.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_9_4_3_mec.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_imu.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_rlc.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_mec.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_me.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_pfp.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_0_toc.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sdma_4_4_2.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sdma_6_0_3.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_mes1.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/navi10_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_mes1.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_mes_2.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/vcn_4_0_3.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/smu_13_0_10.bin for module amdgpu
/etc/kernel/postinst.d/zz-update-grub:
Sourcing file `/etc/default/grub'
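
A long warning list like this is easier to act on once reduced to bare paths. The helper below is a sketch with a made-up name (warned_firmware); it assumes the exact "W: Possible missing firmware ... for module ..." wording shown above. Note that these warnings come from the firmware names the module declares, so a warning alone does not show that the installed GPU needs that file.

```shell
# Hypothetical helper: read update-initramfs output on stdin and print the
# firmware paths named in its "Possible missing firmware" warnings.
warned_firmware() {
    sed -n 's/^W: Possible missing firmware \([^ ]*\) for module .*/\1/p'
}

# Possible usage (not run here); the warnings may go to stderr, hence 2>&1:
#   sudo update-initramfs -u 2>&1 | warned_firmware
```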

no longer affects: linux-signed-oem-6.5 (Ubuntu Focal)
no longer affects: linux-signed-oem-6.5 (Ubuntu)