amdgpu fails to init with kernel 5.19.0

Bug #2007633 reported by Jeff Schmidt
This bug affects 9 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Update 2023-03-04: A newer 5.19 kernel, 5.19.0-35-generic, now boots into a GUI desktop instead of leaving me with a completely unusable system. However, there is still no working Vulkan implementation: vulkaninfo reports only the llvmpipe software (CPU) device. It looks as if the amdgpu driver still fails to load and the new kernel falls back to the radeon driver, though that is only my guess.

Partial output from vulkaninfo with 5.19 kernel:

------

Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
        apiVersion = 4206816 (1.3.224)
        driverVersion = 1 (0x0001)
        vendorID = 0x10005
        deviceID = 0x0000
        deviceType = PHYSICAL_DEVICE_TYPE_CPU
        deviceName = llvmpipe (LLVM 15.0.6, 256 bits)
        pipelineCacheUUID = 76616c2d-2573-0000-0000-000000000000

END of Update 2023-03-04
------------
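
To check for this symptom without reading the whole vulkaninfo dump, something like the following could be used. It is only a sketch: the helper name vulkan_software_only is made up for illustration, and it assumes the deviceType lines are formatted as in the excerpt above.

```shell
# Hypothetical helper (name is an assumption, not part of any tool):
# succeeds only if every deviceType line in vulkaninfo-style text on stdin
# is PHYSICAL_DEVICE_TYPE_CPU, i.e. only software rendering is available.
vulkan_software_only() {
    # Collect the distinct deviceType values that were reported.
    types=$(grep -E 'deviceType[[:space:]]*=' \
        | sed -E -e 's/.*=[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | sort -u)
    [ "$types" = "PHYSICAL_DEVICE_TYPE_CPU" ]
}

# Example use on an affected machine:
#   vulkaninfo | vulkan_software_only && echo "only a CPU (llvmpipe-style) device found"
```

If a real GPU device is listed alongside llvmpipe, the helper returns a non-zero status, so it only flags the case shown above where the software device is the sole one.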

I recently installed apt upgrades, which included the 5.19.0-32-generic kernel. I still have the 5.15.0-58-generic kernel installed as well.

Every time I try to boot with the 5.19 kernel, my AMD Radeon R9 280 GPU fails to initialize. As a result, the computer effectively hangs partway through boot, because the graphical session (X or Wayland, whichever should start) never comes up. The kernel itself keeps running: the hard drive light on my case occasionally flashes, and the kernel logs continue past the point of the apparent hang. However, kernel messages stop being printed to the console, so to see them I have to reboot with 5.15 and use journalctl to review the previous boot's messages.

The GPU initializes fine with the older kernel. I am not sure what the root cause of the failure is. I found the following in the kernel boot logs (using "journalctl -b -1" to get the logs for the previous, failed boot), which I think indicates the problem:

Feb 16 21:37:17 jeff-linux kernel: status:
Feb 16 21:37:17 jeff-linux kernel: [drm] amdgpu: dpm initialized
Feb 16 21:37:17 jeff-linux kernel: [drm] Found UVD firmware Version: 64.0 Family ID: 13
Feb 16 21:37:17 jeff-linux kernel: amdgpu: Move buffer fallback to memcpy unavailable
Feb 16 21:37:17 jeff-linux kernel: [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -19
Feb 16 21:37:17 jeff-linux kernel: amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
Feb 16 21:37:17 jeff-linux kernel: amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
Feb 16 21:37:17 jeff-linux kernel: amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
Feb 16 21:37:17 jeff-linux kernel: BUG: kernel NULL pointer dereference, address: 0000000000000090
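
For anyone else trying to confirm the same failure, the relevant lines can be pulled out of a previous boot's kernel log with a small filter. This is a rough sketch, not an official diagnostic: the helper name amdgpu_init_errors is invented here, and the patterns are taken only from the messages quoted above. The -19 in the sw_init line corresponds to -ENODEV on Linux.

```shell
# Hypothetical helper (name is an assumption): keep only the amdgpu
# init-failure lines from kernel log text supplied on stdin.
amdgpu_init_errors() {
    grep -E 'sw_init of IP block <[^>]+> failed|amdgpu_device_ip_init failed|Fatal error during GPU init'
}

# Example use after rebooting into a working kernel:
#   journalctl -k -b -1 | amdgpu_init_errors
# (-k limits output to kernel messages, -b -1 selects the previous boot)
```

An empty result does not prove the driver loaded correctly; it only means none of these three messages appeared in that boot's log.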

----------------------
version signature info (line 2 from kernel boot log):

Feb 16 21:37:14 jeff-linux kernel: Linux version 5.19.0-32-generic (buildd@lcy02-amd64-026) (x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Jan 30 17:03:34 UTC 2 (Ubuntu 5.19.0-32.33~22.04.1-generic 5.19.17)

---------------------

root@jeff-linux:/home/jeff# lsb_release -rd
Description: Ubuntu 22.04.1 LTS
Release: 22.04

See attached lspci-vnvn.log
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu82.3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: jeff 1525 F.... pulseaudio
 /dev/snd/controlC0: jeff 1525 F.... pulseaudio
 /dev/snd/controlC1: jeff 1525 F.... pulseaudio
CRDA: N/A
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 22.04
InstallationDate: Installed on 2022-02-27 (355 days ago)
InstallationMedia: Ubuntu 21.10 "Impish Indri" - Release amd64 (20211012)
IwConfig:
 lo no wireless extensions.

 eno1 no wireless extensions.
MachineType: System manufacturer System Product Name
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic root=UUID=bd9a6ac0-c013-4645-a8d7-23af9e13d239 ro quiet splash radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 amdgpu.dc=1 amdgpu.dpm=1 amdgpu.modeset=1 vt.handoff=7
ProcVersionSignature: Ubuntu 5.15.0-58.64-generic 5.15.74
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-58-generic N/A
 linux-backports-modules-5.15.0-58-generic N/A
 linux-firmware 20220329.git681281e4-0ubuntu3.10
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
Tags: jammy wayland-session
Uname: Linux 5.15.0-58-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 08/06/2014
dmi.bios.release: 4.6
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1302
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: P8Q77-M
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1302:bd08/06/2014:br4.6:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnP8Q77-M:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:skuSKU:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Jeff Schmidt (jsbiff) wrote :

Attaching the output of journalctl -b -1 as an xz-compressed file.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2007633

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Jeff Schmidt (jsbiff) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected jammy wayland-session
description: updated

Jeff Schmidt (jsbiff) wrote : CurrentDmesg.txt
Jeff Schmidt (jsbiff) wrote : Lspci.txt
Jeff Schmidt (jsbiff) wrote : Lspci-vt.txt
Jeff Schmidt (jsbiff) wrote : Lsusb.txt
Jeff Schmidt (jsbiff) wrote : Lsusb-t.txt
Jeff Schmidt (jsbiff) wrote : Lsusb-v.txt
Jeff Schmidt (jsbiff) wrote : PaInfo.txt
Jeff Schmidt (jsbiff) wrote : ProcCpuinfo.txt
Jeff Schmidt (jsbiff) wrote : ProcCpuinfoMinimal.txt
Jeff Schmidt (jsbiff) wrote : ProcInterrupts.txt
Jeff Schmidt (jsbiff) wrote : ProcModules.txt
Jeff Schmidt (jsbiff) wrote : PulseList.txt
Jeff Schmidt (jsbiff) wrote : UdevDb.txt
Jeff Schmidt (jsbiff) wrote : WifiSyslog.txt
Jeff Schmidt (jsbiff) wrote : acpidump.txt

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Douglas Wiedemann (ventman) wrote :

I'm having the same problem with one of my systems. Same family of GPU, Tahiti.
This bug https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1953249 is very similar to what is happening. However, I'm able to get all the way to an X session, but functionality of the card is affected (can't change display refresh rate). Something must have changed, because the machine has the relevant firmware files in /lib/firmware/amdgpu.

grogd (abitgroggy) wrote :

I too see the same issue as OP with 5.19.0-32-generic. Running 22.04.2 LTS with Kaveri. Here are some relevant messages:

From syslog:

Feb 21 12:00:26 mythtv kernel: [ 4.725116] [drm] Loading kaveri Microcode
...
Feb 21 12:00:26 mythtv kernel: [ 4.762659] [drm] Found UVD firmware Version: 1.64 Family ID: 9

From dmesg:

[ 7.619038] [drm] Found UVD firmware Version: 1.64 Family ID: 9
[ 7.619095] amdgpu: Move buffer fallback to memcpy unavailable
[ 7.619111] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* sw_init of IP block <uvd_v4_2> failed -19
[ 7.619626] amdgpu 0000:00:01.0: amdgpu: amdgpu_device_ip_init failed
[ 7.619632] amdgpu 0000:00:01.0: amdgpu: Fatal error during GPU init
[ 7.619637] amdgpu 0000:00:01.0: amdgpu: amdgpu: finishing device.

It goes on to load a generic display at 640x480 resolution. Rebooting then hangs and requires a reset. Booting Ubuntu 5.15.0-60.66-generic 5.15.78 instead works with no problems.

Jeff Schmidt (jsbiff)
description: updated
grogd (abitgroggy) wrote :

I see no improvement with 5.19.0-35 versus 5.19.0-32.

Still in syslog with 5.19.0-35:

Mar 4 11:14:02 mythtv kernel: [ 8.008470] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* sw_init of IP block <uvd_v4_2> failed -19

sniglom (sniglom) wrote :

I'm having the same problem with my 290x (Hawaii).

grogd (abitgroggy) wrote :

I see that the error "amdgpu: Move buffer fallback to memcpy unavailable" comes from drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c. I downloaded the Ubuntu source code for 5.19.0-35 and compiled it with the amdgpu_ttm.c from the Linux 5.19 snapshot on GitHub. This eliminated the error, and my Kaveri machine now boots successfully with amdgpu, if that helps at all.

My Cezanne 5600G, on the other hand, works fine with the stock 5.19.0-35.

Terence Tan (terence-tan) wrote :

I had a similar problem using a 5.19-series kernel build on a "Mullins"-class AMD APU:

[drm:amdgpu_device_init.cold] *ERROR* sw_init of IP block <uvd_v4_2> failed -19

I resolved the problem by cherry-picking commit 8273b4048664fff356fd10059033f0e2f5a422a1 ("drm/amdgpu: Fix for BO move issue").

I'd be interested to hear if this helps anybody else.

grogd (abitgroggy) wrote :

Yes, cherry-picking the commit "Fix for BO move issue" worked for me too, using jammy-5.19.0-35.36.
