Comment 23 for bug 2036742

Mario Limonciello (superm1) wrote : Re: amdgpu crash on Mantic

To explain some of the differences in the logs:

* UVD is the IP block that fails to init (so no "UVD and UVD ENC initialized successfully").
* Once an IP block fails, the next one isn't even tried (so no "VCE initialized successfully").
* kfd doesn't initialize because amdgpu_amdkfd_device_init() isn't called until the end of hw_init (all blocks must succeed).
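
To make the ordering above concrete, here is a rough model of the sequence. It is a simplified sketch, not the actual driver code: struct ip_block, struct init_result, simulate_hw_init() and the init_ok()/init_fail() helpers are all hypothetical names made up for illustration. The only behavior it is meant to show is that blocks are initialized in order, the first failure stops the loop, and the kfd step is reached only when every block succeeded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for an amdgpu IP block. */
struct ip_block {
    const char *name;
    int (*hw_init)(void);      /* 0 on success, negative on failure */
};

/* Hypothetical helpers used to simulate a block that works or fails. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return -1; }

/* Outcome of one simulated bring-up. */
struct init_result {
    size_t blocks_initialized; /* blocks whose hw_init succeeded */
    bool   kfd_initialized;    /* true only if every block succeeded */
};

/* Walk the blocks in order and stop at the first failure. */
static struct init_result simulate_hw_init(const struct ip_block *blocks,
                                           size_t n)
{
    struct init_result res = { 0, false };

    for (size_t i = 0; i < n; i++) {
        if (blocks[i].hw_init() != 0) {
            /* Later blocks are never tried, so they log nothing. */
            printf("%s: hw_init failed, remaining blocks skipped\n",
                   blocks[i].name);
            return res;
        }
        printf("%s initialized successfully\n", blocks[i].name);
        res.blocks_initialized++;
    }

    /* Only reached when all blocks succeeded (models the kfd init step). */
    res.kfd_initialized = true;
    puts("kfd initialized");
    return res;
}
```

Calling simulate_hw_init() with a list where a "UVD" entry uses init_fail() and a later "VCE" entry uses init_ok() gives the same pattern as the failing log: no success message for UVD, nothing at all for VCE, and no kfd message.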

Between 5.13.19 and 5.14-rc2 there aren't any UVD changes that would cause this.

$ git log --oneline v5.13.19..v5.14-rc2 amdgpu/amdgpu_uvd.c amdgpu/uvd_v6_0.c
09b020bb05a5 Merge tag 'drm-misc-next-2021-06-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
d3fae3b3daac dma-buf: drop the _rcu postfix on function names v3
5745d647d556 Merge tag 'amd-drm-next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
66c46621c812 amdgpu: remove unreachable code
ae4c0d7674a7 drm/amdgpu: make sure we unpin the UVD BO
f89f8c6bafd0 drm/amdgpu: Guard against write accesses after device removal
355b60296143 Merge drm/drm-next into drm-misc-next

This tells me the cause is probably something outside of the UVD code itself. Without seeing further steps of the bisect, the biggest suspect in my mind is 0064b0ce85bb.

Can you please try booting with amdgpu.aspm=0 on the kernel command line?