Fix power state transition on navi4x

Bug #2122659 reported by Anson Tsao
This bug affects 1 person
Affects                  Status        Importance  Assigned to  Milestone
linux-firmware (Ubuntu)  Status tracked in Resolute
  Noble                  Fix Released  Undecided   Leo Lin
  Plucky                 Fix Released  Undecided   Leo Lin
  Questing               Fix Released  Undecided   Leo Lin
  Resolute               Fix Released  Undecided   Leo Lin

Bug Description

[Impact]

Navi44/48 support was enabled earlier (Bug #2117517 / #2092225).
However, a power state transition issue has been observed when unloading and reloading amdgpu.

Updated GPU firmware resolves the problematic transitions. The failure shows up in the kernel log as follows:

[ 343.133908] amdgpu 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 343.133975] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x7551 0x1002:0x7551 0xC0).
[ 343.133983] [drm] register mmio base: 0xDD600000
[ 343.133983] [drm] register mmio size: 524288
[ 343.134027] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.135995] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.138991] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.141991] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.144987] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.145993] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.147993] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.150993] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.153992] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 343.156995] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 348.134985] xgpu_nv_mailbox_trans_msg: 2480 callbacks suppressed
[ 348.134990] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 348.136987] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
[ 348.138986] amdgpu 0000:03:00.0: amdgpu: trn=2 ACK should not assert! wait again !
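
The two failure signatures in the log above can be counted mechanically. The following is a minimal Python sketch, not part of the original report: the names `SIGNATURES` and `count_signatures` are illustrative, and the patterns are taken only from the lines quoted above, so other kernel versions may word these messages differently.

```python
import re

# Failure signatures copied from the log excerpt above (assumed stable wording).
SIGNATURES = {
    "d3cold_to_d0": re.compile(r"Unable to change power state from D3cold to D0"),
    "mailbox_ack": re.compile(r"trn=\d+ ACK should not assert"),
}


def count_signatures(log_text):
    """Return how many lines of log_text match each failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing the text of `dmesg` to `count_signatures` after a reload gives a quick pass/fail signal: both counts should stay at zero on fixed firmware.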

[Fix]
Pull the latest versions of the following files from upstream linux-firmware:

DMCUB:
amdgpu/dcn_4_0_1_dmcub.bin

GC:
amdgpu/gc_12_0_1_imu.bin
amdgpu/gc_12_0_1_me.bin
amdgpu/gc_12_0_1_mec.bin
amdgpu/gc_12_0_1_pfp.bin
amdgpu/gc_12_0_1_rlc.bin
amdgpu/gc_12_0_1_uni_mes.bin

PSP:
amdgpu/psp_14_0_3_sos.bin
amdgpu/psp_14_0_3_ta.bin

SDMA:
amdgpu/sdma_7_0_1.bin

SMU:
amdgpu/smu_14_0_3.bin

[Test Plan]
Repeat the following load/unload flow 20 times:
sudo modprobe amdgpu
sudo modprobe -r amdgpu
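
The repeated flow can be scripted. The following is a minimal Python sketch, not part of the original test plan: the function name `reload_amdgpu` and the injectable `run` parameter are illustrative. It assumes the script runs as root, that amdgpu is loaded at the start (so each round unloads first and then reloads), and that nothing on the machine prevents the module from being unloaded.

```python
import subprocess


def reload_amdgpu(rounds=20, run=subprocess.run):
    """Unload and reload amdgpu `rounds` times; return the commands issued.

    `run` defaults to subprocess.run and can be swapped for a stub in tests.
    """
    issued = []
    for _ in range(rounds):
        for cmd in (["modprobe", "-r", "amdgpu"], ["modprobe", "amdgpu"]):
            # check=True makes subprocess.run raise CalledProcessError
            # as soon as one modprobe call exits with a non-zero status.
            run(cmd, check=True)
            issued.append(cmd)
    return issued
```

Calling `reload_amdgpu()` performs the 20 rounds from the test plan; `reload_amdgpu(rounds=100)` matches the longer runs reported in the verification comments. The kernel log still has to be checked separately for the errors quoted in [Impact].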

[Where problems could occur]
These new AMDGPU firmware files are used only by Navi4x, so other NaviX families should not be affected.

[Other info]
66a604e1 amdgpu: update GC 12.0.1 firmware
82687ff0 amdgpu: update gc 12.0.1 firmware
118bd6c7 amdgpu: update gc 12.0.1 firmware
0a5ac406 amdgpu: update SDMA 7.0.1 firmware
06f096fc amdgpu: update PSP 14.0.3 firmware
72a8d254 amdgpu: update smu 14.0.3 firmware
a4a82784 amdgpu: DMCUB updates for various ASICs

tags: added: kernel-daily-bug
Leo Lin (0xff07)
Changed in linux-firmware (Ubuntu):
assignee: nobody → Leo Lin (0xff07)
Leo Lin (0xff07) wrote :

Hmm, it seems that I'm no longer able to target a series? Anyway, Questing has them all.

Leo Lin (0xff07)
description: updated
Leo Lin (0xff07) wrote :
Po-Hsu Lin (cypressyew) wrote :

Hey Leo,
let me know if you need other series added.

Leo Lin (0xff07) wrote :

Thanks for your help Sam! This is good for now.

Leo Lin (0xff07)
Changed in linux-firmware (Ubuntu Plucky):
assignee: nobody → Leo Lin (0xff07)
Changed in linux-firmware (Ubuntu Noble):
assignee: nobody → Leo Lin (0xff07)
Juerg Haefliger (juergh)
Changed in linux-firmware (Ubuntu Questing):
status: New → Fix Released
Changed in linux-firmware (Ubuntu Resolute):
status: New → Fix Released
Juerg Haefliger (juergh) wrote :

You're mentioning that this is a follow-on fix for bug 2117517 which is Radeon 9060 XT. Don't we need updated 'gc_12_0_0_*', 'sdma_7_0_1.bin' and 'psp_14_0_2_*' for that GPU?

Anson Tsao (ansontsao) wrote :

Radeon 9060/9070 are part of the Navi4x family, and the existing firmware is sufficient for the 9060.

Juerg Haefliger (juergh) wrote :

Per https://dri.freedesktop.org/docs/drm/gpu/amdgpu/amd-hardware-list-info.html
9060 also wants gc_12_0_0_*, sdma_7_0_1.bin and psp_14_0_2_* which are *not* updated by the PRs.

Juerg Haefliger (juergh) wrote :

Oh you're saying we don't need to update them for this bug? Sigh. I wish there were useful changelogs for firmware blobs :-(

Juerg Haefliger (juergh)
Changed in linux-firmware (Ubuntu Plucky):
status: New → Fix Committed
Changed in linux-firmware (Ubuntu Noble):
status: New → Fix Committed
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Anson, or anyone else affected,

Accepted linux-firmware into plucky-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20250317.git1d4c88ee-0ubuntu1.10 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-plucky to verification-done-plucky. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-plucky. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Timo Aaltonen (tjaalton) wrote :

Hello Anson, or anyone else affected,

Accepted linux-firmware into noble-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20240318.git3b128b60-0ubuntu2.20 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-noble to verification-done-noble. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-noble. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Anson Tsao (ansontsao) wrote :

Our internal team ran 100 rounds of the GPU reload test with 6.14-generic and linux-firmware-20240318.git3b128b60-0ubuntu2.20 on a Navi48XTW; no further issues were found.

tags: added: verification-done-noble
Leo Lin (0xff07) wrote :

Our internal team also ran 100 rounds of reloading for Plucky (0ubuntu1.10 with kernel 6.14.0-35-generic) on a Navi48XTW and confirmed that the issue is no longer reproducible.

(Special thanks to @ansontsao for tirelessly explaining the SRU policy to internal teams)

tags: added: verification-done-plucky
Timo Aaltonen (tjaalton) wrote :

Hello Anson, or anyone else affected,

Accepted linux-firmware into plucky-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20250317.git1d4c88ee-0ubuntu1.11 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-plucky to verification-done-plucky. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-plucky. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Timo Aaltonen (tjaalton) wrote :

Hello Anson, or anyone else affected,

Accepted linux-firmware into noble-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20240318.git3b128b60-0ubuntu2.21 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-noble to verification-done-noble. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-noble. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Leo Lin (0xff07) wrote :

Between -0ubuntu2.20 and -0ubuntu2.21 [1], and between -0ubuntu1.10 and -0ubuntu1.11 [2], the only differences are the WCN7851 firmware files, which are not used by the amdgpu driver.

I also confirmed that the checksums of the firmware files in this SRU are the same in -0ubuntu2.21 and -0ubuntu1.11, so the verification result should hold.

Hashes for Plucky:

237c036706f2a93ac5a630433674d553 amdgpu/dcn_4_0_1_dmcub.bin
6bbe2051c6d879dfbdc68a3d306abcaf amdgpu/gc_12_0_1_imu.bin
fd961b3c8f3847680b686ed46b9f3e33 amdgpu/gc_12_0_1_me.bin
4eb0cfc35595911179660d441e8e60e7 amdgpu/gc_12_0_1_mec.bin
144bf5070e57c5d75219e72fe56473a1 amdgpu/gc_12_0_1_pfp.bin
e72144839f90b00ffd576399912fc366 amdgpu/gc_12_0_1_uni_mes.bin
e27fae99d31671327069b142a65d2515 amdgpu/psp_14_0_3_sos.bin
d4c96168891d7844369a442ab2857345 amdgpu/psp_14_0_3_ta.bin
12a779227def6a9fc484517c62e20899 amdgpu/sdma_7_0_1.bin
7cf5dc1d929bc54a562eac883290e977 amdgpu/smu_14_0_3.bin

Hashes for Noble:

237c036706f2a93ac5a630433674d553 amdgpu/dcn_4_0_1_dmcub.bin
6bbe2051c6d879dfbdc68a3d306abcaf amdgpu/gc_12_0_1_imu.bin
fd961b3c8f3847680b686ed46b9f3e33 amdgpu/gc_12_0_1_me.bin
4eb0cfc35595911179660d441e8e60e7 amdgpu/gc_12_0_1_mec.bin
144bf5070e57c5d75219e72fe56473a1 amdgpu/gc_12_0_1_pfp.bin
30b29c1a630aea984d4733510d9a9be8 amdgpu/gc_12_0_1_rlc.bin
e72144839f90b00ffd576399912fc366 amdgpu/gc_12_0_1_uni_mes.bin
e27fae99d31671327069b142a65d2515 amdgpu/psp_14_0_3_sos.bin
d4c96168891d7844369a442ab2857345 amdgpu/psp_14_0_3_ta.bin
12a779227def6a9fc484517c62e20899 amdgpu/sdma_7_0_1.bin
7cf5dc1d929bc54a562eac883290e977 amdgpu/smu_14_0_3.bin

[1] https://kernel.ubuntu.com/forgejo/kernel/linux-firmware/commit/b26f0d211f10790eb70d3e8c2095333154131173
[2] https://kernel.ubuntu.com/forgejo/kernel/linux-firmware/commit/7edd3af00001e8fbfdf355b16cd55357c34fa1b6
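
A comparison of two such hash listings can be automated. The following is a minimal Python sketch, not part of the original comment: `parse_md5_listing` and `compare_listings` are illustrative names, and the input is assumed to be `md5sum`-style text with one `<digest> <path>` pair per line, as in the two lists above.

```python
def parse_md5_listing(text):
    """Map file path -> digest from md5sum-style lines ("<digest> <path>")."""
    digests = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            digest, path = parts
            digests[path] = digest
    return digests


def compare_listings(a, b):
    """Return (paths whose digests differ, paths present in only one listing)."""
    differing = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    only_one = sorted(a.keys() ^ b.keys())
    return differing, only_one
```

Run against the two lists as posted, a check like this would report no differing digests and would flag `amdgpu/gc_12_0_1_rlc.bin` as present in only one listing, because the first list has no entry for that file.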

Juerg Haefliger (juergh) wrote :

Yes, the new upload only reverts ath12k firmware so the verification for this bug is still valid.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 20240318.git3b128b60-0ubuntu2.21

---------------
linux-firmware (20240318.git3b128b60-0ubuntu2.21) noble; urgency=medium

  * [SRU] eDP showing garbage on GPT1 platforms (LP: #2129011)
    - amdgpu: update PSP 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
    - amdgpu: update psp 14.0.0 firmware
    - amdgpu: update psp 14.0.0 firmware
    - amdgpu: update psp 14.0.0 firmware
    - amdgpu: update psp 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
  * [SRU] Update GC 11.5.1 firmware to enable MES WA on Strix Halo (LP: #2129150)
    - amdgpu: update gc 11.5.1 firmware
    - amdgpu: update GC 11.5.1 firmware
    - amdgpu: update GC 11.5.1 firmware
  * [SRU] Fix eDP showing garbage on GPT2 platforms (LP: #2129172)
    - amdgpu: update PSP 14.0.4 firmware
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update PSP 14.0.4 firmware
    - amdgpu: update PSP 14.0.4 firmware
  * Fix power state transition on navi4x (LP: #2122659)
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update GC 12.0.1 firmware
    - amdgpu: update sdma 7.0.1 firmware
    - amdgpu: update sdma 7.0.1 firmware
    - amdgpu: update SDMA 7.0.1 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update gc 14.0.3 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update PSP 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: DMCUB updates for various AMDGPU ASICs
    - amdgpu: DMCUB updates for various AMDGPU ASICs
    - amdgpu: DMCUB updates forvarious AMDGPU ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB update for DCN401
    - amdgpu: DCUB update for DCN401 and DCN315
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: update dcn 4.01 firmware to 0.1.3.0
    - amdgpu: update dcn 4.01 frmware to 0.1.6.0
    - amdgpu: update dcn 4.01 firmware to 0.1.8.0
    - amdgpu: updates for dcn 3.20 and dcn 4.01 firmware to 0.1.10.0
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: Update DMCUB fw for DCN401 & DCN315
    - amdgpu: update dmcub fw for dcn32 and dcn401
    - amdgpu: update dmcub fw for dcn401
    - amdgpu: DMCUB updates for DCN401
    - amdgpu: update dmcub fw for various DCN version
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
  * Missing vendor/product/sku specific ISH firmware for Dell laptops (LP: #2094768)
    - intel: ish: ...

Changed in linux-firmware (Ubuntu Noble):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton) wrote : Update Released

The verification of the Stable Release Update for linux-firmware has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 20250317.git1d4c88ee-0ubuntu1.11

---------------
linux-firmware (20250317.git1d4c88ee-0ubuntu1.11) plucky; urgency=medium

  * [SRU] eDP showing garbage on GPT1 platforms (LP: #2129011)
    - amdgpu: update psp 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
    - amdgpu: update PSP 14.0.0 firmware
  * [SRU] Update GC 11.5.1 firmware to enable MES WA on Strix Halo (LP: #2129150)
    - amdgpu: update gc 11.5.1 firmware
    - amdgpu: update GC 11.5.1 firmware
    - amdgpu: update GC 11.5.1 firmware
  * [SRU] Fix eDP showing garbage on GPT2 platforms (LP: #2129172)
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update psp 14.0.4 firmware
    - amdgpu: update PSP 14.0.4 firmware
    - amdgpu: update PSP 14.0.4 firmware
  * Fix power state transition on navi4x (LP: #2122659)
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update gc 12.0.1 firmware
    - amdgpu: update GC 12.0.1 firmware
    - amdgpu: update sdma 7.0.1 firmware
    - amdgpu: update SDMA 7.0.1 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update psp 14.0.3 firmware
    - amdgpu: update PSP 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: update smu 14.0.3 firmware
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: update dcn 4.01 firmware to 0.1.3.0
    - amdgpu: update dcn 4.01 frmware to 0.1.6.0
    - amdgpu: update dcn 4.01 firmware to 0.1.8.0
    - amdgpu: updates for dcn 3.20 and dcn 4.01 firmware to 0.1.10.0
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: Update DMCUB fw for DCN401 & DCN315
    - amdgpu: update dmcub fw for dcn32 and dcn401
    - amdgpu: update dmcub fw for dcn401
    - amdgpu: DMCUB updates for DCN401
    - amdgpu: update dmcub fw for various DCN version
    - amdgpu: DMCUB updates for various ASICs
    - amdgpu: DMCUB updates for various ASICs
  * Missing vendor/product/sku specific ISH firmware for Dell laptops (LP: #2094768)
    - intel: ish: Update license file for ISH
    - linux-firmware: Add Dell ISH firmware for Intel Lunar Lake systems
  * [SRU] Fix amdgpu loading errors on Navi3x systems (LP: #2125139)
    - amdgpu: update vcn 4.0.0 firmware
    - amdgpu: update vcn 4.0.0 firmware
    - amdgpu: update VCN 4.0.0 firmware
    - amdgpu: update VCN 4.0.0 firmware
    - amdgpu: update smu 13.0.0 firmware
    - amdgpu: update SMU 13.0.0 firmware
    - amdgpu: update SDMA 6.0.0 firmware
    - amdgpu: update SDMA 6.0.0 firmware
    - amdgpu: update psp 13.0.0 firmware
    - amdgpu: update psp 13.0.0 firmware
    - amdgpu: update PSP 13.0.0 firmware
    - amdgpu: update gc 11.0.0 firmware
    - amdgpu: update gc 11.0.0 firmware
    - amdgpu: update GC 11.0.0 firmware
    - amdgpu: update GC 11.0.0 firmware
    - amdgpu: updates for dcn 3.20 and dcn 4.01 firmware to 0.1.10.0
    - amdgpu: update dmcub fw for dcn32 and dcn401
    - amdgpu: add gc 11.0.0 kicker firmware
    - amdgpu: add psp 13.0.0 kicker firmware
    - amdgpu: update PSP 13.0.0 kicker firmware
    - amdgpu: update PSP 13.0.0 kicker firmware
    - amdgpu: add smu 13.0.0...

Changed in linux-firmware (Ubuntu Plucky):
status: Fix Committed → Fix Released