Fix divide by zero errors in DML2

Bug #2106923 reported by Mario Limonciello
This bug affects 1 person
Affects                    Status         Importance  Assigned to     Milestone
HWE Next                   New            Undecided   Unassigned
linux (Ubuntu)             Invalid        High        Unassigned
  Noble                    Invalid        Undecided   Unassigned
  Plucky                   In Progress    Undecided   AceLan Kao
linux-oem-6.11 (Ubuntu)    Invalid        High        Edoardo Canepa
  Noble                    Fix Committed  Undecided   AceLan Kao
  Plucky                   Invalid        Undecided   Unassigned

Bug Description

[Impact]
Random crashes have been reported on GPUs that use DML2 (Ryzen AI APUs and Radeon 9000 dGPUs). The crashes were traced to corruption of the registers used by floating point code.

[Fix]
The corruption is caused by a lack of FP context protection around the DML2 floating point code. These 3 commits add that protection:

https://github.com/torvalds/linux/commit/afcdf51d97cd58dd7a2e0aa8acbaea5108fa6826
https://github.com/torvalds/linux/commit/366e77cd4923c3aa45341e15dcaf3377af9b042f
https://github.com/torvalds/linux/commit/4408b59eeacfea777aae397177f49748cadde5ce
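
For readers unfamiliar with the pattern, here is a minimal user-space sketch of what "FP context protection" means: every entry point that reaches floating point code brackets that code with a begin/end guard, so the FP registers are saved and restored around it. In the amdgpu display code the guards are the `DC_FP_START()` and `DC_FP_END()` macros. Everything in the sketch (`fp_begin`, `fp_end`, `fp_protected`, `bandwidth_ratio`, `validate_mode`) is a hypothetical stand-in for illustration, not the code from the commits above.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's FP context guards.
 * A counter models "the FP context is currently saved and owned by us". */
static int fp_depth = 0;

static void fp_begin(void)
{
    fp_depth++;
}

static void fp_end(void)
{
    assert(fp_depth > 0);   /* an end without a matching begin is a bug */
    fp_depth--;
}

static int fp_protected(void)
{
    return fp_depth > 0;
}

/* Floating point helper (hypothetical). It refuses to compute when it is
 * called outside a guarded region and returns -1.0 instead. */
static double bandwidth_ratio(double required, double available)
{
    if (!fp_protected())
        return -1.0;
    return required / available;
}

/* Entry point (hypothetical). The fix pattern: take the guard before any
 * floating point work and release it afterwards. */
static double validate_mode(double required, double available)
{
    double ratio;

    fp_begin();
    ratio = bandwidth_ratio(required, available);
    fp_end();

    return ratio;
}
```

Calling `validate_mode(1.0, 4.0)` goes through the guard and returns the computed ratio, while calling `bandwidth_ratio()` directly models the unprotected path that the bug describes. In the real kernel the unprotected path does not fail cleanly like this; it silently clobbers FP register state, which is why the crashes appeared random.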

[Test]
1. Install Ubuntu with the 6.11 kernel on an AMD platform
2. Run a reboot test of around 300 cycles
3. The system should not hang while booting up in any cycle

[Where problems could occur]
The 3 commits only add FP context protection and do not make any functional changes.

tags: added: originate-from-2106922
Edoardo Canepa (ecanepa)
Changed in linux-oem-6.11 (Ubuntu):
importance: Undecided → High
assignee: nobody → Edoardo Canepa (ecanepa)
Edoardo Canepa (ecanepa)
Changed in linux-oem-6.11 (Ubuntu):
status: New → Triaged
Edoardo Canepa (ecanepa)
Changed in linux-oem-6.11 (Ubuntu):
status: Triaged → Confirmed
Changed in linux (Ubuntu):
status: New → Confirmed
importance: Undecided → High
AceLan Kao (acelankao)
Changed in linux (Ubuntu Noble):
status: New → Invalid
Changed in linux (Ubuntu Plucky):
status: New → In Progress
assignee: nobody → AceLan Kao (acelankao)
Changed in linux-oem-6.11 (Ubuntu Noble):
status: New → In Progress
assignee: nobody → AceLan Kao (acelankao)
Changed in linux-oem-6.11 (Ubuntu Plucky):
status: New → Invalid
description: updated
AceLan Kao (acelankao)
tags: added: jira-stella-1107 oem-priority stella
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in linux-oem-6.11 (Ubuntu):
status: Confirmed → Invalid
AceLan Kao (acelankao)
Changed in linux-oem-6.11 (Ubuntu Noble):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-oem-6.11/6.11.0-1023.23 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-noble-linux-oem-6.11' to 'verification-done-noble-linux-oem-6.11'. If the problem still exists, change the tag 'verification-needed-noble-linux-oem-6.11' to 'verification-failed-noble-linux-oem-6.11'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-noble-linux-oem-6.11-v2 verification-needed-noble-linux-oem-6.11
AceLan Kao (acelankao)
tags: added: verification-done-noble-linux-oem-6.11
removed: verification-needed-noble-linux-oem-6.11