CPU not thermal throttling at max temp (AMD 7965WX on WRX90E-SAGE motherboard)

Bug #2063165 reported by Jess Ferments
This bug affects 1 person
Affects: thermald (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

I am running Ubuntu 23.10, and using an ASUS WRX90E-SAGE motherboard.

My CPU (AMD 7965WX) is not thermal throttling properly: it exceeds its maximum operating temperature of 95C without throttling (it got as high as 98C before I shut it off).

If I run CPU-intensive processes that max out all cores (such as stress-ng), the CPU exceeds the maximum temperature in less than a minute, and Ubuntu does nothing to throttle it. I'm worried that my CPU is going to get damaged.

When I look at /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors there are only "powersave" and "performance" available ("powersave" is what it is currently set to).

If I set the governor to "performance" in /etc/init.d/cpufrequtils and reboot, it has no effect; it just stays in "powersave" mode. (I don't know whether this would help with throttling anyway, but I don't know what else to try.)
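
For anyone trying to reproduce this, the cpufreq state described above can be collected in one go. This is a minimal Python sketch, not something taken from the original report: the `read_cpufreq` helper is a hypothetical name, and reading `scaling_driver` alongside the governor files is an added step (this report does not state which frequency driver is in use).

```python
from pathlib import Path

# Per-CPU cpufreq directory mentioned in the report (cpu0 only).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(cpufreq_dir=CPUFREQ_DIR):
    """Return the cpufreq driver, active governor, and available governors."""

    def read(name):
        path = Path(cpufreq_dir) / name
        # Not every driver exposes every file, so a missing file yields None.
        return path.read_text().strip() if path.exists() else None

    available = read("scaling_available_governors")
    return {
        "driver": read("scaling_driver"),
        "governor": read("scaling_governor"),
        "available": available.split() if available else [],
    }


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

Knowing the driver name may help explain why only two governors are listed, since the set of available governors depends on the active cpufreq driver.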

When I run `systemctl status --lines=50 thermald` I get the following output:

========

○ thermald.service - Thermal Daemon Service
     Loaded: loaded (/lib/systemd/system/thermald.service; enabled; preset: enabled)
     Active: inactive (dead) since Mon 2024-04-22 14:34:09 PDT; 26min ago
    Process: 1878 ExecStart=/usr/sbin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
   Main PID: 1878 (code=exited, status=0/SUCCESS)
        CPU: 11ms

Apr 22 14:34:09 ML-tower systemd[1]: Starting thermald.service - Thermal Daemon Service...
Apr 22 14:34:09 ML-tower thermald[1878]: Unsupported cpu model or platform
Apr 22 14:34:09 ML-tower systemd[1]: thermald.service: Deactivated successfully.
Apr 22 14:34:09 ML-tower systemd[1]: Started thermald.service - Thermal Daemon Service.

=======

Note that it says "Unsupported cpu model or platform". The AMD 7965WX is definitely supported by this motherboard, so I'm not sure what is going on.

It does seem to be able to read the CPU thermal sensor, because when I run `sensors` I get the following output:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +46.9°C
Tccd1: +40.4°C
Tccd2: +40.6°C
Tccd3: +40.1°C
Tccd4: +39.6°C

... but I don't know how accurate the readings are, and I am concerned, especially since the system doesn't appear to do anything when the max temperature is exceeded.
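
To make the comparison against the 95C limit explicit, the `sensors` output above can be parsed and checked. This is only a sketch: `parse_temps` and `hottest` are hypothetical helper names, the sample text is the output pasted above, and the regular expression assumes the `label: +NN.N°C` layout shown there.

```python
import re

MAX_TEMP_C = 95.0  # max operating temperature quoted in this report

# Sample input: the `sensors` output pasted in this report.
SENSORS_OUTPUT = """\
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +46.9°C
Tccd1: +40.4°C
Tccd2: +40.6°C
Tccd3: +40.1°C
Tccd4: +39.6°C
"""


def parse_temps(text):
    """Map each temperature label (e.g. 'Tctl') to its reading in Celsius."""
    temps = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s*\+?(-?\d+(?:\.\d+)?)\s*°C", line.strip())
        if match:
            temps[match.group(1)] = float(match.group(2))
    return temps


def hottest(temps):
    """Return the (label, value) pair with the highest reading."""
    label = max(temps, key=temps.get)
    return label, temps[label]


if __name__ == "__main__":
    label, value = hottest(parse_temps(SENSORS_OUTPUT))
    state = "OVER" if value > MAX_TEMP_C else "under"
    print(f"{label}: {value:.1f}C ({state} the {MAX_TEMP_C:.0f}C limit)")
```

In practice the text would come from running `sensors` during a stress-ng run rather than from the idle sample embedded here.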

How can I ensure that my CPU will not exceed a specified temperature threshold (ideally ~93C or less)?
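
One stop-gap that might work, sketched here under several assumptions, is a userspace loop that lowers the cpufreq frequency cap when the temperature reaches the threshold and raises it again once things cool down. Nothing below comes from thermald or from the original report: the function names, the step size, and the hysteresis value are made up for illustration, `temp1_input` of the k10temp hwmon device is assumed to be the Tctl reading in millidegrees, and it is assumed (not verified on this board) that the active driver honors writes to `scaling_max_freq`.

```python
import glob
import time
from pathlib import Path

LIMIT_C = 93.0       # target ceiling from the question above
HYSTERESIS_C = 5.0   # assumed: only raise the cap once 5C below the limit
STEP_KHZ = 200_000   # assumed adjustment step (200 MHz)


def next_cap_khz(temp_c, cap_khz, min_khz, max_khz):
    """Pick the next frequency cap from the current temperature."""
    if temp_c >= LIMIT_C:
        return max(min_khz, cap_khz - STEP_KHZ)   # too hot: step down
    if temp_c <= LIMIT_C - HYSTERESIS_C:
        return min(max_khz, cap_khz + STEP_KHZ)   # cool again: step up
    return cap_khz                                # in between: hold


def read_k10temp_c():
    """Read the k10temp temperature via hwmon (assumed: temp1_input is Tctl)."""
    for name_file in glob.glob("/sys/class/hwmon/hwmon*/name"):
        if Path(name_file).read_text().strip() == "k10temp":
            raw = Path(name_file).with_name("temp1_input").read_text()
            return int(raw) / 1000.0  # millidegrees to degrees Celsius
    raise RuntimeError("k10temp hwmon device not found")


def run_forever(interval_s=1.0):
    """Adjust scaling_max_freq on every cpufreq policy. Needs root."""
    policies = [Path(p) for p in glob.glob("/sys/devices/system/cpu/cpufreq/policy*")]
    if not policies:
        raise RuntimeError("no cpufreq policies found")
    min_khz = int((policies[0] / "cpuinfo_min_freq").read_text())
    max_khz = int((policies[0] / "cpuinfo_max_freq").read_text())
    cap = max_khz
    while True:
        cap = next_cap_khz(read_k10temp_c(), cap, min_khz, max_khz)
        for policy in policies:
            (policy / "scaling_max_freq").write_text(str(cap))
        time.sleep(interval_s)
```

Calling `run_forever()` as root would start the loop. A polling loop like this reacts slowly compared with firmware-level protection, so it would be a workaround for the missing throttling rather than a fix for it.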

Please let me know if there is any other information that I could provide to help debug this issue.

ProblemType: Bug
DistroRelease: Ubuntu 23.10
Package: thermald 2.5.4-2
ProcVersionSignature: Ubuntu 6.5.0-28.29-generic 6.5.13
Uname: Linux 6.5.0-28-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.27.0-0ubuntu5
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: KDE
Date: Mon Apr 22 17:36:51 2024
InstallationDate: Installed on 2024-04-08 (15 days ago)
InstallationMedia: Kubuntu 23.10 "Mantic Minotaur" - Release amd64 (20231010)
SourcePackage: thermald
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.init.thermald.conf: [deleted]
mtime.conffile..etc.thermald.thermal-cpu-cdev-order.xml: 2023-08-25T03:29:11
