Request: Scale back CPU and GPU to avoid notebook overheating

Bug #990731 reported by Ben Linsey-Bloom
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Expired  Medium      Unassigned

Bug Description

I've found this happening increasingly often since upgrading to 12.04.

When under heavy load my laptop has all fans at full speed but still often reaches critical heat (on the hard drive I think) and shuts itself off.

I think this is mostly poor hardware design, but it would be awesome if Linux were able to detect dangerous temperatures and, if the fans are already on full, scale CPU and GPU operations down to reduce heat so that a fatal shut-off can be avoided.

I have an HP Pavilion with an Intel Core i7 and an ATI Radeon HD 5000.

I've installed indicator-cpufreq so I can manually control CPU frequency when at high load. This seems to effectively reduce the temperature but it would be good if the system could do this automatically.
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu7
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: ben 2326 F.... pulseaudio
 /dev/snd/controlC0: ben 2326 F.... pulseaudio
 /dev/snd/pcmC0D0p: ben 2326 F...m pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xd4100000 irq 48'
   Mixer name : 'IDT 92HD81B1X5'
   Components : 'HDA:111d7605,103c1448,00100402'
   Controls : 16
   Simple ctrls : 9
Card1.Amixer.info:
 Card hw:1 'Generic'/'HD-Audio Generic at 0xd4020000 irq 49'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100200'
   Controls : 6
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=d8ef4f63-5434-4d95-80cd-649702cfb3b3
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: Hewlett-Packard HP Pavilion dv6 Notebook PC
NonfreeKernelModules: fglrx
Package: linux (not installed)
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-24-generic root=UUID=e8743556-da7b-440d-bc19-956ba1e48ca5 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-24.37-generic 3.2.14
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic N/A
 linux-backports-modules-3.2.0-24-generic N/A
 linux-firmware 1.79
SourcePackage: linux
StagingDrivers: mei
Tags: precise staging precise staging
Uname: Linux 3.2.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 10/21/2010
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.23
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 1448
dmi.board.vendor: Hewlett-Packard
dmi.board.version: 65.35
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.23:bd10/21/2010:svnHewlett-Packard:pnHPPaviliondv6NotebookPC:pvr058A110000242B10010020100:rvnHewlett-Packard:rn1448:rvr65.35:cvnHewlett-Packard:ct10:cvrN/A:
dmi.product.name: HP Pavilion dv6 Notebook PC
dmi.product.version: 058A110000242B10010020100
dmi.sys.vendor: Hewlett-Packard

summary: - Series overheating issues: Scale back CPU and GPU to avoid overheating
+ Serious overheating issues: Scale back CPU and GPU to avoid overheating
notebooks?
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 990731

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
summary: - Serious overheating issues: Scale back CPU and GPU to avoid overheating
- notebooks?
+ Request: Scale back CPU and GPU to avoid notebook overheating
description: updated
tags: added: apport-collected precise staging
description: updated
Ben Linsey-Bloom (ben-kitserve) attached apport information: AcpiTables.txt, AlsaDevices.txt, AplayDevices.txt, BootDmesg.txt, CRDA.txt, Card0.Amixer.values.txt, Card0.Codecs.codec.0.txt, Card1.Codecs.codec.0.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, RfKill.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Daniel Letzeisen (dtl131) wrote :

Ben L-B, what changes do you make to cpufreq to reduce heat? A "before and after" output from cpufreq-info would be helpful. The ondemand governor should be used by default to control the CPU, scaling it down when idle and running it at full voltage/speed under load.

Oh, and side note: it is kind of taboo to confirm one's own bug ;P

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → Incomplete
Ben Linsey-Bloom (ben-kitserve) wrote :

indicator-cpufreq shows a range of frequency options between 1.60GHz and 0.93GHz. I found that changing indicator-cpufreq from 'Ondemand' (default) to 'Powersave' or 0.93GHz seems to help the most.

I think my system is using 'Ondemand' as it is supposed to, but I thought it would be good to add a feature (automatic, behind the scenes and integrated with the kernel?) where the CPU frequency is capped when sensors report heat near critical, then returned to normal when temperatures are sensible again.

Here is the output from cpufreq-info:
ben@mr-shinyface:~$ cpufreq-info
cpufrequtils 007: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to <email address hidden>, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1 2 3 4 5 6 7
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 933 MHz - 1.60 GHz
  available frequency steps: 1.60 GHz, 1.60 GHz, 1.47 GHz, 1.33 GHz, 1.20 GHz, 1.07 GHz, 933 MHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 933 MHz and 1.60 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 933 MHz.
  cpufreq stats: 1.60 GHz:4.32%, 1.60 GHz:0.06%, 1.47 GHz:0.08%, 1.33 GHz:0.10%, 1.20 GHz:0.16%, 1.07 GHz:0.30%, 933 MHz:94.97% (19910)
analyzing CPU 1:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1 2 3 4 5 6 7
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 10.0 us.
  hardware limits: 933 MHz - 1.60 GHz
  available frequency steps: 1.60 GHz, 1.60 GHz, 1.47 GHz, 1.33 GHz, 1.20 GHz, 1.07 GHz, 933 MHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 933 MHz and 1.60 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 933 MHz.
  cpufreq stats: 1.60 GHz:3.69%, 1.60 GHz:0.03%, 1.47 GHz:0.03%, 1.33 GHz:0.05%, 1.20 GHz:0.06%, 1.07 GHz:0.10%, 933 MHz:96.04% (13017)
analyzing CPU 2:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1 2 3 4 5 6 7
  CPUs which need to have their frequency coordinated by software: 2
  maximum transition latency: 10.0 us.
  hardware limits: 933 MHz - 1.60 GHz
  available frequency steps: 1.60 GHz, 1.60 GHz, 1.47 GHz, 1.33 GHz, 1.20 GHz, 1.07 GHz, 933 MHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 933 MHz and 1.60 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 933 MHz.
  cpufreq stats: 1.60 GHz:3.42%, 1.60 GHz:0.04%, 1.47 GHz:0.05%, 1.33 GHz:0.04%, 1.20 GHz:0.05%, 1.07 GHz:0.09%, 933 MHz:96.32% (10669)
analyzing CPU 3:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1 2 3 4 5 6 7
  CPUs which need to have their...
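
The "cpufreq stats" lines above can be summarised with a few lines of Python. This is only a sketch: parse_cpufreq_stats is a hypothetical helper name, and it assumes the line keeps the "label:percent%" layout shown in the paste, with the trailing number in parentheses (which I take to be the transition count) stripped off.

```python
import re

def parse_cpufreq_stats(line):
    """Return (frequency_label, percent) pairs from a 'cpufreq stats' line."""
    # Keep only the part after the "cpufreq stats:" prefix.
    body = line.split("cpufreq stats:", 1)[1]
    # Drop the trailing parenthesised number, e.g. "(19910)".
    body = re.sub(r"\(\d+\)\s*$", "", body)
    pairs = []
    for entry in body.split(","):
        # Split on the last colon: "933 MHz:94.97%" -> "933 MHz", "94.97%".
        label, _, pct = entry.strip().rpartition(":")
        pairs.append((label, float(pct.rstrip("%"))))
    return pairs

# The CPU 0 line from the output above.
line = ("cpufreq stats: 1.60 GHz:4.32%, 1.60 GHz:0.06%, 1.47 GHz:0.08%, "
        "1.33 GHz:0.10%, 1.20 GHz:0.16%, 1.07 GHz:0.30%, 933 MHz:94.97% (19910)")
stats = parse_cpufreq_stats(line)
lowest_label, lowest_share = stats[-1]
print(f"{lowest_label}: {lowest_share}% of the time")
```

For CPU 0 this confirms that the core spent nearly all of its time at the lowest step, which is what you would expect with the powersave governor selected.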

Ben Linsey-Bloom (ben-kitserve) wrote :

Also sorry, I'm not very familiar with how things are done around here :)

Daniel Letzeisen (dtl131) wrote :

Ah, I see what you're proposing. It's a cool idea, but I think it would be difficult because different systems/CPUs have different thermal limits. Also, it's sort of a band-aid solution. A system should really be able to run at 100% load indefinitely without overheating (overclocking notwithstanding). If it can't, then the cooling is simply insufficient.
I think the best thing in this situation is to figure out what is causing the overheating. I didn't know systems would automatically shut down because of HD temps, but I could very well be wrong. So if you suspect overheating of the hard disk, it would probably be wise to look at powertop and see what is causing the HD wake-ups/activity. htop is a useful tool as well.
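
As a starting point for finding out which component is actually running hot, the temperatures the kernel exposes can be listed with a short Python sketch. It assumes the machine publishes ACPI thermal zones under /sys/class/thermal with "type" and "temp" files (temp in millidegrees Celsius); the function names are made up for illustration, and the hard disk temperature may well not appear there at all.

```python
from pathlib import Path

def millidegrees_to_celsius(raw):
    """Convert a sysfs 'temp' value such as '54000\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0

def read_thermal_zones(base="/sys/class/thermal"):
    """Return (zone_name, zone_type, temp_celsius) for each readable zone."""
    readings = []
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            temp = millidegrees_to_celsius((zone / "temp").read_text())
        except (OSError, ValueError):
            # Skip zones that are missing files or report unparsable values.
            continue
        readings.append((zone.name, kind, temp))
    return readings

if __name__ == "__main__":
    for name, kind, temp in read_thermal_zones():
        print(f"{name} ({kind}): {temp:.1f} C")
```

Running this a few times under load, next to powertop or htop, should show whether the CPU zone is the one approaching its limit.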

Changed in linux (Ubuntu):
status: Incomplete → New
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a kernel version where you were not having this particular problem? This will help determine if the problem you are seeing is the result of the introduction of a regression, and when this regression was introduced.

Changed in linux (Ubuntu):
importance: Undecided → Medium
Joseph Salisbury (jsalisbury) wrote :

Also, would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel[1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc4-precise/

Joseph Salisbury (jsalisbury) wrote :

Can you also attach your /var/log/syslog file, so we can see what thermal events are in there?

Ben Linsey-Bloom (ben-kitserve) wrote :

I don't have time for changing my kernel at the moment (will it break anything like wireless that I need for work?). Hopefully I'll be able to do that in a few weeks though.

I've attached my /var/log/syslog file. I haven't had any "emergency shut-offs" for several days though because I haven't been doing such CPU-intensive tasks and I've been using indicator-cpufreq to keep the heat down since I discovered it.

penalvch (penalvch)
tags: added: kernel-therm needs-upstream-testing
madbiologist (me-again) wrote :

On the graphics side of things, your AMD/ATI Radeon Mobility HD 5650M "Madison" may be getting too hot. You can enable power management for it by using one of the methods described at http://xorg.freedesktop.org/wiki/RadeonFeature/#index3h2

The first two methods are only rudimentary but are available now. The 3rd (dpm) method is much better and will debut in the upcoming 3.11 Linux kernel. The first release candidate (3.11-rc1) of the 3.11 kernel should be available any day now at http://kernel.ubuntu.com/~kernel-ppa/mainline/ and instructions on how to install and uninstall it are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds
Or you could wait and see whether the currently under development Ubuntu 13.10 "Saucy Salamander" updates to the 3.11 kernel before its final release on 17th October 2013.

If you do update to the 3.11 kernel, to use the dpm method you will need to select it at boot by adding radeon.dpm=1 to your GRUB kernel boot options as described at https://help.ubuntu.com/community/Grub2/Troubleshooting#Editing_the_GRUB_2_Menu_During_Boot
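
After rebooting, it is worth checking that the option really reached the kernel. A minimal Python sketch, assuming the running command line can be read from /proc/cmdline; kernel_param is a hypothetical helper, and the sample string is the ProcKernelCmdLine from the bug description with radeon.dpm=1 appended.

```python
def kernel_param(cmdline, name):
    """Return the value of 'name=value' on a kernel command line.

    Returns '' for a bare flag such as 'quiet' and None if the
    parameter is absent. If it appears more than once, the last
    occurrence is returned.
    """
    result = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            result = value if sep else ""
    return result

# Command line from the report, plus the option discussed above.
sample = ("BOOT_IMAGE=/boot/vmlinuz-3.2.0-24-generic "
          "root=UUID=e8743556-da7b-440d-bc19-956ba1e48ca5 "
          "ro quiet splash vt.handoff=7 radeon.dpm=1")
print(kernel_param(sample, "radeon.dpm"))

# On a live system (Linux only):
#   with open("/proc/cmdline") as f:
#       print(kernel_param(f.read(), "radeon.dpm"))
```

If the live check prints None, the GRUB edit did not take effect for that boot.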

madbiologist (me-again) wrote :

See the blog post at http://www.botchco.com/agd5f/?p=57 for further information.

Unlike the older dynpm method, the new DPM method works with multiple monitors and there shouldn't be any flickering as the performance level changes are handled by dedicated hardware rather than the driver.

madbiologist (me-again) wrote :

I neglected to mention that using the new power management feature on R700 and newer hardware (other than APUs) requires installing the latest AMD graphics microcode (ucode) files to /lib/firmware/radeon.
These are available at http://people.freedesktop.org/~agd5f/radeon_ucode/
Get the version ending in "smc".

R700 basically means Radeon HD 4000 series and newer. However, note that according to Wikipedia and http://xorg.freedesktop.org/wiki/RadeonFeature/#index5h2 the Mobility Radeon HD 4225/4250 is an RV620 chip, so anyone with one of those shouldn't need the updated firmware files.

Daniel Letzeisen (dtl131) wrote :

@madbiologist: good info on the radeon open-source improvements, but the OP is (was?) using fglrx on Precise. I'm marking this Incomplete because it is dependent on the OP's specific BIOS, and it's difficult to tell whether the overheating has anything to do with the OS without more info, such as whether this happens on other systems (e.g. Windows).

@OP: As best I can tell, the Core i7 doesn't start thermal throttling until 100C, and at that temp OEMs may just initiate ACPI shutdown. As far as automating the switch to powersave based on temp, I'm pretty sure that could be done with a script of some sort (and one may very well exist).
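
A rough sketch of what such a script might look like in Python, not a tested tool. The assumptions are mine: thermal_zone0 is taken to track the CPU, the 85C/70C thresholds are made-up values chosen to stay below the shutdown point, and the governor is switched by writing to the cpufreq sysfs files, which needs root. The gap between the two thresholds keeps the governor from flapping when the temperature hovers around a single limit.

```python
import glob
import time

# Assumption: zone 0 reflects the CPU temperature on this machine.
THERMAL_ZONE = "/sys/class/thermal/thermal_zone0/temp"
GOVERNOR_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor"
HIGH_C = 85.0  # hypothetical: switch to powersave at or above this
LOW_C = 70.0   # hypothetical: switch back to ondemand at or below this

def choose_governor(temp_c, current, high=HIGH_C, low=LOW_C):
    """Pick a governor with hysteresis; between the thresholds keep the current one."""
    if temp_c >= high:
        return "powersave"
    if temp_c <= low:
        return "ondemand"
    return current

def read_temp_c(path=THERMAL_ZONE):
    """Read the zone temperature (sysfs reports millidegrees Celsius)."""
    with open(path) as f:
        return int(f.read().strip()) / 1000.0

def set_governor(name):
    """Write the governor name for every CPU that exposes cpufreq (needs root)."""
    for path in glob.glob(GOVERNOR_GLOB):
        with open(path, "w") as f:
            f.write(name)

def main(poll_seconds=5):
    """Poll the temperature and switch governors only when the choice changes."""
    current = "ondemand"
    while True:
        wanted = choose_governor(read_temp_c(), current)
        if wanted != current:
            set_governor(wanted)
            current = wanted
        time.sleep(poll_seconds)

# To try it, call main() as root; it loops until interrupted.
```

This only automates what indicator-cpufreq does by hand, so it shares the same limitation: it does nothing for GPU heat.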

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired