Comment 180 for bug 22336

scott (sahendrickson) wrote :

Hi all,

I'm not sure if this is related, but my computer shuts down every few days because it "overheats". I wasn't convinced that it was actually overheating, so I set up a cron job that appends the output of "cat /proc/acpi/thermal_zone/*/*" to a file every minute. The last time it shut down, the messages log file had the following entries ...

Jun 10 20:23:13 scott-server kernel: [46118.903677] ACPI: Critical trip point
Jun 10 20:23:13 scott-server kernel: [46118.956637] ACPI: Unable to turn cooling device [df85e798] 'on'
Jun 10 20:23:14 scott-server gconfd (scott-17284): Received signal 15, shutting down cleanly
Jun 10 20:23:14 scott-server gconfd (scott-17284): Exiting

... and the following information from my cron job output ...

cooling mode: active
<polling disabled>
state: ok
temperature: 38 C
critical (S5): 70 C
passive: 55 C: tc1=4 tc2=3 tsp=60 devices=0xdf855338
active[0]: 55 C: devices=0xdf85e798
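
To make the comparison between the reading and the trip point explicit, each snapshot can be reduced to a single line. This is only a rough sketch, assuming the /proc/acpi/thermal_zone text format shown above; thermal_margin is a made-up helper name, not an existing tool.

```shell
#!/bin/sh
# thermal_margin: hypothetical helper, not part of any package.
# Reads /proc/acpi/thermal_zone-style text on stdin and prints the gap
# between the current temperature and the critical trip point.
# If several zones are concatenated, the last zone's values win.
thermal_margin() {
    awk '
        /^temperature:/ { temp = $2 }          # "temperature: 38 C"
        /^critical/     { crit = $(NF - 1) }   # "critical (S5): 70 C"
        END {
            if (temp == "" || crit == "") {
                print "incomplete reading"
                exit 1
            }
            printf "temperature=%d critical=%d margin=%d\n", temp, crit, crit - temp
        }
    '
}
```

Run as "cat /proc/acpi/thermal_zone/*/* | thermal_margin", the snapshot above would give "temperature=38 critical=70 margin=32".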

It doesn't seem to me that the CPU is actually overheating: the logged temperature is 38 C, well below the 70 C critical trip point. I have modified the cron job to store more information each minute, so that next time I'll have more to work with:

date
cat /proc/acpi/thermal_zone/*/*
acpitool -e
smartctl -d ata /dev/sda -a
smartctl -d ata /dev/sdb -a
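
The commands above could be wrapped in a small script so that every snapshot is timestamped and a failing or missing tool leaves a trace in the log. This is a minimal sketch, not the exact job I run; the log path and the function names log_cmd and snapshot are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical log location; override by setting LOG in the environment.
LOG="${LOG:-/var/log/thermal-snapshots.log}"

# Run one command and append its output under a labelled header, so a
# missing tool or a failing command shows up in the log instead of
# vanishing silently.
log_cmd() {
    printf '## %s\n' "$*" >> "$LOG"
    "$@" >> "$LOG" 2>&1 || printf 'exit status: %s\n' "$?" >> "$LOG"
}

# Append one timestamped snapshot of everything listed above.
snapshot() {
    printf '=== %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$LOG"
    log_cmd sh -c 'cat /proc/acpi/thermal_zone/*/*'
    log_cmd acpitool -e
    log_cmd smartctl -d ata /dev/sda -a
    log_cmd smartctl -d ata /dev/sdb -a
    # Flush to disk so the last entries have a better chance of
    # surviving an abrupt power-off.
    sync
}
```

Saved as a script that ends with a call to snapshot, it could be run once a minute from a crontab entry of the form "* * * * * /path/to/script".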

Any ideas on what I could log to figure out what is happening?
I have cleaned the fans, and I know that they are turning on and off when the system gets hot.
Finally, I'm also pretty sure that the system didn't actually overheat, because it was idle when it shut down. Is it possible that a bug sometimes misreports the temperature or a trip point?

I'd appreciate any input or help.

Thanks,
-- Scott