runaway kacpi_notify

Bug #75174 reported by Akkana Peck
This bug affects 6 people
Affects        Status     Importance  Assigned to  Milestone
acpi           Invalid    High
acpi (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Binary package hint: acpi

On my Vaio SR17 laptop, when I run a CPU-intensive process (like compiling a large program), eventually kacpi_notify always begins to take 85% or more of the machine's CPU. I can't kill kacpi, even with -9, and I can't even shut down gracefully: the machine gets through most of the shutdown sequence and then sits there with the CPU still spinning, and I have to pull the power plug.

This machine worked fine under Breezy. Please, is there any way I can disable kacpid, or control its function? I don't think it's doing anything I need (CPU speeds and the CPU fan are controlled in BIOS on this machine). Google finds tons of people having similar problems, but nobody seems to know a fix. I don't want to disable acpi entirely since ubuntu's excellent acpi suspend/hibernate support (in breezy, anyway) is the biggest reason I prefer ubuntu on this laptop.

Revision history for this message
Akkana Peck (akkzilla) wrote :

Answering my own plea: it looks like killing acpid before starting anything CPU intensive prevents the runaway kacpi_notify process. At least, I made it through a complete gimp build, which is a lot farther than I ever got when acpid was running.

I'd welcome suggestions as to how to pin this down more specifically -- perhaps there's some configuration option I could change to avoid the runaway. I'm not sure where to start in /etc/acpi.

I did find some speculation (but no confirmation) that the problem has something to do with monitoring temperature. Maybe if the temp goes too high, that freaks kacpi out for some reason, ironically sending it into a CPU-spinning feedback loop that ensures the temperature stays high?

Revision history for this message
jan-teichmann (teichmann-jan) wrote :

I have the same problem on my Latitude notebook. The kacpi_notify process takes 99% of the CPU. The problem happens when the notebook has been running for a long time and the temperature stays high.

Revision history for this message
Micah Abbott (micah-abbott) wrote :

I have just observed this with a fresh Feisty install on a Sun Java Workstation (w1100z).

I started a remote scp transfer to my desktop system and the desktop appeared to lock up. I ended up killing my X session and discovering that kacpi_notify was consuming half of the CPU resources.

Revision history for this message
Michael Hirsch (mdhirsch) wrote :

For me it is kacpid that is the runaway, using 95% of a CPU, while kacpi_notify uses only 5%. It seems to happen even if I don't run anything CPU intensive: just logging in to KDE starts it. I guess I should try logging in to a console and see what happens.

Revision history for this message
Michael Hirsch (mdhirsch) wrote :

More data: It isn't logging in that causes it for me, but starting kpilot. If kpilot doesn't start, kacpid doesn't start using all my CPU. If kpilot starts, then kacpid uses all my CPU.

Revision history for this message
paddyponchero (paddy-oherlihy) wrote :

Turning off dynamic CPUFreq seems to have eliminated the lockups for me for the moment. I just set it to performance or powersave using powersave.

Revision history for this message
Sam Liddicott (sam-liddicott) wrote :

I've had it (35% acpid) when heavily loaded, but just now I got it by accidentally clicking the undock button on my docking station.

dmesg -c says:

[91067.276000] ACPI Exception (pci_bind-0299): AE_NOT_FOUND, Unable to get data from device DCKS [20060707]
[91067.276000] ACPI: undocking

Needless to say I'm still docked, my monitor, keyboard and mouse all still work

(I'm using a Dell precision M70)

Revision history for this message
Akkana Peck (akkzilla) wrote :

This problem has gotten worse with Feisty, because kacpid and kacpi_notify aren't killable. I tried getting rid of /etc/init.d/acpid, and that stopped kacpi_notify from running, but kacpid was still there and still runs out of control whenever I run a long CPU-intensive process like a long compile. Without a way to disable kacpid, Feisty can't be used on this machine.

Revision history for this message
Håkan W (hwaara-gmail-deactivatedaccount) wrote :

I have the same problem, but I'm using a stationary computer.

Compiling emacs seems to be what triggered it. Even after the heavy task is done, acpi is still making the whole computer sluggish and almost unusable.

Changed in acpi:
status: New → Confirmed
Revision history for this message
Akkana Peck (akkzilla) wrote :

See also kernel bug http://bugzilla.kernel.org/show_bug.cgi?id=8274 (how do I set a remote bug watch on that?)

Revision history for this message
Håkan W (hwaara-gmail-deactivatedaccount) wrote :

Does anyone know of a workaround, when the bug is already there? I.e., when acpi is hogging my CPU, how do I fix that without rebooting the computer?

Changed in acpi:
status: Unknown → Invalid
Revision history for this message
Björn Lindqvist (bjourne) wrote :

Akkana, I set the watch to #6944 because I thought that was the real bug. But it seems that #6944 and #8274 are duplicates, and #6944 was rejected for some reason, so I changed the watch to #8274. The only workaround I know of is to pass the kernel the argument acpi=off at boot, which disables power management. But that is a very poor workaround. :(

Revision history for this message
E.Lanoe (elouen-lanoe) wrote :

I have exactly the same behavior as the one described by Akkana Peck.
My system is a "Compaq Presario 2805 EA" laptop running Feisty. The bug was also present with Dapper or Edgy.

The only workaround I have is to manually set the CPU Frequency to its lowest value (1.2GHz instead of 1.4GHz) using the Gnome CPU Frequency Monitor applet (by default the profile is set to "OnDemand" and I set it to "Powersave").
With this workaround I'm able to run any CPU intensive task as long as needed.
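The same governor change can also be made without the applet. The following is only a minimal sketch: it assumes the kernel exposes the standard cpufreq sysfs files under /sys/devices/system/cpu, and the function name `set_governor` is hypothetical. It needs root on a real system.

```python
from pathlib import Path

# Assumed location of the cpufreq sysfs tree; adjust if your kernel differs.
CPU_ROOT = Path("/sys/devices/system/cpu")


def set_governor(governor: str, cpu_root: Path = CPU_ROOT) -> list[str]:
    """Write `governor` to every cpuN/cpufreq/scaling_governor under cpu_root.

    Returns the names of the CPUs that were updated (for example ["cpu0"]).
    """
    updated = []
    for gov_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name
        # Refuse governors the driver does not advertise, when the list exists.
        available = gov_file.with_name("scaling_available_governors")
        if available.exists() and governor not in available.read_text().split():
            raise ValueError(f"{governor!r} is not offered for {cpu_name}")
        gov_file.write_text(governor + "\n")
        updated.append(cpu_name)
    return updated
```

Run as root, `set_governor("powersave")` would pin every CPU to its lowest frequency, which is the setting described above; `set_governor("ondemand")` would restore dynamic scaling.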

Maybe the problem is related to the CPU frequency changing during intensive tasks to prevent the CPU from overheating? I'm also following bug #22336, which also seems to deal with a CPU frequency change problem.

Changed in acpi:
status: Unknown → Confirmed
Revision history for this message
Adna rim (adnarim) wrote :

Just one question: why is this "Undecided"? Are you not going to fix it? Under Edgy everything was all right, but with this bug Feisty is becoming totally unusable. Try to download a torrent and watch a movie: kacpi_notify goes to 98%, the CPU heats up, and the system does a safety shutdown :( Really, I can't use Ubuntu on my laptop anymore because of this stupid thing... Have you at least figured out in the meantime what goes wrong with it, so I can try to fix it myself?

greets

Revision history for this message
Björn Lindqvist (bjourne) wrote :

There are two workarounds:

1. Disable acpi. Add acpi=off to the kernel line in /boot/grub/menu.lst
2. Downgrade the kernel. I only suffer the bug in 2.6.20-16-generic, but NOT in 2.6.17-10-generic.
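For the first workaround, the edit to menu.lst can be sketched as a small text transformation. This is an illustration only: the function name `add_kernel_arg` is hypothetical, and it assumes GRUB-legacy syntax in which each boot entry has a line beginning with the word `kernel`. Back up the file before writing anything to it.

```python
def add_kernel_arg(menu_lst: str, arg: str = "acpi=off") -> str:
    """Append `arg` to every GRUB-legacy `kernel` line that lacks it.

    Commented lines (starting with '#') and all other lines are left alone.
    """
    out = []
    for line in menu_lst.splitlines():
        words = line.split()
        if words and words[0] == "kernel" and arg not in words[1:]:
            line = line.rstrip() + " " + arg
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical entry, for illustration only.
sample = (
    "title Ubuntu, kernel 2.6.20-16-generic\n"
    "kernel /boot/vmlinuz-2.6.20-16-generic root=/dev/sda1 ro quiet splash\n"
    "initrd /boot/initrd.img-2.6.20-16-generic\n"
)
print(add_kernel_arg(sample))
```

The function is idempotent, so running it twice does not add the argument a second time.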

Changed in acpi:
status: Confirmed → Incomplete
Revision history for this message
Joey Adams (joeyadams3-14159) wrote :

A really quick workaround is to issue:

sudo killall klogd

This will stop the hard drive grinding (if you experience that), but that still doesn't fix the infinite loop (most likely) in kacpi_notify which hogs a lot of CPU power. You can renice it to priority 19 to make it not as bad.
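The renice step could look roughly like the sketch below. It assumes a Linux /proc layout where each /proc/<pid>/stat file carries the command name in parentheses; the helper names `pids_named` and `renice` are hypothetical. Changing the priority of a kernel thread needs root.

```python
import os
from pathlib import Path


def pids_named(name: str, proc_root: Path = Path("/proc")) -> list[int]:
    """Return the PIDs whose /proc/<pid>/stat reports the command `name`."""
    pids = []
    for stat in proc_root.glob("[0-9]*/stat"):
        try:
            text = stat.read_text()
        except OSError:
            continue  # the process exited while we were scanning
        # Format is "PID (comm) state ..."; take what is between the parentheses.
        comm = text[text.find("(") + 1 : text.rfind(")")]
        if comm == name:
            pids.append(int(stat.parent.name))
    return sorted(pids)


def renice(name: str, niceness: int = 19) -> list[int]:
    """Set the nice value of every process called `name`; return their PIDs."""
    pids = pids_named(name)
    for pid in pids:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return pids
```

As root, `renice("kacpi_notify")` would drop the thread to the lowest priority. As noted above, this only makes the symptom less painful; the loop itself keeps running.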

If you don't mind recompiling your kernel, you can go to drivers/acpi/thermal.c, find this:

static void acpi_thermal_check(void *data)
{

and add right after it:

printk(KERN_WARNING PREFIX "Bypassing acpi_thermal_check\n");
return;
...

This will disable thermal checking, meaning the fan won't come on when the processor gets busy (don't blame me if your CPU overheats, which it probably won't). I did this myself, and it got rid of the kacpi_notify thrashing completely.

If you're a kernel developer, see if you can find the infinite loop in acpi_thermal_check :)

Changed in acpi:
status: Incomplete → Confirmed
Changed in acpi:
status: Confirmed → In Progress
Changed in acpi:
status: In Progress → Confirmed
Changed in acpi:
status: Confirmed → Incomplete
Changed in acpi:
status: Incomplete → Invalid
Revision history for this message
Peter Cordes (peter-cordes) wrote :

This bug happened for me on an Acer desktop machine: Veriton 7200 (mobo S81M, even after upgrading to latest bios revision: R01-F3).

The machine has Debian on its hard drive, and Debian's 2.6.18-6-686 doesn't have the problem.
Debian's 2.6.24-etchnhalf.1-686 _does_ have the problem. It's very repeatable by running burnP6 or burnK7. yacpi hangs until burnK7 is stopped and the temp comes down, while kacpi_notify uses lots of CPU...
Debian's 2.6.25 backport (linux-image-2.6.25-2-68 2.6.25-6~bpo40+1) doesn't have the problem, so that's what I'm going to run here.

Intrepid's i386 Desktop alpha 4 (kernel 2.6.26-5.15-generic) doesn't have the problem.
I don't think I've tried Hardy. (The machine's CDROM drive door is stuck closed, and it doesn't boot from its USB1.1 ports. I booted Intrepid by copying vmlinuz and initrd.gz to the HD, and loading them with GRUB, with iso-scan/filename=... Whoever thought of and implemented iso-scan/filename=, nice job!)

So this is a good sign that this bug is in fact going away in newer kernels. If I had found the problem with Intrepid's kernel, I would have followed up on bugzilla.kernel.org.

Revision history for this message
Jos Dehaes (jos-dehaes) wrote :

I have this bug every time I resume from suspend on a dell latitude D820 (core duo). Hardy on same laptop did not have this problem. Intrepid current 2.6.27-3 kernel.

Revision history for this message
Christian Assing (chassing) wrote :

I have this bug too. After resume, kacpi_notify consumes all the CPU :(
In Hardy suspend/resume works fine, but after a dist-upgrade to Intrepid I have this bug. I use a 32-bit Dell D620 Core Duo.

Revision history for this message
Peter Schüller (schueller-p) wrote :

I use Hardy and have the problem with both kernels, linux-image-2.6.24-19-386 and linux-image-2.6.24-19-generic.

The problem happens sporadically but mostly when some process is using the harddisk or the CPU "more than just a bit". The problem also disappears sporadically - to come back later again.

I'm looking forward to testing this with Intrepid.

Revision history for this message
Gabriel Thörnblad (gabriel-thornblad) wrote :

Jos and Christian, I think this is a new bug. The symptoms are the same, with kacpid or kacpi_notify consuming all CPU, but it appears after resuming from suspend (to RAM or disk) and not while running the system. It is also interesting that all three of us have Dell laptops. I filed a new report as bug #280088. If anyone thinks it is a duplicate, please mark it as such.

Revision history for this message
Jos Dehaes (jos-dehaes) wrote :

This appears to be fixed in latest intrepid.

Revision history for this message
yuppie4ever (yuppie4ever) wrote :

Not really fixed in Intrepid. I'm still seeing it with kernel 2.6.27-9-generic running on a Dell D630.

Revision history for this message
Brett Alton (brett-alton-deactivatedaccount) wrote :

This still affects my HP Pavilion 503n and a200n using Ubuntu 8.04.4.

'kacpid' and 'kacpi_notify' need to be killed using 'sudo killall' while grub needs to load the kernel with the 'acpi=off' and 'apm=off' flags.

Revision history for this message
Erik de Castro Lopo (erikd) wrote : Re: [Bug 75174] Re: runaway kacpi_notify

Brett Alton wrote:

> This still affects my HP Pavilion 503n and a200n using Ubuntu 8.04.4.
>
> 'kacpid' and 'kacpi_notify' need to be killed using 'sudo killall' while
> grub needs to load the kernel with the 'acpi=off' and 'apm=off' flags.

In my case, this turned out to be a motherboard bios bug:

    http://bugzilla.kernel.org/show_bug.cgi?id=12620

After harassing the motherboard manufacturer we got a bios update
that fixed this issue.

HTH,
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Revision history for this message
Michael Hirsch (mdhirsch) wrote :

Still a problem with Kubuntu 9.10 on my Dual Opteron 64-bit system.

Revision history for this message
rewind (ttanev) wrote :

Still a problem with Ubuntu 10.10 with fglrx 10.11 too. It was the same in 9.10 with fglrx >= 10.1 and in 10.04 with all versions. kacpi_notify becomes unresponsive after going from AC to battery, and after no more than a minute the computer freezes.

Revision history for this message
pacanukeha (c-launchpad-pacanukeha-net) wrote :

I just got this on an up-to-date 10.10, on an Alienware R17x (intel quad-core) when the AC power went out for 2 seconds.

Changed in acpi:
importance: Unknown → High