[HP ProBook 4530s / Lenovo G580] Intel Powerclamp is Slowing CPU

I upgraded from Kubuntu 14.04 to 14.10. Immediately, while playing Minecraft with my daughter, I noticed the game was running very slowly. Even after closing the game, my web browsing slowed to a crawl.

Looking at the processes in top, I noticed four tasks called kidle_inject running at 50% CPU each. After a long investigation, I found that they are created by the Intel Powerclamp driver. I never did anything to enable it, and I had not observed this slowdown before the upgrade.

I found the powerclamp driver in the /sys/class/thermal/cooling_device5 directory:

root@bkat-HP-ProBook-4530s:/sys/class/thermal/cooling_device5# cat type
root@bkat-HP-ProBook-4530s:/sys/class/thermal/cooling_device5# cat cur_state
root@bkat-HP-ProBook-4530s:/sys/class/thermal/cooling_device5# cat max_state

I tried to echo 0 to cur_state, but it still reported -1 when I read the value back.
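
For anyone trying to find the powerclamp cooling device on their own machine (the index is not necessarily 5), here is a minimal sketch. It assumes the usual sysfs thermal layout, where each cooling_deviceN directory exposes type, cur_state and max_state. The function name list_cooling_devices and its optional base-directory argument are made up for illustration and are not part of any kernel interface.

```shell
#!/bin/sh
# List every thermal cooling device with its type and current/max state.
# The optional argument (default /sys/class/thermal) exists only so the
# function can be pointed at a test directory.
list_cooling_devices() {
    base="${1:-/sys/class/thermal}"
    for dev in "$base"/cooling_device*; do
        # Skip entries without a readable type file (also covers the
        # case where the glob matched nothing).
        [ -r "$dev/type" ] || continue
        printf '%s: type=%s cur_state=%s max_state=%s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/type")" \
            "$(cat "$dev/cur_state" 2>/dev/null)" \
            "$(cat "$dev/max_state" 2>/dev/null)"
    done
}
```

Calling list_cooling_devices with no argument reads the live system. The line whose type is intel_powerclamp is the device discussed in this report.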

There seems to be no obvious way to disable this, and this should not have been enabled in the first place.

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: linux-image-3.16.0-24-generic 3.16.0-24.32
ProcVersionSignature: Ubuntu 3.16.0-24.32-generic 3.16.4
Uname: Linux 3.16.0-24-generic x86_64
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
 /dev/snd/controlC0: bkat 2780 F.... pulseaudio
CurrentDesktop: KDE
Date: Mon Nov 3 23:16:04 2014
HibernationDevice: RESUME=UUID=35e1d874-a3bd-4227-acab-16f3e929d993
InstallationDate: Installed on 2012-04-22 (925 days ago)
InstallationMedia: Kubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
MachineType: Hewlett-Packard HP ProBook 4530s
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.16.0-24-generic root=UUID=743f1686-bfe7-4e32-911d-5371633ccec8 ro quiet splash vt.handoff=7
 linux-restricted-modules-3.16.0-24-generic N/A
 linux-backports-modules-3.16.0-24-generic N/A
 linux-firmware 1.138
SourcePackage: linux
UpgradeStatus: Upgraded to utopic on 2014-11-03 (0 days ago)
dmi.bios.date: 05/13/2011
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68SRR Ver. F.09
dmi.board.name: 167C
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 22.1A
dmi.chassis.asset.tag: CNU1284V1X
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr68SRRVer.F.09:bd05/13/2011:svnHewlett-Packard:pnHPProBook4530s:pvrA0001D02:rvnHewlett-Packard:rn167C:rvrKBCVersion22.1A:cvnHewlett-Packard:ct10:cvr:
dmi.product.name: HP ProBook 4530s
dmi.product.version: A0001D02
dmi.sys.vendor: Hewlett-Packard

William Katcher (katcherw) wrote:

This change was made by a bot.

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-rc3-vivid/

Joseph Salisbury (jsalisbury) wrote:

Can you also review your syslog file and see if there are any warnings in there about thermal temperature?

tags: added: kernel-bug-exists-upstream
William Katcher (katcherw) wrote:

There are no warnings whatsoever in my syslog file about thermal temperature.

I do see this message that seems to correspond with the slowdown:

    intel_powerclamp: Start idle injection to reduce power

I have tried the 3.18.0-031800rc3-generic kernel. After running for a few minutes, I did not see a problem. However, my PC soon froze. I rebooted and it froze again. There may not have been enough time for the idle injection to kick in, so I can't say if that kernel fixed the problem or not.

I then installed the 3.16.0-031600-generic kernel. I did not experience a noticeable slowdown, and I didn't observe any kidle_inject threads using top. However, looking at my dmesg, I see the following messages:

[ 353.440226] intel_powerclamp: Start idle injection to reduce power
[ 357.442949] intel_powerclamp: Stop forced idle injection
[ 389.455697] intel_powerclamp: Start idle injection to reduce power
[ 393.455967] intel_powerclamp: Stop forced idle injection
[ 397.478979] intel_powerclamp: Start idle injection to reduce power
[ 401.479785] intel_powerclamp: Stop forced idle injection
[ 548.344665] chrome[3265]: segfault at 1f8 ip 00007f7e8ac2863f sp 00007fffd9265480 error 4 in i965_dri.so[7f7e8a8d5000+51c000]
[ 548.954533] chrome[3372]: segfault at 1f8 ip 00007fede229f63f sp 00007ffff1dc0e90 error 4 in i965_dri.so[7fede1f4c000+51c000]
[ 549.152357] chrome[3382]: segfault at 1f8 ip 00007fd000a4163f sp 00007fff3537ea60 error 4 in i965_dri.so[7fd0006ee000+51c000]
[ 768.798349] intel_powerclamp: Start idle injection to reduce power
[ 772.800659] intel_powerclamp: Stop forced idle injection
[ 776.827334] intel_powerclamp: Start idle injection to reduce power
[ 780.826651] intel_powerclamp: Stop forced idle injection
[ 796.853437] intel_powerclamp: Start idle injection to reduce power
[ 800.852951] intel_powerclamp: Stop forced idle injection
[ 804.972279] intel_powerclamp: Start idle injection to reduce power
[ 808.972072] intel_powerclamp: Stop forced idle injection
[ 812.992079] intel_powerclamp: Start idle injection to reduce power
[ 816.993884] intel_powerclamp: Stop forced idle injection
[ 821.115992] intel_powerclamp: Start idle injection to reduce power
[ 841.124582] intel_powerclamp: Stop forced idle injection
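
To put a number on how long the idle injection runs, the Start/Stop pairs in a log like the one above can be summed up. This is a rough sketch that assumes raw dmesg output with the bracketed timestamp at the start of each line, and the exact intel_powerclamp message wording shown in this report. The function name powerclamp_injection_time is hypothetical.

```shell
#!/bin/sh
# Read dmesg-style lines on stdin and report how many idle-injection
# episodes occurred and how long they lasted in total.
powerclamp_injection_time() {
    awk '
        /intel_powerclamp: (Start|Stop)/ {
            # Extract the timestamp between "[" and "]".
            ts = $0
            sub(/^\[ */, "", ts)
            sub(/\].*/, "", ts)
            if ($0 ~ /Start idle injection/) {
                start = ts
                active = 1
            } else if (active) {
                # A Stop line closes the episode opened by the last Start.
                total += ts - start
                n++
                active = 0
            }
        }
        END { printf "%d episodes, %.1f s total\n", n, total }
    '
}
```

Usage would be something like "dmesg | powerclamp_injection_time". An episode that is still running when the log ends is not counted.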

Because I don't believe the idle injection should be enabled at all, I tagged this with the kernel-bug-exists-upstream tag.

One additional datapoint: I tested with the vmlinuz-3.13.0-39-generic kernel, and I saw the same problem.

tags: added: bios-outdated-f.41
I found a workaround: running rmmod intel_powerclamp. This removed the kidle_inject threads and the speed returned to normal.
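
A small sketch of that workaround with a check first, so it does nothing when the module is not loaded. It assumes the standard /proc/modules format (module name in the first column). The helper names module_loaded and unload_powerclamp are invented for this example, and the second argument to module_loaded is only there for testing.

```shell
#!/bin/sh
# Return success if the named kernel module is currently loaded.
# Reads /proc/modules by default; the second argument is a hypothetical
# override used only for testing.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Unload intel_powerclamp for the current boot only (requires root).
unload_powerclamp() {
    if module_loaded intel_powerclamp; then
        rmmod intel_powerclamp
    else
        echo "intel_powerclamp is not loaded"
    fi
}
```

The module comes back after a reboot, so this is only useful for testing whether powerclamp is the cause of a slowdown.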

However, I looked at my CPU temperature, and it was 87 deg C, over the high threshold. It is possible that the idle injection is a new kernel feature that is kicking in to protect my CPU from overheating. Perhaps in older kernels I didn't have any protection or there was a different mechanism. Or perhaps something in the new kernel is causing the CPUs to run hot.

I suppose this bug could be closed, since it is possible that the symptom is actually a feature.

Joseph Salisbury (jsalisbury) wrote:

Yes, intel_powerclamp uses 85C as its default thermal limit. It spawns these kidle_inject threads to help slow down and cool the CPU. I believe there is a way to set the threshold higher, but it's probably not a good idea. It would be better to find out what is causing the CPU to get that hot. Is the fan working as expected? Maybe it has gotten really dusty and needs a cleaning. I know that Minecraft is pretty graphics intensive, so maybe run top before kidle_inject kicks in and see what the top processes are.

Lars Hansson (romabysen) wrote:

I have noticed the same thing. If I start playing a game in wine, say ETS2, on my laptop (Intel HD 4000+Nvidia 820M) it runs fine for maybe a minute and after that the FPS drops like a rock. The "problem", if you can call it that, is that the GPU temperature goes above 85C and thus powerclamp kicks in and tries to reduce the temperature.

Lars Hansson (romabysen) wrote:

Addition: the CPU temperature never goes above 80C when this happens and I don't think the GPU temperature is a problem, I think the driver or GPU itself will slow down if it gets too hot (but I could be wrong).

Lars Hansson (romabysen) wrote:

So after some more tests I got some interesting results. The problem seems to be that the NVidia card runs much hotter in Linux than in Windows. Playing "The Talos Principle" on Steam in Ubuntu the GPU temperature reaches 95-97C while in Windows 8 it hovers around 75-80C. Pretty much the same thing happens when running ETS2 in Windows and in Wine on Ubuntu.
However, while idle (not running a game) Ubuntu has a lower GPU temperature (~55C) than Windows (~66C).

I guess I'll be creating a new bug report about this.

William Katcher (katcherw) wrote:

The fans are working fine, although maybe not as fast as before, and the top tasks are java as expected.

There is something here that is new behavior, but I can't quite pin down what it is. Either:

1. The older kernels did not protect the CPU from overheating.
2. The older kernels used a different form of thermal protection such as reducing the clock frequency.
3. My system is suddenly running the cpu hotter.

I do wonder why the idle injection has to max out at 50%, instead of slowly ramping up until the temperature goes below the threshold.

Lars Hansson (romabysen) wrote:

Hi William,
How is your GPU temperature when this happens? Btw, as a more permanent temporary fix than rmmod you can blacklist it by creating the file /etc/modprobe.d/blacklist-intel_powerclamp.conf and adding the line "blacklist intel_powerclamp" to it.

As for my problem, I think it's a bit counter-intuitive that powerclamp starts to inject idle time when it's the *GPU* that is hot and using 50% of the CPU for this idle stuff almost right away is certainly excessive. It also takes a really long time after the temp has gone down for powerclamp to stop and during this time the system is very sluggish.
I have to say, it seems this is not working as it should.

William Katcher (katcherw) wrote:

My GPU seems fine, well under the high threshold. I can see the logic of throttling the CPU because the GPU is overheating: if the CPU is throttled, less data is fed to the GPU, so the GPU is indirectly slowed. But you're right, cutting the CPU by half is overkill. And that seems to be my problem as well. Without the powerclamp, my temperature goes slightly over the high threshold, maybe 87 deg with a threshold of 85 degrees, but the idle injection runs at 50% for quite some time.

I don't want to remove the powerclamp driver, since I don't want my CPU to overheat. But 50% CPU reduction to control temperature that is slightly exceeding the max makes my machine unusable.

Looking at the documentation for the powerclamp technology and the proc values, it looks like any level under 50% can be selected. I don't think it is powerclamp itself choosing the value, so there must be some thermal protection module that is choosing all or nothing. It really should gradually start injecting until the temperature is acceptable, IMO.

Lars Hansson (romabysen) wrote:

Yes, you have a point there about injecting idle time to the CPU to cool down the GPU. The thing is that modern GPUs (and CPUs) already have thermal protection, so I don't need intel_powerclamp to protect anything. Now, if I were interested in keeping the power usage down then powerclamp would be a good thing, but when I am playing a game I couldn't care less.
Perhaps there should be a way to disable powerclamp from userspace in a better way than rmmod.

William Katcher (katcherw) wrote:

As an experiment, I removed the powerclamp driver and monitored the temp sensors. The GPU never got close to high. The CPU has two thresholds, high of 86 C and critical of 100 C. With the powerclamp driver, the idle injection always kicked in at 86. Without it, the temp stayed at around 99 C, never going above 100. So it looks like you're right, the CPU must be throttling itself. The PC behaved as it used to, with no noticeable slowdown whatsoever.
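
For anyone who wants to repeat this kind of monitoring without extra tools, here is a minimal sketch that reads the kernel's thermal zones directly. It assumes the standard sysfs layout, where each thermal_zoneN directory has a type file and a temp file in millidegrees Celsius. The function name show_zone_temps and its base-directory argument are hypothetical.

```shell
#!/bin/sh
# Print each thermal zone's type and temperature in degrees Celsius.
# sysfs reports temperatures in millidegrees, so 87000 means 87.0 C.
show_zone_temps() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        milli=$(cat "$zone/temp")
        printf '%s (%s): %d.%d C\n' \
            "$(basename "$zone")" \
            "$(cat "$zone/type" 2>/dev/null)" \
            "$((milli / 1000))" "$(( (milli % 1000) / 100 ))"
    done
}
```

Running it in a loop, for example "while sleep 2; do show_zone_temps; done", shows whether the temperature levels off below the critical threshold once powerclamp is out of the picture. Which zone corresponds to the CPU package varies between machines.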

I will now blacklist the driver and have my PC like it was before.

I recommend that the powerclamp driver be disabled by default, as there must be many people who are experiencing unnecessary and dramatic performance reduction, and have no clue what is causing it.

Lars Hansson (romabysen) wrote:

I would think it a good idea if setting the pstate to "performance" would disable powerclamp.

I'd like to report that blacklisting intel_powerclamp and intel_rapl on my HP Probook 4530s appears to have mostly solved my "random I/O freezes" (mouse, keyboard, display, or audio cutting out for a few seconds at random), which also seem related to bug #1386721 .

I didn't see the syslog messages mentioned above, but I might just have a lower debug level.

René Jochum (pcdummy) wrote:

Had the same trouble, blacklisting intel_powerclamp and intel_rapl fixed it. Thanks to @krzysdrewniak.

sudo sh -c 'cat <<EOF > /etc/modprobe.d/blacklist-power.conf
# See: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1389077
blacklist intel_powerclamp
blacklist intel_rapl
EOF
'


Bathroom Humor (bafroomhumor) wrote:

I can see the value in this sort of thing, as well, when power or heat constraints exist.
Which is good for me since my laptop has bad heat dissipation. But this implementation is pretty bad. I've noticed that the idle injection doesn't stop for me until the process that caused it is killed. So if it starts up while playing a video on youtube, it won't stop injecting until I kill plugin container, even if the temps go below 60C. So that's truly annoying.
Also, my high temp threshold seems to be around 70C, which I feel is much too low. Is there any way to adjust that parameter, for future reference? I will likely just disable powerclamp until it works better on my machine.

derek (denc716) wrote:

I am running 15.10 (beta 2) on a laptop that I close every day (suspend/resume only, no shutdown or reboot), and I see the problem here again. Once in a while (every few days) the laptop becomes hot and the fan starts spinning. When I start top, I see kidle_inject using CPU, and this state can last for a few minutes.

➸ uname -a
Linux ubuntu-gnome 4.2.0-11-generic #13-Ubuntu SMP Mon Sep 21 21:33:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

top - 10:52:31 up 12 days, 2:20, 28 users, load average: 4.72, 2.12, 1.08
Tasks: 392 total, 4 running, 388 sleeping, 0 stopped, 0 zombie
%Cpu0 : 7.6 us, 31.0 sy, 0.0 ni, 50.9 id, 5.7 wa, 0.0 hi, 4.7 si, 0.0 st
%Cpu1 : 11.3 us, 33.4 sy, 0.0 ni, 49.5 id, 5.1 wa, 0.0 hi, 0.6 si, 0.0 st
%Cpu2 : 8.5 us, 34.1 sy, 0.0 ni, 53.8 id, 2.6 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu3 : 9.5 us, 32.8 sy, 0.0 ni, 55.7 id, 1.6 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu4 : 4.0 us, 32.8 sy, 0.0 ni, 62.3 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 9.1 us, 30.6 sy, 0.0 ni, 58.6 id, 1.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu6 : 7.6 us, 33.7 sy, 0.0 ni, 58.1 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 10.2 us, 33.3 sy, 0.0 ni, 56.1 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16339328 total, 14138952 used, 2200376 free, 282156 buffers
KiB Swap: 4194300 total, 2974696 used, 1219604 free. 5395296 cached Mem
Change delay from 3.0 to
 4951 root -51 0 0 0 0 S 31.3 0.0 0:42.40 [kidle_inject/6]
 4949 root -51 0 0 0 0 S 29.4 0.0 0:42.90 [kidle_inject/4]
30912 mint 20 0 2260932 589184 147688 R 28.7 3.6 171:07.25 /opt/google/chrome/chrome --user-data-dir=chrome-data-
 4947 root -51 0 0 0 0 S 28.0 0.0 0:40.94 [kidle_inject/2]
 4952 root -51 0 0 0 0 S 28.0 0.0 0:42.15 [kidle_inject/7]
 2517 mint 20 0 3835964 1.512g 535776 R 27.7 9.7 2:00.10 /opt/google/chrome/chrome --type=gpu-process --channel=30912.1563.1851560169 -+
 4950 root -51 0 0 0 0 S 27.4 0.0 0:41.86 [kidle_inject/5]
 4948 root -51 0 0 0 0 S 27.1 0.0 0:41.37 [kidle_inject/3]
 4946 root -51 0 0 0 0 S 26.7 0.0 0:39.94 [kidle_inject/1]
 4945 root -51 0 0 0 0 S 25.4 0.0 0:35.95 [kidle_inject/0]
 1742 mint 20 0 2090384 314432 45564 S 15.3 1.9 112:23.19 /usr/bin/gnome-shell
 1452 root 20 0 927420 123368 73024 R 9.1 0.8 74:23.03 /usr/bin/X vt7 -displayfd 3 -auth /run/user/999/gdm/Xauthority -nolisten tcp -background+
10128 mint 20 0 880868 130472 53444 S 5.9 0.8 16:29.60 /opt/google/chrome/chrome --type=renderer --enable-deferred-image-decoding --l+

Magnum (salazar-bruno) wrote:

Same here. Running Kubuntu 15.10 on an LG laptop.

I disabled the intel_powerclamp using this method:

 - Opened the "/etc/thermald/thermal-cpu-cdev-order.xml" file.
 - Removed the line referencing the intel_powerclamp.
 - Ran "service thermald stop" followed by "service thermald start".

This is probably easier than removing the whole kernel module.
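
The edit in the steps above could be scripted roughly as follows. This is a sketch that assumes the intel_powerclamp entry sits on a line of its own in the file, which I have not verified for every thermald version. The function name drop_powerclamp_entry is made up for this example.

```shell
#!/bin/sh
# Remove any line mentioning intel_powerclamp from a thermald
# cooling-device order file, keeping a .bak copy of the original
# next to it. Defaults to the path given in this comment thread.
drop_powerclamp_entry() {
    file="${1:-/etc/thermald/thermal-cpu-cdev-order.xml}"
    sed -i.bak '/intel_powerclamp/d' "$file"
}
```

It needs to run as root when pointed at the real file, and thermald has to be restarted afterwards for the change to take effect. If anything goes wrong, the .bak copy can be moved back into place.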

Lucas (lucas75) wrote:

I have a similar problem. As a workaround I changed the order of the thermal devices in the thermald configuration:

gedit /etc/thermald/thermal-cpu-cdev-order.xml


I tested with the following orders, and both worked fine:
   rapl, pstate, cpufreq
   cpufreq, rapl, pstate

Thiago Martins (martinx) wrote:

I'm seeing this problem on Ubuntu 16.10. Not too frequent.

Михаил (sqwoteg) on 2017-11-20
