Ubuntu

Tens of wakes per second in "[kernel scheduler] Load balancing tick" on Core 2 Duo even with only 1 core enabled

Reported by Flávio Etrusco on 2010-02-19
This bug affects 181 people
Affects: Arch Linux | Status: New | Importance: Undecided | Assigned to: Unassigned
Affects: linux-2.6 (Debian) | Status: Fix Released | Importance: Unknown
Affects: linux (Ubuntu) | Importance: Low | Assigned to: Unassigned
  Declined for Lucid by Jeremy Foshee
  Declined for Maverick by Jeremy Foshee
Affects: xorg (Ubuntu) | Importance: Undecided | Assigned to: Unassigned
  Declined for Lucid by Jeremy Foshee
  Declined for Maverick by Jeremy Foshee

Bug Description

powertop reports many wakeups per second (the number depends on the system) for the "[kernel scheduler] Load balancing tick" task, and the count rises under even light load. This affects many kinds of multi-core (?) systems; the original report was on a Core 2 Duo processor (T6500) with a single core enabled (multicore disabled in BIOS).

Cause of the problem:
A scheduler patch that arrived with kernel 2.6.32 (and was backported to some other versions as well) introduced this problem. Even though the problem first showed up in Lucid, it is NOT specific to Lucid or Ubuntu at all (see the Debian bug report at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=521944; it is reproducible in Arch Linux as well). Work is ongoing to get this fixed in the kernel, but it will take a long time until the fix reaches Ubuntu (see http://lkml.org/lkml/2010/7/6/172).

Workarounds that DO NOT work (they may improve the situation but do not solve it):
- maxcpus=1
- noapic
- nosmp
- nolapic
- use mainline kernel

Workarounds that DO (probably) work:
- tip version of kernel (git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git, from http://lkml.org/lkml/2010/7/8/75)
- use maverick's kernel with applied patches (https://launchpad.net/~brian-rogers/+archive/power, from comment #80)

ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: etrusco 1606 F.... pulseaudio
                      etrusco 15151 F.... foobar2000.exe
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfddf8000 irq 22'
   Mixer name : 'Realtek ALC269'
   Components : 'HDA:10ec0269,1b0a4009,00100004 HDA:11c11040,1b0a4007,00100200'
   Controls : 19
   Simple ctrls : 11
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xfebec000 irq 17'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100100'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Fri Feb 19 05:25:42 2010
DistroRelease: Ubuntu 10.04
EcryptfsInUse: Yes
MachineType: Philco PHN10XXX.
Package: linux-image-2.6.32-13-generic 2.6.32-13.18
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-13-generic root=UUID=d482e94f-9370-4ad2-9536-986541003db5 ro acpi.power_nocheck=1 acpi_osi=linux radeon.blacklist=yes
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-13.18-generic
Regression: No
RelatedPackageVersions: linux-firmware 1.29
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
TestedUpstream: No
Uname: Linux 2.6.32-13-generic i686
dmi.bios.date: 06/01/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1.01
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.vendor: PEGATRON CORP.
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 10
dmi.chassis.vendor: PEGATRON CORP.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1.01:bd06/01/2009:svnPhilco:pnPHN10XXX.:pvr1.01:rvnPEGATRONCORP.:rn:rvrTobefilledbyO.E.M.:cvnPEGATRONCORP.:ct10:cvrToBeFilledByO.E.M.:
dmi.product.name: PHN10XXX.
dmi.product.version: 1.01
dmi.sys.vendor: Philco

Flávio Etrusco (etrusco) wrote :
description: updated
Jeffrey Baker (jwbaker) wrote :

Confirmed on a ThinkPad X61. This is new in Lucid Alpha 3, wasn't there in Alpha 2.

Changed in linux (Ubuntu):
status: New → Confirmed
Jeffrey Baker (jwbaker) wrote :

I should also mention that I don't have a core disabled in the BIOS, I am using both cores. It shouldn't matter.

Jeffrey Baker (jwbaker) wrote :

# powertop -d
PowerTOP 1.12 (C) 2007, 2008 Intel Corporation

Collecting data for 15 seconds

Your CPU supports the following C-states : C1 C2 C3 C4
Your BIOS reports the following C-states : C1 C2 C3
Cn Avg residency
C0 (cpu running) (31.1%)
C0 0.0ms ( 0.0%)
C1 mwait 0.0ms ( 0.0%)
C2 mwait 0.1ms ( 0.0%)
C3 mwait 4.8ms (68.9%)
P-states (frequencies)
Turbo Mode 15.7%
  2.00 Ghz 0.1%
  1.60 Ghz 0.1%
  1200 Mhz 0.2%
   800 Mhz 84.0%
Disk accesses:
The application 'firefox-bin' is writing to file 'sessionstore-1.js' on /dev/sda1
The application 'firefox-bin' is writing to file 'sessionstore-1.js' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
Wakeups-from-idle per second : 145.1 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  71.1% (296.3) [kernel scheduler] Load balancing tick
   7.7% ( 32.2) [Rescheduling interrupts] <kernel IPI>
   3.7% ( 15.3)D firefox-bin
   3.9% ( 16.3) [acpi] <interrupt>
   3.3% ( 13.9) [iwlagn] <interrupt>
   2.5% ( 10.5) [i915@pci:0000:00:02.0] <interrupt>
   2.3% ( 9.5) PS/2 keyboard/mouse/touchpad interrupt
   1.0% ( 4.3) Xorg
   1.0% ( 4.0) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
   0.9% ( 3.6) [ahci] <interrupt>
   0.6% ( 2.6) [kernel core] hrtimer_start (tick_sched_timer)
   0.3% ( 1.2) gnome-terminal
   0.2% ( 1.0) gvfs-afc-volume
   0.2% ( 0.7) top
   0.2% ( 0.7) compiz
   0.1% ( 0.5) python
   0.1% ( 0.4) update-notifier
   0.1% ( 0.3) [eth1] <interrupt>
   0.1% ( 0.3) events/0
   0.1% ( 0.3) [kernel core] inc_rt_group (sched_rt_period_timer)
   0.1% ( 0.3) clock-applet
   0.0% ( 0.2) gnome-settings-
   0.0% ( 0.2) indicator-apple
   0.0% ( 0.2) gnome-panel
   0.0% ( 0.2) gnome-power-man
   0.0% ( 0.2) bdi-default
   0.0% ( 0.2) flush-8:0
   0.0% ( 0.2) rtkit-daemon
   0.0% ( 0.1) [kernel core] sk_reset_timer (tcp_delack_timer)
   0.0% ( 0.1) [kernel core] arm_supers_timer (sync_supers_timer_fn)
   0.0% ( 0.1) NetworkManager
   0.0% ( 0.1) rmmod
   0.0% ( 0.1) sshd
   0.0% ( 0.1) [kernel core] neigh_add_timer (neigh_timer_handler)
   0.0% ( 0.1) khungtaskd
   0.0% ( 0.1) [kernel core] add_timer (addrconf_verify)
   0.0% ( 0.1) events/1
   0.0% ( 0.1) ssh-agent
   0.0% ( 0.1) gnome-volume-ma
   0.0% ( 0.1) [kernel core] add_timer (sta_info_cleanup)
   0.0% ( 0.1) kerneloops
   0.0% ( 0.1) [kernel core] fib6_run_gc (fib6_gc_timer_cb)
   0.0% ( 0.1) rsyslogd

A USB device is active 100.0% of the time:
USB device 3-1 : BCM2045B (Broadcom Corp)

Suggestion: Enable USB autosuspend for non-input devices by pressing the U key

Suggestion: increase the VM dirty writeback time from 5.00 to 15 seconds with:
  echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
This wakes the disk up less frequently for background VM activity

Suggestion: Enable SATA ALPM link power management via:
  echo min_power > /sys/class/scsi_...
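
For anyone comparing several powertop dumps like the one above, the "Top causes for wakeups" rows can be pulled out with a few lines of Python. This is only a sketch: the row layout is inferred from the PowerTOP 1.12 output pasted in this thread (including the decimal-comma variant seen in localized output), and the names ROW and parse_wakeup_causes are made up for illustration.

```python
import re

# Matches powertop 1.x "Top causes" rows such as
#   "  71.1% (296.3) [kernel scheduler] Load balancing tick"
#   "   3.7% ( 15.3)D firefox-bin"
# and the decimal-comma variant ("42,3% (115,9) ...").
# The optional "D" is the flag character that appears right after the
# closing parenthesis on some rows of the pasted output.
ROW = re.compile(r"^\s*(\d+[.,]\d+)%\s*\(\s*(\d+[.,]\d+)\)(D?)\s+(.+?)\s*$")


def parse_wakeup_causes(text):
    """Return a list of (per_second, percent, name) tuples, largest first."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or unrelated output
        percent = float(m.group(1).replace(",", "."))
        per_second = float(m.group(2).replace(",", "."))
        rows.append((per_second, percent, m.group(4)))
    return sorted(rows, reverse=True)


# A few rows copied from the dump above.
sample = """\
Top causes for wakeups:
  71.1% (296.3) [kernel scheduler] Load balancing tick
   7.7% ( 32.2) [Rescheduling interrupts] <kernel IPI>
   3.7% ( 15.3)D firefox-bin
   3.9% ( 16.3) [acpi] <interrupt>
"""

causes = parse_wakeup_causes(sample)
for per_second, percent, name in causes:
    print(f"{per_second:7.1f}/s  {percent:5.1f}%  {name}")
```

Sorting by the per-second figure rather than by the percentage keeps dumps taken over different intervals comparable.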

Jeffrey Baker (jwbaker) wrote :

Notably, events/0 and events/1 are getting tons of CPU time:

# ps -fe | grep events
root 9 2 5 20:58 ? 00:04:46 [events/0]
root 1016 1 0 20:58 ? 00:00:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root 3473 2 49 22:19 ? 00:05:16 [events/1]
root 4370 4289 0 22:29 pts/1 00:00:00 grep events

Rephrasing the summary.
Indeed, the problem is much worse with both cores enabled; I reported it only because I was expecting no such wakeups at all with only 1 core.
nosmp, noapic and nolapic made no difference. Actually, with all of these enabled the system was bogged down with no apparent explanation.

summary: Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
- Core 2 Duo with 1 core disabled in BIOS
+ Core 2 Duo even if only 1 core enabled (1 disabled in BIOS)
davidb (davidb-csh) wrote :

I'd just like to report that I am having the same issue. I also have a Thinkpad x61, Core2. I just upgraded to Lucid a few hours ago. I tried disabling a core and saw the same problem. As Flávio wrote, I would have expected that to significantly reduce the number of wakeups. If I can be of any help in debugging this let me know.

Ryan Kavanagh (ryanakca) wrote :

Linked to Debian bug 521944 based on comment 84. I can confirm this happens under Ubuntu Lucid with an Intel Atom N280, so I don't think this is restricted to Core 2 Duo.

Changed in linux-2.6 (Debian):
status: Unknown → Incomplete
Leif Walsh (leif.walsh) wrote :

What is incomplete about this bug? I am getting 500 wakeups consistently from this load balancing tick, on an x200s, with latest Lucid.

Flávio Etrusco (etrusco) wrote :

Funny how the comments in the Debian tracker say "worksforme" and suggest powertop is outdated, without any data.
Latest powertop here lists this:

Top causes for wakeups:
  44,1% (199,2) <kernel core> : hrtimer_start_range_ns (tick_sched_timer)
  26,5% (119,6) firefox-bin : hrtimer_start_range_ns (hrtimer_wakeup)
  12,0% ( 54,2) <interrupt> : extra timer interrupt
   7,8% ( 35,2) <interrupt> : ath, HDA Intel
   1,8% ( 8,0) <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)
   1,3% ( 6,0) <interrupt> : ata_piix, ata_piix, uhci_hcd:usb5, uhci_hcd:usb7
Segmentation fault (core dumped)

And yes, it coredumps.

Flávio Etrusco (etrusco) wrote :

Leif: what was marked incomplete is the Debian bug entry.

I observe this problem on a single-core Intel Atom Z520 CPU with HyperThreading enabled using Ubuntu 10.04 Lucid beta 2. Based on that and on comment #8 here, I edited the bug summary to remove the CPU-specific part.

summary: - Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
- Core 2 Duo even if only 1 core enabled (1 disabled in BIOS)
+ Tens of wakes per second in "[kernel scheduler] Load balancing tick"
Flávio Etrusco (etrusco) wrote :

The problem is that I reported this bug because I don't expect "load-balancing" wake-ups on a single-CPU setup. I don't know what the expected number of wake-ups is with multiple CPUs or HyperThreading. If you can reproduce the problem when running without HyperThreading, then maybe this is the same bug or a related one.

Flávio Etrusco (etrusco) wrote :

It would be nice to know if this (apparent) bug also occurs on non-Intel CPUs...

Flávio Etrusco (etrusco) wrote :

A similar test case on an Athlon64 CPU shows far fewer wake-ups:

Top causes for wakeups:
  38.3% (170.0)D firefox-bin
  22.4% ( 99.4) pulseaudio
  13.8% ( 61.2) [nvidia] <interrupt>
  10.8% ( 48.0) PS/2 keyboard/mouse/touchpad interrupt
   7.8% ( 34.5) [kernel scheduler] Load balancing tick
   2.5% ( 11.3) [pata_via] <interrupt>

summary: - Tens of wakes per second in "[kernel scheduler] Load balancing tick"
+ Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
+ Core 2 Duo even with only 1 core enabled
tags: added: upstream
removed: needs-upstream-testing
lukefeil (lukefeil88) wrote :

Similar on an ASUS 1005PE with an Intel N450

Top causes for wakeups:
  42.9% (263.6) [kernel scheduler] Load balancing tick

Andrew Henry (adhenry) wrote :

I had an HP laptop with an Intel Core 2 Duo 7200 and this was never an issue. With Ubuntu 10.04 on a Thinkpad Edge 13 with an Intel Core2Duo CULV CPU, this is an issue.

Leif Walsh (leif.walsh) wrote :

Is there any way to just turn off load balancing? I'd be eager to sacrifice a little performance for a large (almost 100%) gain in battery life.

permalloy (permalloy) wrote :

Possibly bug 552020 is a duplicate of this one?

mihai007 (mihai-ile) wrote :

Dell XPS M1330, Intel T8300 on Ubuntu 10.04 final gives:

Top causes for wakeups:
  50.6% ( 63.5) [kernel scheduler] Load balancing tick
  12.1% ( 15.1) [ata_piix] <interrupt>
  10.7% ( 13.4) [iwlagn] <interrupt>
   7.1% ( 8.9) [extra timer interrupt]
   4.0% ( 5.0) syndaemon
   1.6% ( 2.0) [nvidia] <interrupt>

This problem is huge, 50% of cpu wakeups!?

Andrew Henry (adhenry) wrote :

In post #17 I said I had an HP that I never had an issue with. I didn't... with Ubuntu 9.10. I did a fresh install of 10.04 and now have exactly the same issue. Different CPU, different wireless adapter, etc. Obviously, this is a kernel issue and not hardware specific.

verwa (laurent-arsonore) wrote :

Similar on an ASUS V1SN with an Intel Core 2 Duo T7700 (Ubuntu 10.04)

Top causes for wakeups:
  47.9% ( 63.5) [kernel scheduler] Load balancing tick
  12.2% ( 15.1) [ata_piix] <interrupt>
  9.7% ( 13.4) [iwlagn] <interrupt>
   8.1% ( 8.9) [extra timer interrupt]

---
+ acpi errors

UBUNTU kernel: [ 0.186131] ACPI Error: ACPI path has too many parent prefixes (^) - reached beyond root node (20090903/nsaccess-429)
..
UBUNTU kernel: [ 0.222562] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

lores (lores000) wrote :

Same problem here: HP 6720s Intel(R) Pentium(R) Dual CPU T2370, Ubuntu 10.04 LTS

On 08/05/10 11:16, Pawel wrote:
> It seems the Phoronix does confirm this issue:
>
> http://www.phoronix.com/scan.php?page=article&item=linux_windows_part2&num=1
>

That's general power consumption, isn't it? They compare graphics card
drivers and power consumption... I do not see any mention of this
particular issue with kernel tick wakeups.

--Andrew

Michael Christensen (conphara) wrote :

I can confirm this bug on a Core 2 Duo (E6500) and a Pentium D (standard D, not Extreme). Tested with Powertop 1.12. Seems to wake up more under load and wake up less when idle.
Both with Lucid installed (clean installs), fully updated.

stephen (hlxshady) wrote :

Same issue for my Lenovo T400 with Core 2 Duo (P8400),
a brief description of my case is posted here
http://ubuntuforums.org/showthread.php?t=1390055#7

There was a daily update today ... my kernel was just updated to 2.6.32-22, but its appetite for power hasn't changed at all ... :(

Michael Christensen (conphara) wrote :

I have tested the boot options "idle=halt" and "processor.max_cstate=1" (or cstate=2), adding them one after the other rather than all at the same time. None of these options fixes the wakeups on a Core 2 Duo (E6500) or on a Pentium D (both desktop PCs).

This is what it looks like in Powertop when scrolling a page in Firefox:
Top causes for wakeups:
42,3% (115,9) [kernel scheduler] Load balancing tick

This is what it looks like when idle:
8,2% (7,0) [kernel scheduler] Load balancing tick

Andrew Henry (adhenry) wrote :

On 08/05/10 20:42, Michael Christensen wrote:
> This is what is looks like in Powertop when scrolling a page in Firefox:
> Top causes for wakeups:
> 42,3% (115,9) [kernel scheduler] Load balancing tick
>
> This is what it looks like when idle:
> 8,2% (7,0) [kernel scheduler] Load balancing tick
>

I'm getting at least 25% on Load balancing tick even when idle. In fact, it
decreases when I'm actively using the CPU!

Flávio Etrusco (etrusco) wrote :

We're going nowhere with this bug; we didn't even get word on whether this is expected, a powertop bug (the discussion in Debian doesn't hold up), or whatever.
Since this happens in mainline kernels, I guess somebody will have to get the balls to post to LKML or the kernel bugzilla 8-)

LKML knows, there's even a patch somewhere:
http://lkml.org/lkml/2010/5/9/20

Luca Aluffi (aluffilu) wrote :

Maybe the problem is wider, as it extends to the Intel Atom too. Here is my N270 in an ASUS 1201NL:

Wakeups-from-idle per second : 130,6 interval: 5,0s
Power usage (ACPI estimate): 9,4W (2,9 hours)

Top causes for wakeups:
  22,0% ( 38,0) [kernel scheduler] Load balancing tick
  13,7% ( 23,6) [ath9k] <interrupt>
  13,0% ( 22,4) firefox-bin
  12,2% ( 21,0) [extra timer interrupt]

Flávio Etrusco (etrusco) wrote :

The post(s) linked by Michael clearly show this is a general problem with multi-core, not specific to any CPU model. See: http://lkml.org/lkml/2010/4/26/249

FWIW, the patch I linked to (which I just got around to actually trying) doesn't seem to help on my netbook at all.

I've had some success with Daniel Hollocher's linux-ck ppa: https://launchpad.net/~chogydan/+archive/ppa

It doesn't fix everything, but the BFS making such a difference does make it clear this is a kernel issue and not just a few of us having strange hardware...

--bornagainpenguin

So in all likelihood, we're going to have to shut up and put up with it until
the next Ubuntu release, when we get a new kernel? Or is there any chance
whatsoever that this will be backported?

> --
> Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
> Core 2 Duo even with only 1 core enabled
> https://bugs.launchpad.net/bugs/524281
> You received this bug notification because you are a direct subscriber
> of the bug.
>

Bartosz Skowron (getxsick) wrote :

Can't believe this bug has been open for months...

Michael Christensen (conphara) wrote :

I have installed two mainline kernels (http://kernel.ubuntu.com/~kernel-ppa/mainline/), .31 and .33, just to see whether those kernels changed the number of wakeups. Both kernels had about the same number of wakeups as the Lucid kernel (.32), which led me to wonder whether this could be a userspace bug.

Both kernels made Powertop report :
   28,1% ( 7,2) [kernel scheduler] Load balancing tick
   27,7% ( 7,1) [kernel core] hrtimer_start (tick_sched_timer)

The fact of the matter is that mainline kernels (at least the ones from the PPA) are not making things better.
Luckily kernel developer Suresh Siddha is working on a patch, that simply has to be backported to Lucid.

These kernel wakeups have to be sorted out, since they are one of the reasons why Linux power consumption is much higher than on other platforms.

Kimmo Ahola (kimmo-ahola) wrote :

Is there a workaround? Currently my laptop is burning my legs off..

Here's my output of "sudo powertop -t60 -d"

skhawam (s-khawam) wrote :

Was having this problem as well on my Dell Mini 10v.

Seems like a kernel patch was posted yesterday:

http://lkml.org/lkml/2010/5/17/350

Rocko (rockorequin) wrote :

I applied that patch to the 2.6.34 amd64 kernel and it does reduce the load balancing wakeups somewhat on my dual core PC. It went from around 50 per second to 35 after the patch (and that's running Skype and iwlagn, which average 10 and 15 wakeups per second respectively - the load balancing wakeups are higher when more things are running: when I am not running X, the load balancing wakeups are much fewer (under 10)).

I also tried compiling a 2.6.34 kernel *without* multi-core support for my 32 bit single core PC, and it does reduce the number of load balancing wakeups (although to my surprise they still happen).

@Flávio Etrusco (comment 15): it looks like your nvidia driver is causing a lot of wakeups, which you can fix with the xorg.conf option:

Option "OnDemandVBlankInterrupts" "True"

and if you're worried about power usage, the nvidia driver defaults to max performance, which means max power usage. You can put an entry in xorg.conf that makes it reduce performance (and power) when on battery but still give max performance on AC for gaming, etc (see eg http://linux.aldeby.org/nvidia-powermizer-powersaving.html for details).

Orcie (mef-vanschalkwijk) wrote :

Can someone give me a hand with applying the patch, please?

> Status in “linux” package in Ubuntu: Confirmed
> Status in “linux-2.6” package in Debian: Incomplete
>
> Bug description:
> powertop reports above 70 wakes per second in "[kernel scheduler] Load
> balancing tick" task, and above 200 when there's any little load, running on
> a Core 2 Duo processor (T6500) with a single core enabled (multicore
> disabled in BIOS).
> Will still try noapic, nolapic, maxcpus and nosmp in the boot parameters
> and reproduce it with the mainline kernel.

Rocko (rockorequin) wrote :

@Orcie: Well, applying the patches was slightly complicated, especially as one of the patches didn't work, so I had to edit a file by hand. But if you're game to try, I created a thread for it at:

http://ubuntuforums.org/showthread.php?p=9334524#post9334524

Hmmm...the Canonical developers could've fixed this bug, the systemtray icon background bug, several other major bugs that plague Lucid, showing they actually do give a flying Shuttleworth about their users...or they could pull at their roots and fap to Windicators...

Guess which one they chose to do?

Brian Rogers (brian-rogers) wrote :

This is a kernel bug that needs to be (and is being) fixed by upstream kernel developers.

Leif Walsh (leif.walsh) wrote :

How soon after this patch gets accepted do you think we can expect a
backport? Anything I can do to make that estimate shorter?

alx5000 (alx5000) wrote :

> Brian Rogers #45
>
> This is a kernel bug that needs to be (and is being) fixed by upstream kernel developers.

I'm aware of that, but in the meantime, I'd rather be able to work on my laptop for longer than an hour, and without burning my legs off.

>> I'm aware of that, but in the meantime, I'd rather be
>> able to work on my laptop for longer than an hour,
>> and without burning my legs off.

LOL! "Ubuntu: Upstream will get to it eventually..." was there a motto change recently I was not aware of? What happened to "Ubuntu: Linux for Human Beings" that one was good, back when it was accurate! Does Canonical actually do anything any more other than make themes and re-implement failed window controls from the 90s?

Brian Rogers (brian-rogers) wrote :

If somebody has a fix, it needs to be sent upstream to be vetted and committed anyway. It's good to do quality control before distributing an update to everyone that only some people will need.

Also, it isn't a matter of "upstream will get to it eventually". There is already a patch. I plan to set up a PPA with the patch soon, so people can easily try it out. I'm not sure if this will get backported to stable, since it's a change to how the scheduler works. Somebody will have to inquire upstream.

Richard Kleeman (kleeman) wrote :

I have this problem as well in a Lenovo X300 laptop. This has a core 2 duo and an intel graphics driver. I tried powertop under 3 different system states and got some interesting results.

Regular full boot with X running (desktop):

Number of wakeups per second was around 220 and [kernel scheduler] Load balancing tick was at about 80 wakeups per second and main culprit.

Full Boot without X running

Number of wakeups per second was around 70 and [kernel scheduler] Load balancing tick was at about 25 wakeups and the main culprit

Root shell (recovery mode boot)

Number of wakeups per second was around 10 (!!!!) and [kernel scheduler] Load balancing tick was at about 0.5 wakeups

So it looks like this problem grows with CPU load, particularly load from video drivers, though many other sources can trigger it too. Sounds like a bad kernel bug to me.

My laptop runs about 5C warmer with Lucid compared to Karmic so this is very annoying.
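
As a rough check on the three measurements above, the share of all wakeups attributed to the load balancing tick can be worked out per system state. This is a sketch only: the figures are the approximate ones quoted in the comment, and the names states and tick_share are hypothetical.

```python
# (total wakeups/s, load-balancing-tick wakeups/s): approximate figures
# quoted in the comment above, not fresh measurements.
states = {
    "desktop with X": (220.0, 80.0),
    "full boot without X": (70.0, 25.0),
    "recovery-mode root shell": (10.0, 0.5),
}


def tick_share(total, tick):
    """Percentage of all wakeups that come from the load balancing tick."""
    return 100.0 * tick / total


shares = {name: tick_share(total, tick) for name, (total, tick) in states.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0f}% of wakeups from the load balancing tick")
```

With these numbers the tick accounts for roughly 36% of wakeups both with and without X, and about 5% in the recovery shell, so its share stays similar while the absolute count follows overall activity.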

Richard Kleeman (kleeman) wrote :

This is definitely a kernel issue. I ran powertop with the latest Lucid kernel (2.6.32) and for comparison the last Karmic kernel (2.6.31) and under the Karmic kernel the number of wakeups from a standard desktop with no apps open is HALVED (!!!) from 220 down to 100. I am switching back to the Karmic kernel until they fix this.
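
For reference, the size of that drop can be computed directly. The sketch below uses only the two totals quoted above (220 and 100 wakeups per second), plus the 50 to 35 change reported earlier in the thread for the patched 2.6.34 kernel; percent_reduction is a hypothetical helper name.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Lucid kernel (2.6.32) vs last Karmic kernel (2.6.31), idle desktop totals.
karmic_vs_lucid = percent_reduction(220.0, 100.0)

# Load balancing wakeups before and after the LKML patch on 2.6.34.
patched_vs_unpatched = percent_reduction(50.0, 35.0)

print(f"Karmic kernel: {karmic_vs_lucid:.1f}% fewer wakeups than Lucid")
print(f"Patched 2.6.34: {patched_vs_unpatched:.1f}% fewer load balancing wakeups")
```

On those figures the Karmic kernel shows slightly more than a halving (about 54.5%), while the patch alone gives about 30%.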

KK (karldialal) wrote :

I have exactly the same experience (going back to the 2.6.31 kernel makes everything better)... so I agree that it is definitely a kernel issue.

Richard Kleeman (kleeman) wrote :

Confirmed on two laptops now, a Thinkpad X300 and a T60. Wakeups from this source drop by over 50% after reverting to the Karmic kernel, and the laptops run at least 5C cooler. I notice, though, that even the old kernel has these wakeups as an issue, which suggests even more room for improvement. I hope the kernel devs notice this and push the bug upstream.

Judging from the forums and Google searches this is affecting a LOT of people...............

Orcie (mef-vanschalkwijk) wrote :

This bug has been going on for quite some time. As Richard Kleeman indicated,
the problem does not only affect the kernels from Lucid. What is a
realistic time frame for this bug to be solved? I really love
Ubuntu/Linux, but this makes me think about reverting to m$....

Leif Walsh (leif.walsh) wrote :

Dear all who think this bug is taking too long:

You're right. It's been outstanding for quite some time now, and
really should have been fixed before release, or else the 2.6.32
kernel shouldn't have been accepted.

As it is, the bug is not fixed. The best people for the job are the
linux kernel devs, not ubuntu employees/contributors. They have a
patch and are in the process of reviewing it. Believe me when I tell
you that you want them to finish reviewing it.

Once it's accepted, it'll be pushed in the next 2.6.32 maintenance
release. I expect to see this in .15, if it's not already in .14
(which got pushed last week, and since I haven't been following this
patch, I don't know if it's in there). If it goes in .15, it'll be
another few weeks to couple of months. Once that happens, the ubuntu
people can package it up and release it as a maintenance update.

Depending on the climate, you may also see this patch in a newer,
backported kernel (2.6.33 and above), so if you enable backports you
might see it sooner.

Either way, there's nothing you can do except either wait for a newer
kernel or downgrade to an older kernel (2.6.31 is a good choice, as
evidenced by some people above).

Another way to soften the blow of this bug is simply to give the load
balancer less to do. I've noticed that, while it certainly causes far
more wakeups than it should, the wakeups *scale with the load* of the
computer. Therefore, if you do all the things you normally do to
reduce wakeups, the load balancer will be less of an issue as well.

Whatever you choose to do (I'm just sticking with the kernel and
living with the bug...it's not so bad with my usage pattern), be
patient. You'll get the fix soon enough, and soon after that you'll
forget it was ever a problem.

<wink>If that's not enough for you, go install slackware.</wink>

Orcie (mef-vanschalkwijk) wrote :

Very encouraging Leif. My impatience is more at rest now.

Brian Rogers (brian-rogers) wrote :

I put a 2.6.34 kernel with the patches for this issue in this PPA: https://launchpad.net/~brian-rogers/+archive/power

It's not perfect, but it does reduce load balancer wakeups some. A mainline build with fully up-to-date scheduler code might do better.

Leif Walsh (leif.walsh) wrote :

Thanks for your work.

I tested this, and it actually looks to have made things worse, if anything.

Here are four powertop logs, each run with -d -t 300. Prepatch is the
stock ubuntu kernel as of today, postpatch is your build. Without
load is standard gnome stuff and xmonad running, plus daemons, and
nothing else. With load adds several browser tabs, one of which was
playing a long youtube video throughout.

You'll notice that things get worse postpatch in both scenarios, but
probably not to a statistically significant degree.

Perhaps I don't suffer from this bug?

If you get the chance, since you already have the machinery to make
debian builds (and I've never actually figured out the proper way to
do this for the kernel), do you think you could build me a copy of the
same thing, but with the timer frequency set to 100Hz? Feel free to
just mail me debs rather than uploading them to the ppa, if you can do
this for me.

Brian Rogers (brian-rogers) wrote :

Leif, what was your pre-patch kernel? I'd be interested in a comparison with this kernel:
https://launchpad.net/ubuntu/maverick/+source/linux/2.6.34-5.12

It is the same as my PPA kernel except for the patch, so we can look at the effect of just the patch.

Also, I took a look at Lucid and Maverick's current kernel config and found that for both, -generic kernels on i386 have CONFIG_HZ=250 and amd64 has CONFIG_HZ=100. Seems like an odd choice to have two different values depending on architecture. It could be a mistake... Also, all the kernels have CONFIG_NO_HZ set.

If we want to test different configs, I can set them up as flavours and do -100hz and -250hz kernels on my next upload.
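
For anyone who wants to check which tick settings their own kernel was built with, here is a minimal Python sketch. It assumes the distribution installs the build config as a text file next to the kernel (on Ubuntu that is usually /boot/config-<release>); the function name `tick_settings` is made up for illustration.

```python
import os
import re


def tick_settings(config_text):
    """Extract CONFIG_HZ and CONFIG_NO_HZ from kernel build-config text."""
    settings = {"CONFIG_HZ": None, "CONFIG_NO_HZ": False}
    for line in config_text.splitlines():
        line = line.strip()
        # Only the exact CONFIG_HZ=<n> line counts, not CONFIG_HZ_250=y etc.
        match = re.fullmatch(r"CONFIG_HZ=(\d+)", line)
        if match:
            settings["CONFIG_HZ"] = int(match.group(1))
        elif line == "CONFIG_NO_HZ=y":
            settings["CONFIG_NO_HZ"] = True
    return settings


if __name__ == "__main__":
    # Assumed location of the build config for the running kernel.
    path = "/boot/config-" + os.uname().release
    if os.path.exists(path):
        with open(path) as handle:
            print(tick_settings(handle.read()))
    else:
        print("no build config found at", path)
```

Running it on an i386 and an amd64 install side by side should show the 250 vs. 100 difference mentioned above, if the config files are where the sketch expects them.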

Brian Rogers (brian-rogers) wrote :

OK, I found the reason for 250 Hz on i386:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/438234

Brian, the kernel from your PPA does indeed seem to help a little bit. I was getting an average of something like 140 wakes per second and now the average is around 100.

My computer still seems to spend a lot of time doing housekeeping... for example a recent powertop output:

 31,2% ( 78,4) [extra timer interrupt]
  21,1% ( 53,0) [kernel scheduler] Load balancing tick
  18,5% ( 46,4) [iwlagn] <interrupt>
   9,3% ( 23,5) firefox-bin
   4,0% ( 10,0) [kernel core] r600_audio_update_hdmi (r600_audio_update_hdmi)
   4,0% ( 10,0) ubuntuone-syncd
   3,7% ( 9,4) [kernel core] hrtimer_start (tick_sched_timer)

70 wakeups doing something I care about (wireless and Firefox) and 160 doing... other stuff.

Anyway thanks for the kernel, at least it's something until a fix gets backported officially! (Which I hope will happen, this being LTS and all.)
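
The "70 useful vs. 160 other" split above can be reproduced with a small Python sketch. It assumes the powertop "top causes" line format shown in this thread (percentage, rate in parentheses, then the name) and treats wireless and Firefox as the "useful" sources; the helper names are invented for this example.

```python
import re

# percentage, "( rate)", optional D flag, then the source name
LINE = re.compile(r"^\s*[\d.,]+%\s*\(\s*([\d.,]+)\)D?\s*(.+)$")


def parse_powertop(text):
    """Return (name, wakeups_per_second) pairs from a top-causes listing."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            # Some locales print a decimal comma, as in the listing above.
            rate = float(match.group(1).replace(",", "."))
            rows.append((match.group(2).strip(), rate))
    return rows


def split_wakeups(rows, wanted):
    """Sum the rates whose name contains a `wanted` substring, and the rest."""
    useful = sum(rate for name, rate in rows if any(w in name for w in wanted))
    other = sum(rate for _, rate in rows) - useful
    return useful, other


listing = """
 31,2% ( 78,4) [extra timer interrupt]
  21,1% ( 53,0) [kernel scheduler] Load balancing tick
  18,5% ( 46,4) [iwlagn] <interrupt>
   9,3% ( 23,5) firefox-bin
   4,0% ( 10,0) [kernel core] r600_audio_update_hdmi (r600_audio_update_hdmi)
   4,0% ( 10,0) ubuntuone-syncd
   3,7% ( 9,4) [kernel core] hrtimer_start (tick_sched_timer)
"""

useful, other = split_wakeups(parse_powertop(listing), ["iwlagn", "firefox"])
print(f"useful: {useful:.1f}/s, other: {other:.1f}/s")
```

With the listing above this comes out at about 70 useful and about 160 other wakeups per second, matching the rough figures in the comment.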

Leif Walsh (leif.walsh) wrote :

On Tue, Jun 1, 2010 at 4:54 PM, Brian Rogers <email address hidden> wrote:
> Leif, what was your pre-patch kernel? I'd be interested in a comparison with this kernel:
> https://launchpad.net/ubuntu/maverick/+source/linux/2.6.34-5.12

I'll check this tonight if I get a chance, thanks.

> It is the same as my PPA kernel except for the patch, so we can look at
> the effect of just the patch.
>
> Also, I took a look at Lucid and Maverick's current kernel config and
> found that for both, -generic kernels on i386 have CONFIG_HZ=250 and
> amd64 has CONFIG_HZ=100. Seems like an odd choice to have two different
> values depending on architecture. It could be a mistake... Also, all the
> kernels have CONFIG_NO_HZ set.
>
> If we want to test different configs, I can set them up as flavours and
> do -100hz and -250hz kernels on my next upload.

Yeah, I checked your kernel config and it's 250Hz so I don't think
that's the cause of the extra wakeups, but I'll make sure the stock
kernel is also 250 tonight. I don't think it's 100, and I really hope
it isn't 1000.

--
Cheers,
Leif

Rocko (rockorequin) wrote :

fwiw, I built a 64-bit 1000Hz kernel (with CONFIG_NO_HZ also set) and found I got a lot more load-balancing wakeups compared to the stock 100Hz kernel.

Trevor Walker (trebawa) wrote :

I had this problem too, and I found that checking ~/.xsession-errors showed that vino-server was restarting very, very often. Killing it cut my wakeups from the load balancing tick dramatically. I read on the Ubuntu forums that someone had a similar situation with Metacity, so I would suggest checking that log.
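
A quick way to see whether some program is spamming that log is to count how many lines mention it. The following is only a sketch: it assumes the log is plain text at ~/.xsession-errors and that a restarting program leaves a line containing its name each time; the program names are just the two mentioned above.

```python
import os
from collections import Counter


def count_mentions(log_text, names):
    """Count how many log lines mention each program name."""
    counts = Counter()
    for line in log_text.splitlines():
        for name in names:
            if name in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    path = os.path.expanduser("~/.xsession-errors")
    if os.path.exists(path):
        with open(path, errors="replace") as handle:
            counts = count_mentions(handle.read(), ["vino-server", "metacity"])
        for name, hits in counts.most_common():
            print(f"{hits:6d}  {name}")
```

A count in the thousands for one name would be a hint that it is worth killing that program and watching powertop again.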

Arvid Norlander (anmaster) wrote :

So when can we expect a fix for this in the stock lucid kernel (amd64)? This reduces battery life by more than half compared to jaunty on my Thinkpad R500 (Core 2 Duo P8400). I didn't stay on karmic long enough to check this (only about 2 days).

Changed in linux (Ubuntu):
status: Confirmed → Triaged
tags: added: kernel-core kernel-needs-review
Kristofer (kbitner) wrote :

This bug confirmed on an eeepc 1005HA with intel atom N280.

Can anyone confirm this bug on Debian??

Did some powertop reports on battery with both the lucid kernel and the mainline 2.6.34 kernel.

NOTE: This file contains both tests, the lucid kernel on top, and the mainline kernel below.

James Ward (jamesward) wrote :

This is happening to me on the latest maverick with 2.6.35-7-generic-pae.

Andy Whitcroft (apw) on 2010-07-12
tags: added: kernel-candidate kernel-reviewed
removed: kernel-needs-review
tags: removed: kernel-candidate
Andy Whitcroft (apw) on 2010-07-15
Changed in linux (Ubuntu):
importance: Undecided → Low
mabawsa (mabawsa) wrote :

If this affects the time the user can use the laptop on battery so much, surely the importance should be set to critical (I am getting double the life using Windows 7 even with all that powertop can do).
It's difficult to convince users to try linux if you say it will decrease the performance of the machine.

alx5000 (alx5000) wrote :

I wholeheartedly agree with mabawsa. I've downgraded to Karmic for this very
reason. My battery lasts as long as it should, and my laptop very rarely
overheats (it ran hot as hell with Lucid).

Antonio (tritemio) wrote :

I agree with the last two comments. Battery lasts half of the time it
lasts on windows 7 and the laptop overheats significantly on a large
spectrum of machines.

This should be a critical bug.

BTW, is there an "official" kernel we can test against this bug? If
yes can anyone post a link please?

Thanks,
Antonio

I also agree... compared to Karmic, battery life is cut in half.

timo skrempel (timoskrempel) wrote :

I am considering switching to karmic or another distro because of this bug. It has been open for half a year now, right?

Jeremy Davis (jedmeister) wrote :

I would think that in this age of mobile computing, with so many notebooks and netbooks around, this would be considered a very serious bug for this large (and growing) user group. I haven't done enough testing to confirm, but I think this bug is also affecting many other (perhaps all) Lucid systems (I think it may just be more apparent on laptops because of the noticeably reduced battery life). Some ad hoc testing has shown that a very basic 10.04 server install (running on KVM) is chewing noticeably more cpu cycles. Also, I think the ever growing focus on power efficiency and concerns about reducing environmental footprint would be factors.

As a workaround I have found that using the Karmic kernel restores battery life (and reduces overheating). Whilst it works, it's hardly an ideal situation and not really useful for newb users. This sort of serious regression, especially when left unaddressed, is a really unfortunate blight on such a promising, notable and user friendly distro. Even more so when this is an LTS. I was hoping to run 10.04 on a number of work netbooks that I administer. Unfortunately it looks like that may not be a realistic option.

Is there anything ppl like myself can do to assist getting this bug fixed asap?

Jeremy Davis (jedmeister) wrote :

Sorry, please disregard the comment re this bug affecting a server system - I don't think that is related.

Richard Kleeman (kleeman) wrote :

I would also agree that this really needs fixing. I have tried a large range of kernels now ranging from 2.6.31 through to 2.6.35 (maverick kernel) and compared wakeups using powertop. The only sensitivity in this list is that the earliest kernel (Karmic kernel 2.6.31) shows half the wakeups of all the other kernels. Such a result is consistent with the many reports above of a halving of battery life between Karmic and Lucid.

I also note that this issue was raised on the kernel mailing list in April and May and a patch was suggested, but there was no follow-up. Note that this is a Debian bug as well. My strong suspicion is that this is a basic problem with the kernel and switching to another distro would not help. If this is not the case I would welcome evidence that another distro with a kernel in the range 2.6.32-2.6.35 DOES NOT have this issue.

If it is an upstream kernel problem I think that someone needs to lodge a critical bug on the kernel bugzilla.

Richard Kleeman (kleeman) wrote :

Some further searching turned up this relevant thread on the kernel mailing list:

http://www.listware.net/201007/linux-kernel/16253-high-power-consumption-in-recent-kernels.html

A patch is tested there which resulted in a very big reduction in wakeups and power consumption. There is a lot of technical discussion in that thread which I did not understand but which seems important. Comments by an Ubuntu kernel expert would be very welcome.....

Richard Kleeman (kleeman) wrote :

Here is the relevant thread on lkml:

http://lkml.org/lkml/2010/7/6/172

Notice that it is very recent (<10 days ago) and that Arjan van de Ven, the author
of powertop and a lead kernel developer working for Intel, is involved in the thread.

Brian Rogers (brian-rogers) wrote :

For anyone who wants to test a kernel with the scheduling changes that will be merged in 2.6.36, I've uploaded a kernel to this PPA:
https://launchpad.net/~brian-rogers/+archive/power

This is Maverick's v2.6.35-rc5 kernel with the sched/core branch from the tip tree merged in.

The build system is backlogged right now, so it might be about 24 hours before the builds are actually available.

Jan-Philipp Litza (jplitza) wrote :

Well, I was about to rant half an hour ago about people complaining here without understanding the issue, but Richard clearly explained the issue - thanks for that.

As for other distros, I just booted the Arch Linux I had installed some time ago to test this. The older powertop version 1.11 shows the wakeups as coming from "hrtimer_start_range_ns (tick_sched_timer)" instead of "[kernel scheduler] Load balancing tick", but has exactly the same numbers as the newer powertop 1.12. Arch was running 2.6.34-ARCH, my Ubuntu is running 2.6.32-24 as well as 2.6.34.1, and the figures are identical, so NO, switching the distro doesn't help!

I'll rework the description in a moment to better describe the cause of the problem, list possible workarounds (namely downgrading to 2.6.31 or upgrading to -tip/Brian's ppa) and link to the lkml threads.

description: updated
Richard Kleeman (kleeman) wrote :

I tested Brian Rogers' 2.6.35-rc5 kernel with the scheduling changes and saw little difference on an Intel(R) Core(TM)2 Duo CPU (Thinkpad X300). Here is the powertop output with a firefox browser open:

Wakeups-from-idle per second : 308.9 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  30.6% ( 79.3) [kernel scheduler] Load balancing tick
  19.2% ( 49.9) [extra timer interrupt]
  16.5% ( 42.9) [kernel core] hrtimer_start (tick_sched_timer)
   0.1% ( 0.3)D upowerd
   5.9% ( 15.2) [iwlagn] <interrupt>
   0.3% ( 0.8)D gnome-settings-
   4.6% ( 11.9) firefox-bin
   4.2% ( 11.0) nautilus
   3.9% ( 10.0) syndaemon
   3.1% ( 8.1) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
   2.2% ( 5.7) [ahci] <interrupt>
   1.6% ( 4.3) [TLB shootdowns] <kernel IPI>
   0.0% ( 0.0)D flush-8:0
   1.2% ( 3.1) Xorg

Compared with the standard Lucid kernel, the Load balancing tick wakeups are down about 10% while the hrtimer_start (tick_sched_timer) are up some.

These results compare with using the Karmic kernel where wakeups were halved.

Brian: Did you apply the patch noted in the lkml thread I posted above? The claim there is a substantial wakeups reduction with that applied.

Brian Rogers (brian-rogers) wrote :

I didn't include that before, so now I'm uploading a new kernel (2.6.35rc5-power2.1) with the patch from http://lkml.org/lkml/2010/7/8/122

On my netbook (single cpu Atom N450) the -power1 kernel improves the situation considerably: when idling there is a negligible amount of wakeups again (very good), while under work the situation is somewhat better but there are still a lot of load balancing ticks (a bit better).

If I understand correctly, on a single core system they should be totally absent, shouldn't they?

(BTW: I'd really like if solving this bug would double battery life, but sadly this is quite pessimistic, or optimistic depending on the point of view)

Thanks for the backporting/patching! I'm going to try the new one.

mabawsa (mabawsa) wrote :

I already have a patched kernel on my system for another issue. Could anybody point me to the right patches to test?

skhawam (s-khawam) wrote :

Hi Brian,

I tried the 2.6.35rc5-power2 kernel and it's definitely better than even the Karmic 2.6.31-20 I was using. With no load it reduces the wakes and power (I get 6ms C4 state as opposed to 3ms). However, with a simple load on (chrome, thunderbird, empathy and an ssh client running) I get around ~300 wakes per second and 3ms C4 state. The power consumption is much better but I think there is still room for improvement. This is the best kernel so far.

I have a Dell Mini 10v, with Atom N270 processor.

Sami

Richard Kleeman (kleeman) wrote :

OK I tried the power2 kernel and there is a definite improvement. Here are the latest powertop numbers:

Wakeups-from-idle per second : 207.9 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  27.3% ( 33.8) [kernel scheduler] Load balancing tick
   0.7% ( 0.9)D gnome-settings-
  12.3% ( 15.3) [extra timer interrupt]
   9.8% ( 12.1) [iwlagn] <interrupt>
   9.8% ( 12.1) firefox-bin
   8.9% ( 11.0) nautilus
   8.1% ( 10.0) syndaemon
   3.4% ( 4.2) [TLB shootdowns] <kernel IPI>
   2.5% ( 3.1) Xorg
   2.4% ( 3.0) [kernel core] hrtimer_start (tick_sched_timer)

You can see substantial reductions in the wakeups from all three kernel sources as well as the total number of wakeups. The reductions are pretty much consistent with the numbers reported in the kernel thread.
Also subjectively the laptop does appear to be running a bit cooler (maybe 4C).

Just for clarification Brian, you patched with the second patch in the thread right? (two were mentioned and the second was better).

Looks like there is still room for improvement but at least this is progress. Pretty simple patch too by the look of it.
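
To put numbers on "substantial", the two powertop listings from this comment and the earlier 2.6.35-rc5 test can be compared with a few lines of Python. This is just arithmetic on the figures already posted; the dictionary labels and the function name are chosen for this example.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# wakeups per second: (first 2.6.35-rc5 PPA build, power2 build)
figures = {
    "Load balancing tick": (79.3, 33.8),
    "[extra timer interrupt]": (49.9, 15.3),
    "hrtimer_start (tick_sched_timer)": (42.9, 3.0),
    "total wakeups-from-idle": (308.9, 207.9),
}

for name, (before, after) in figures.items():
    drop = percent_reduction(before, after)
    print(f"{name:35s} {before:6.1f} -> {after:6.1f}  ({drop:.1f}% lower)")
```

Keep in mind that both runs were single 15 second samples with a browser open, so the percentages are indicative at best.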

Brian Rogers (brian-rogers) wrote :

Yeah, I used the second patch. I applied it on top of the tip/sched/core branch. I'll attach it for convenience.

I get the following results in Arch Linux:

Top causes for wakeups:
  25,0% ( 25,1) <kernel IPI> : Rescheduling interrupts
  19,0% ( 19,1) <kernel core> : hrtimer_start_range_ns (tick_sched_timer)
  11,3% ( 11,3) firefox : hrtimer_start_range_ns (hrtimer_wakeup)
  11,2% ( 11,2) <interrupt> : ath
  10,2% ( 10,3) <kernel core> : hrtimer_start (tick_sched_timer)
   9,9% ( 9,9) radeon/1 : queue_delayed_work (delayed_work_timer_fn)
   8,1% ( 8,1) <interrupt> : ahci
   1,0% ( 1,0) powernowd : hrtimer_start_range_ns (hrtimer_wakeup)
   0,5% ( 0,5) <interrupt> : ohci_hcd:usb4, ohci_hcd:usb6, radeon
   0,5% ( 0,5) kwin : hrtimer_start_range_ns (hrtimer_wakeup)

2.6.34-ARCH #1 SMP PREEMPT Mon Jul 5 22:12:11 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ AuthenticAMD GNU/Linux

tags: added: patch
Jorge O. Castro (jorge) wrote :

Brian Rogers,

Thanks for the PPA, it's made this easier to test. My wake ups are down from ~250 to about ~45. I asked Jeremy Foshee from the kernel team on how best to continue testing this:

 - We should probably see how the LKML discussion evolves and see if there's a final fix. When the patch is accepted upstream we can put it in Maverick.
 - At some point after testing since this is affecting Lucid users we can cherrypick it into an update.
 - We can use this bug to give feedback on the patch, so if it's improved your situation (or worsened it) then we can help collectively test.

Caleb Case (calebcase) wrote :

Before: ~80 [kernel scheduler] Load balancing tick
After: ~20

2.6.35rc5-power2-generic #1-Ubuntu SMP Sat Jul 17 08:39:54 UTC 2010 x86_64 GNU/Linux
Intel(R) Core(TM) i5 CPU M 450

Much improved, thanks!

Richard Kleeman (kleeman) wrote :

Jorge and Brian,

Thanks for your input and ppa. Much appreciated. My reading of the LKML discussion is that something rather serious was screwed up in the implementation of the nohz part of the kernel scheduler and that the patch was a first test of this idea. The results there and in this bug thread seem to confirm that suspicion, but I would be more comfortable if the two kernel developers in that thread (Peter Zijlstra and Arjan van de Ven) had responded further on the issue. Perhaps an Ubuntu kernel developer should engage with these devs on LKML to clear this issue up...

Brian Rogers (brian-rogers) wrote :

It looks like nohz_ratelimit will be reverted for 2.6.35, which is essentially the change included in the power2 kernel. But the power2 kernel also includes changes scheduled for 2.6.36. So to ensure that the revert alone is enough to solve the problem, I'm uploading a power3 kernel with only the revert.

If the power3 kernel is fine, then this will be solved in the final 2.6.35 kernel and therefore Maverick.

Now the bad news: this fix can't be backported to Lucid's 2.6.32 because it's simply the removal of something added after 2.6.32. In other words, Lucid suffers from this problem for a different reason. On the bright side, it's apparently something that was fixed later, so it should be possible to look at post-2.6.32 scheduler changes and find a fix.

I installed power2 and noticed some issues where, under small load (playing
a video), the mouse cursor would move very slowly. Has anyone else seen
this? I can try to reproduce later this week.

timo skrempel (timoskrempel) wrote :

I have success with power2: better battery usage (about 2 watts less according to powertop) and no other problems (no slow mouse as above). No backport to 2.6.32 is indeed bad news. Many users might not even know they have this problem and have to wait quite a while for a fix. Anyway, I am happy now. Thanks.

Allan Pratt (apratt-) wrote :

Leif, what you wrote is either encouraging or not, depending on how I'm reading it. I think you're saying that the upstream 2.6.32 doesn't have this problem, so the fix won't come from that direction. But Lucid's 2.6.32 (somehow) does have this problem. Where does it come from, and is anybody looking at it?

The bottom line is user support. I hope the responsible people on the Ubuntu project are keeping this in mind. If there's a heat and power problem in Ubuntu Lucid LTS, then there should be a fix for Ubuntu Lucid LTS. Desktop users don't know or care about what problems or fixes exist "upstream" or in Maverick. LTS means "We'll take care of you with updates." Right?

Leif Walsh (leif.walsh) wrote :

I installed the latest (at the time) build from Brian's ppa. I saw
some power improvements (not a lot IIRC), but gained the mouse bug.
This is in contrast with the default lucid kernel (up to date as of
yesterday I think).

I don't really know what you mean by encouraging, and I definitely
wasn't saying anything about upstream.

I completely agree, fixes need to be backported into lucid when they
come in. LTS should include the guarantee that a big issue like this
doesn't leave an LTS release out cold for another three years while
"less-stable" releases charge ahead without the bug.

timo skrempel (timoskrempel) wrote :

update: I get a jerky mouse pointer under high load with power2 too.

Al Sutton (al-sutton) wrote :

Can anyone confirm if this is the cause of the high load on EC2?

I've 2 instances which have been "upgraded" to 10.04 LTS and both sit around 1.0 and peak at 4 or 5. When the same software is run on the same instance type but under 8.04 LTS the load rarely goes above 0.5.

That problem has been open for nearly 3 months at https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910 so I'm hoping there's some activity here that will at least ease the situation if not resolve it altogether.

I tried power2 on the Eee with Atom N450 and, while I can't be sure that there are fewer wakeups, I feel a general sluggishness everywhere in the system, not only with the pointer. I'm going back to power1 for now.

Brian Rogers (brian-rogers) wrote :

Have you tried power3?

You are really fast! :D

I'm going to report again asap. Thanks!

Leif Walsh (leif.walsh) wrote :

Not yet, have been busy. Making a mental note to try it out, though
I'll be occupied again at least tonight, maybe longer.

Unfortunately, I have no clue how to reproduce the mouse thing
reliably. I will also try to do this when I get a chance.

timo skrempel (timoskrempel) wrote :

Quick update: I get the mouse lag under load with power2 and power3.

Con:
Using power2 and power3, I get very high Xorg CPU usage (80-90% just for Xorg) when watching a video, which does not happen using 2.6.32. The mouse lag goes without saying when watching a video.

Pro:
Very few load balancing ticks when idle (maybe 7-9 per second). Under average to heavy load, those ticks do rise.

Leif Walsh (leif.walsh) wrote :

-power3 has the same mouse problem I saw earlier.

If anyone can tell me what to do to log something meaningful, I'd be happy
to try to get you a log of the offending behavior. Let me know.

Brian Rogers (brian-rogers) wrote :

I've uploaded a pair of rc6-based kernels, rc6-nopatch1 and rc6-power3. The 'nopatch' kernel is the baseline to compare against, and rc6-power3 has the proposed revert, the same as rc5-power3. Between these two kernels, the only difference is the proposed change.

I'd like to see some testing for the responsiveness issue with these two kernels. Try to quantify the issue in some way if you can. For example, if the CPU usage is different between the two kernels, then I'd like to see the numbers from top for both kernels.

If we confirm that the patch is indeed the cause of the problem, I'll relay the message to the developers and they'll figure out what to do next.

Leif Walsh (leif.walsh) wrote :

Thanks for setting these up. Sorry I haven't been able to give a better bug
report than "annoying behavior" so far, I should have time to investigate
tomorrow.

For what it's worth, there's a weird thing that happens on some android
phones where the touch screen is "coarsely quantized" if you try to
overclock it to a speed incompatible with the chip. It's at best weak
speculation, but maybe something funky is happening where X is thinking the
cpu it's on is at a different speed than it's really running at. Honestly, I
should try to entertains the patch before making such claims.

Leif Walsh (leif.walsh) wrote :

*understand, not entertain. Damn autocomplete.

Richard Kleeman (kleeman) wrote :

I tried the rc6-power3 kernel and these are the powertop numbers

Wakeups-from-idle per second : 258.3 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  35.2% ( 70.1) [kernel scheduler] Load balancing tick
  20.7% ( 41.1) [extra timer interrupt]
   6.5% ( 12.9) [iwlagn] <interrupt>
   6.0% ( 11.9) firefox-bin
   5.5% ( 11.0) nautilus
   5.0% ( 9.9) syndaemon
   4.1% ( 8.1) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
   2.4% ( 4.9) [kernel core] hrtimer_start (tick_sched_timer)
   2.2% ( 4.3) [TLB shootdowns] <kernel IPI>

So the results are worse than with the rc5-power2 kernel but better than the original rc5 Brian Rogers kernel in that there are no wakeups from

[kernel core] hrtimer_start (tick_sched_timer)

which amounts to about 40 wakeups.

These changes are clearly influencing this problem, but I feel we need an expert (kernel dev) to interpret what is going on with the benchmarks from the various code changes.

Brian Rogers (brian-rogers) wrote :

It's expected that the new kernel won't work as well as rc5-power2, since that kernel had additional scheduler enhancements aside from reverting nohz_ratelimit. But those other scheduler changes won't go into the kernel until 2.6.36 so they can get more testing. The important thing to determine right now is whether there are any regressions going from rc6-nopatch1 to rc6-power3. If there are, they need to be dealt with. But if not, this can go into 2.6.35 with no downsides.

Richard Kleeman (kleeman) wrote :

Brian, thanks for the clarification. One thing I am still a little unclear on is what the -tip
kernel is that is mentioned in the lkml thread. Does this have a different set of patches?

Brian Rogers (brian-rogers) wrote :

It's just a branch where changes to the scheduler and other related code go before they are merged into the mainline kernel. By running a -tip kernel, you get those changes ahead of time.

FWIW, I applied the whole bunch of http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c4efd6b569b2646e1346a08a4c40286f8bcb5f11 to the stable 2.6.35.1 and it fixes the wake ups mostly. I still have ~20 wakeup/s from load balance tick but it's way down from ~200 on my system.

Specifically the "Revert nohz_ratelimit() for now" commit makes it better, I had a kernel with only that one applied and it helped as well.

Thanks Brian,

linux-image-2.6.35-power3-generic dropped my Wakeups from 250 per second to 9.

Michael Steel (boozezela) wrote :

The patched kernel is no good for me: powertop is still showing 369 wakeups per second and my CPU runs constantly at 73 degrees when idle (and yes, the heatsink is clean, the governor is set to powersave and so on).

Long story short, I have a 15 degrees difference between Windows XP and Lucid.

lofas (henrik-lofas) wrote :

Brian, I have tried the power3-generic patch and it improves some from the nopatch kernel. The wakeups are halved but the load average of the system is still running high. Output from powertop and top.

power3:

top - 10:22:59 up 14 min, 4 users, load average: 0.13, 0.28, 0.23
Tasks: 183 total, 1 running, 182 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.8%us, 1.4%sy, 0.0%ni, 92.5%id, 0.8%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 4008656k total, 1358760k used, 2649896k free, 80588k buffers
Swap: 3534264k total, 0k used, 3534264k free, 519024k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 1230 root 20 0 160m 28m 12m S 4 0.7 0:29.74 Xorg
 2663 lofas 20 0 928m 160m 14m S 3 4.1 0:34.86 chrome
 2563 lofas 20 0 599m 57m 26m S 2 1.5 0:11.25 chrome
 2683 lofas 20 0 59272 14m 9436 S 2 0.4 0:13.09 npviewer.bin

 PowerTOP version 1.12 (C) 2007 Intel Corporation

Cn Avg residency P-states (frequencies)
C0 (cpu running) ( 7.7%) Turbo Mode 1.9%
polling 0.0ms ( 0.0%) 2.40 Ghz 0.2%
C1 mwait 0.2ms ( 0.0%) 1.60 Ghz 0.8%
C2 mwait 0.7ms ( 7.8%) 800 Mhz 97.1%
C6 mwait 2.0ms (84.5%)

Wakeups-from-idle per second : 539.4 interval: 10.0s
Power usage (5 minute ACPI estimate) : 0.3 W (244.4 hours left)

Top causes for wakeups:
  21.8% (140.4) npviewer.bin
  16.7% (107.6) [iwlagn] <interrupt>
  13.9% ( 89.5) [extra timer interrupt]
  11.9% ( 76.4) [kernel scheduler] Load balancing tick
  10.1% ( 65.2) [i915] <interrupt>
   5.1% ( 32.6) USB device 1-3.2 : USB-PS/2 Optical Mouse (Logitech)

NOPATCH:

top - 10:29:56 up 3 min, 3 users, load average: 0.17, 0.37, 0.17
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.2%us, 4.6%sy, 0.0%ni, 88.6%id, 0.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 4008656k total, 1287232k used, 2721424k free, 75072k buffers
Swap: 3534264k total, 0k used, 3534264k free, 458580k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 1171 root 20 0 172m 30m 12m S 8 0.8 0:12.70 Xorg
 1994 lofas 20 0 210m 15m 11m S 4 0.4 0:00.78 gnome-terminal
 2417 lofas 20 0 59384 14m 9424 S 3 0.4 0:04.12 npviewer.bin
 1814 lofas 20 0 325m 35m 8696 S 2 0.9 0:01.84 compiz
 2038 root 20 0 12172 2540 1344 S 1 0.1 0:00.42 powertop

PowerTOP version 1.12 (C) 2007 Intel Corporation

Cn Avg residency P-states (frequencies)
C0 (cpu running) (10.1%) Turbo Mode 1.0%
polling 0.0ms ( 0.0%) 2.40 Ghz 0.1%
C1 mwait 0.0ms ( 0.0%) 1.60 Ghz 0.3%
C2 mwait 0.5ms ( 5.5%) 800 Mhz 98.6%
C6 mwait 1.6ms (84.4%)

Wakeups-from-idle per second : 641.8 interval: 10.0s
Power usage (5 minute ACPI estimate) : 0.4 W (160.6 hours left)

Top causes for wakeups:
  22.2% (173.8) [kernel ...

Rocko (rockorequin) wrote :

FWIW, the mainline 2.6.35.2 kernel has reverted the nohz_ratelimit patch now (http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.35.y.git;a=commit;h=1dc89aec877583e3e42421be77b063724a4bbb07) so Maverick should get this improvement (ie have a reduced number of load balancing ticks).

I tried 2.6.35.2 with the other load balancing patches (that convert to a 'push' model IIRC) and on my laptop the number of load balancing ticks drops from around 50 to perhaps 35 per second. Power usage drops by approximately one watt.

Michael Steel (boozezela) wrote :

Wakeups-from-idle per second : 628.5 interval: 10.0s
Power usage (ACPI estimate): 28.0W (1.8 hours)

There are good chances that I will have to bin the CPU and replace it very soon since now it overheats even during boot and randomly spikes to 80 degrees Celsius when idle. :/

devsk (funtoos) wrote :

This is because of CONFIG_SCHED_DEBUG and CONFIG_SCHEDSTATS. Get rid of these and the wakeups shown in "[kernel scheduler] Load balancing tick" go away. I just ran into this by accident and compared my config with a previous version where I wasn't seeing it.

This was basically triggered by CONFIG_LATENCYTOP selection in the menuconfig.

I think that this bug is affecting my machine too (xps 1330m): within a few minutes after boot, the "load balancing tick" spikes to between 400 and 500 and everything becomes unusable. It's too slow even to surf or watch a movie. I've been using Ubuntu since 06.10, but until this gets fixed I'm desperate enough to use Vista instead.

Right now i'm using 2.6.35-power3-generic and this is my powertop result:

63.6% (404.1) [kernel scheduler] Load balancing tick
   2.7% ( 16.9)D chrome
   5.2% ( 33.0) docky
   3.7% ( 23.6) [Rescheduling interrupts] <kernel IPI>
   3.0% ( 19.1) desktopcouch-se

Brian Rogers (brian-rogers) wrote :

Based on devsk's comment, I've uploaded two kernels for lucid to my power-saving PPA:
https://launchpad.net/~brian-rogers/+archive/power

The first, 2.6.35-0unpatched+18.24~lucid, is essentially just maverick's 2.6.35-18.24 built for lucid. Note that despite the name 'unpatched', it still has the patch for this bug because that has been included in maverick's standard kernel.

The second kernel, 2.6.35-power+18.24~lucid, should be identical except that CONFIG_LATENCYTOP, CONFIG_SCHED_DEBUG, and CONFIG_SCHEDSTATS have been removed from the configuration.

Does anyone else observe the same thing as devsk, that on a kernel with the latencytop and scheduler statistics options disabled, wakes per second are lower? Or do they remain the same?

(BTW, I also uploaded a 'power' kernel for maverick, which can be compared against 2.6.35-18.24.)
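
For anyone who wants to check which of these three options a given kernel was built with before comparing numbers, here is a minimal Python sketch. It assumes the usual kernel config format (`CONFIG_X=y` for enabled options, `# CONFIG_X is not set` otherwise); the function name `enabled_options` and the sample text are made up for illustration, and whether these options really cause the wakeups is still unconfirmed.

```python
import re

# Options named in this thread as possible triggers (unconfirmed).
SUSPECT_OPTIONS = ("CONFIG_LATENCYTOP", "CONFIG_SCHED_DEBUG", "CONFIG_SCHEDSTATS")

def enabled_options(config_text, options=SUSPECT_OPTIONS):
    """Return the sorted subset of `options` set to y or m in a kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in options and match.group(2) in ("y", "m"):
            enabled.add(match.group(1))
    return sorted(enabled)

# Hypothetical excerpt of a config file, for illustration only.
SAMPLE = """\
CONFIG_SCHED_DEBUG=y
# CONFIG_SCHEDSTATS is not set
CONFIG_LATENCYTOP=y
CONFIG_HZ=250
"""

print(enabled_options(SAMPLE))
```

On Ubuntu the config of an installed kernel is normally found in /boot/config-<kernel version>, so the contents of that file can be passed to the function instead of the sample text.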

TomasHermosilla (thermosilla) wrote :

running powertop in maverick gives the following results:

Top causes for wakeups:
  22,5% (387,2) [kernel scheduler] Load balancing tick
  22,3% (384,4) VirtualBox
   9,4% (162,6) [Function call interrupts] <kernel IPI>
   8,6% (148,2) [extra timer interrupt]
   8,1% (138,8) chromium-browse
   7,5% (128,4) [i915] <interrupt>

CPU: Intel Core 2 Duo T6570 @2.10GHz (x2)
Kernel: Linux 2.6.35-17-generic-pae

Brian,

Installing your 2.6.35-power+18.24~lucid kernel package actually increased the number of wakes in my case. Also, it broke brightness adjustment on my laptop. Or am I installing the wrong kernel? I basically added your PPA, and then did sudo apt-get install linux-image-2.6.35-power+18-generic along with the header files.

I just installed the new version of 2.6.35-power+18-generic and I saw a performance improvement. The load balancing ticks do not peak as they did with the previous version.

Top causes for wakeups:
  23.6% (152.9)D chrome
  25.6% (166.4) [kernel scheduler] Load balancing tick
  13.8% ( 89.7) [extra timer interrupt]
   4.4% ( 28.6)D notify-osd
   5.4% ( 35.1) docky
   0.0% ( 0.0)D flush-8:0

Brian Rogers (brian-rogers) wrote :

I want to do a controlled test.

On lucid, I'm interested in how 2.6.35-0unpatched+18.24~lucid and 2.6.35-power+18.24~lucid compare.
On maverick, I'm interested in how 2.6.35-18.24 and 2.6.35-power+18.24 compare.

Basically, I'm investigating devsk's comment about the effect of certain config settings.

I suggest the following test on a freshly booted idling system:
sudo powertop -d -t 300

That will aggregate data for five minutes.

My results for the "power" kernel:

Wakeups-from-idle per second : 249.2 interval: 300.0s
no ACPI power usage estimate available
Top causes for wakeups:
  29.7% ( 72.9) [kernel scheduler] Load balancing tick
  14.2% ( 34.9) docky
   8.6% ( 21.0) [extra timer interrupt]
   8.1% ( 19.9) desktopcouch-se
   6.3% ( 15.5) [ata_piix] <interrupt>
   5.6% ( 13.8) [iwlagn] <interrupt>
   4.9% ( 12.1) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)

And for the "unpatched":

Wakeups-from-idle per second : 228.8 interval: 300.0s
no ACPI power usage estimate available
Top causes for wakeups:
  31.7% ( 85.6) [kernel scheduler] Load balancing tick
  12.9% ( 34.9) docky
   9.1% ( 24.5) [extra timer interrupt]
   7.4% ( 19.9) desktopcouch-se
   5.8% ( 15.5) [ata_piix] <interrupt>
   5.0% ( 13.4) [iwlagn] <interrupt>
   4.5% ( 12.1) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
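
To make this kind of comparison less error-prone, a small Python sketch like the following could diff two dumps per cause. It assumes the line layout shown above (percentage, wakeups per second in parentheses, then the cause); the optional `D` marker and the decimal commas are handled only because both appear in outputs posted in this thread. The names `parse_wakeups` and `compare` are illustrative, not part of powertop, and the sample data is just the first three rows of each table above.

```python
import re

# percentage, wakeups/s in parentheses, optional "D" marker, then the cause name
LINE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)%\s*\(\s*(\d+(?:[.,]\d+)?)\)(D)?\s+(.+?)\s*$")

def parse_wakeups(dump):
    """Map each cause in a 'Top causes for wakeups' list to its wakeups per second."""
    causes = {}
    for line in dump.splitlines():
        match = LINE.match(line)
        if match:
            # Accept decimal commas from non-English locales.
            causes[match.group(4)] = float(match.group(2).replace(",", "."))
    return causes

def compare(baseline, candidate):
    """Return {cause: candidate - baseline} for causes present in both dumps."""
    old, new = parse_wakeups(baseline), parse_wakeups(candidate)
    return {name: round(new[name] - old[name], 1) for name in old if name in new}

UNPATCHED = """
  31.7% ( 85.6) [kernel scheduler] Load balancing tick
  12.9% ( 34.9) docky
   9.1% ( 24.5) [extra timer interrupt]
"""
POWER = """
  29.7% ( 72.9) [kernel scheduler] Load balancing tick
  14.2% ( 34.9) docky
   8.6% ( 21.0) [extra timer interrupt]
"""

for cause, delta in compare(UNPATCHED, POWER).items():
    print(f"{delta:+6.1f}  {cause}")
```

A negative number means fewer wakeups per second on the "power" kernel for that cause. Causes that appear in only one dump are skipped, so the totals still need to be compared separately.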

Rocko (rockorequin) wrote :

FWIW, the 2.6.36-rc3 kernel (modified only with some desktop responsiveness patches) shows a much lower rate of load balancing ticks. On my system 2.6.35.4 (unmodified) was showing 50-60 ticks per second, or 35-40 with the load balancing patches, while 2.6.36-rc3 is showing around 15-22.

Richard Kleeman (kleeman) wrote :

Rocko,
Is that kernel available for testing anywhere? Sounds interesting...

Rocko (rockorequin) wrote :

@Richard: The easiest way to try 2.6.36-rc3 is to grab Ubuntu's deb packages from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.36-rc3-maverick. I don't think the desktop responsiveness patches will affect the load balancing tick, but if you're interested in building a kernel with them, I got the kernel sources from http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.36-rc3.tar.bz2 and the responsiveness patches from the email trail at http://lkml.org/lkml/2010/8/26/327 (it wasn't straightforward; patches 8 and 9 needed modification to work in 2.6.36, and also note that by default the enhancements are disabled).

Richard Kleeman (kleeman) wrote :

Rocko,
Tried it and there wasn't a great deal of difference overall on my Thinkpad X300

The Load balancing ticks dropped some but
were replaced by a new item called kworker/0:0

Here is the result from powertop -d

Wakeups-from-idle per second : 226.3 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  20.0% ( 28.1) kworker/0:0
  13.5% ( 19.0) [kernel scheduler] Load balancing tick
  13.0% ( 18.3) firefox-bin
  12.4% ( 17.4) [extra timer interrupt]
   7.1% ( 10.0) nautilus
   5.4% ( 7.6) [iwlagn] <interrupt>
   3.5% ( 4.9) syndaemon
   3.2% ( 4.5) [TLB shootdowns] <kernel IPI>
   3.2% ( 4.5) [kernel core] hrtimer_start (tick_sched_timer)
   2.9% ( 4.1) apt-check
   2.5% ( 3.5) [acpi] <interrupt>
   2.1% ( 3.0) Xorg

Rocko (rockorequin) wrote :

I get the kworker wakeups as well, although they are not always higher than the load balancing tick - I found if I left things alone they dropped much lower.

The most interesting thing for me was that applying the load-balancing patches to 2.6.35 saved 1-2 W in the estimated ACPI power usage, and I get that drop in 2.6.36 without having to apply patches. I presume that's because they have been incorporated into the newer kernel.

Jeremy Foshee (jeremyfoshee) wrote :

Declining the Maverick specific nomination for now and leaving this open against the actively developed Ubuntu kernel (which happens to be Maverick at this time). Will re-open the nomination should a fix be narrowed down which we can confirm specifically resolves this issue in Maverick.

Sidnei da Silva (sidnei) wrote :

The thread at:

  https://groups.google.com/group/zen_kernel/browse_thread/thread/71a306f1a3a7b318?pli=1

Mentions this patch which seems slightly meaningful:

   http://lkml.indiana.edu/hypermail/linux/kernel/1007.3/01096.html

And this other patch, which seems like it's the real thing:

  http://lkml.org/lkml/2010/5/17/350

Still trying to find the actual patch that got applied to 2.6.36...

Andy Whitcroft (apw) wrote :

@Sidnei -- the meat of those changes seems to be the first one. This fix has already hit the Maverick kernel via stable. Could you test the Maverick kernel (this should work on Lucid no problem) and report whether that helps at all? Please report back here.

Sidnei da Silva (sidnei) wrote :

@Andy:

I'm running a fully-up-to-date Maverick, and still getting tons of wakeups from load balancing ticks. It's consistently the top one, at:

  28.1% (200.1) [kernel scheduler] Load balancing tick

That's with:

  Linux sidnei-laptop 2.6.35-20-generic #29-Ubuntu SMP Fri Sep 3 14:55:28 UTC 2010 x86_64 GNU/Linux

Also running a self-compiled 2.6.35-rc3, it is amazing on my Atom netbook, but
Load Balancing ticks are still skyrocketing on my Athlon II dual core. Do I need
to enable any config option I have missed? (Used the mainline kernel PPA on
the Atom, compiled myself for the Athlon II to include a custom fix.) Maybe I
need to retry with the Maverick config.

Fabio Albieri (chareos) wrote :

Suffering this problem too.
Maverick with latest updates (maverick-security / maverick-updates enabled only)

Makes my laptop unusable and ubuntu not fitting its purpose.

Elias Julkunen (eliasj) wrote :

Hi! I'm using Ubuntu 10.10 with the latest updates. Here are my results for "sudo powertop -d -t 300" with three different kernels: default, mainline and Brian's. I ran that command right after I got to the desktop, so there were no programs open.

Elias Julkunen (eliasj) wrote :
Elias Julkunen (eliasj) wrote :

The latest kernel update, 2.6.32-25-generic, has changed things a bit, at least in my case. Granted, I have changed some things to prevent CPU wakeups by adding hpet=force nohz=off highres=off to the GRUB CMDLINE since the 2.6.32-24 kernel update. At that time load balancing ticks were still appearing, even when idle. With the 2.6.32-25 update there are wakeups, but under a different name: [Rescheduling interrupts] <kernel IPI>. What is more surprising is that the wakeups no longer appear while idle, as they did with 2.6.32-24 and earlier kernels.
I use a selfmade powersaving script but tested wakeups in Powertop before I activated the script and here are the results. NB! I have turned off some services. This was done on a desktop C2D E6550 with Lucid installed.
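
In case someone wants to verify that these boot parameters actually reached the kernel before comparing numbers, here is a rough Python sketch. It assumes a simple space-separated command line (quoted values containing spaces are not handled); the helper names and the sample string, including its image path and root device, are hypothetical.

```python
# Parameters mentioned above; values are the ones given in this comment.
WANTED = {"hpet": "force", "nohz": "off", "highres": "off"}

def parse_cmdline(cmdline):
    """Split a kernel command line into {parameter: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent or set to a different value."""
    seen = parse_cmdline(cmdline)
    return sorted(key for key, value in wanted.items() if seen.get(key) != value)

# Hypothetical command line, for illustration only.
SAMPLE = "BOOT_IMAGE=/boot/vmlinuz-2.6.32-25-generic root=/dev/sda1 ro quiet splash hpet=force nohz=off"

print(missing_params(SAMPLE))
```

On a running Linux system the real command line can be read from /proc/cmdline and passed to `missing_params` instead of the sample string.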

When idle (no mouse movement or anything):
Wakeups-from-idle per second : 7,6 interval: 10,0s
no ACPI power usage estimate available

Top causes for wakeups:
  30,5% ( 4,0) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
  15,3% ( 2,0) [kernel core] clocksource_watchdog (clocksource_watchdog)
   7,6% ( 1,0) [uhci_hcd:usb3, nvidia] <interrupt>
   7,6% ( 1,0) gvfs-afc-volume
   7,6% ( 1,0) [kernel core] nv_kern_rc_timer (nv_kern_rc_timer)
   3,1% ( 0,4) nautilus
   2,3% ( 0,3) clock-applet
   2,3% ( 0,3) gnome-panel
   2,3% ( 0,3) gedit
   1,5% ( 0,2) rtkit-daemon
   1,5% ( 0,2) [kernel core] inc_rt_group (sched_rt_period_timer)
   1,5% ( 0,2) gnome-terminal
   1,5% ( 0,2) gnome-settings-

Here is when watching a HD movie:
Wakeups-from-idle per second : 150,8 interval: 10,0s
no ACPI power usage estimate available

Top causes for wakeups:
  42,3% (126,7) [ata_piix, ata_piix, HDA Intel] <interrupt>
  20,9% ( 62,5) [Rescheduling interrupts] <kernel IPI>
  16,4% ( 49,0) [uhci_hcd:usb3, nvidia] <interrupt>
   8,5% ( 25,6) gnome-mplayer
   8,1% ( 24,4) mplayer
   1,3% ( 4,0) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
   0,7% ( 2,0) [kernel core] clocksource_watchdog (clocksource_watchdog)
   0,3% ( 1,0) [kernel core] inc_rt_group (sched_rt_period_timer)
   0,3% ( 1,0) gvfs-afc-volume
   0,3% ( 1,0) [kernel core] nv_kern_rc_timer (nv_kern_rc_timer)
   0,1% ( 0,3) gnome-terminal
   0,1% ( 0,3) gedit
   0,1% ( 0,3) gnome-settings-
   0,1% ( 0,3) clock-applet
   0,1% ( 0,3) gnome-panel
   0,1% ( 0,2) rtkit-daemon

DH (dave-higherform) wrote :

I am seeing similar results as Elias Julkunen with 2.6.26-rc7 kernel from mainline kernels. My system is an Acer 3820T-5246 with Core i3-350M (no GPU). Kernel scheduler Load balancing ticks have dropped from about 120-230 on 2.6.35.x to a range of about 11 to 28. Even with the added wakeups from kworker/0:0, /0:1, and Rescheduler interrupts, total system wakeups are down by a factor of 2 to 3. ACPI power estimates are also down from ~16W to ~12W with a similarly loaded system.

Also seeing a side bonus in stability when switching multiple screen outputs, and input lag is still present but probably improved by 60% or so. So, all in all I recommend anyone worried about wakeups try 2.6.36 at your earliest convenience.

DH (dave-higherform) wrote :

Err that was supposed to be "with 2.6.36-rc7 kernel from mainline kernels"...

Also noticing that laptopmode is keeping the hard drive spun down for the ~5 min it's supposed to, whereas under 2.6.35 it was spinning back up multiple times each minute...

iiegn (g-launchpad-iiegn-de) wrote :

Results on a X200s with recent Maverick are in favor of the 2.6.35-23-generic stock kernel rather than Brian's 2.6.35-power+18-generic:

Peter Sasi (peter-sasi) wrote :

I have found that this bug is related to X: without that there is no problem!

I accidentally deleted the nvidia kernel module, so my laptop booted into text mode. Then I ran the powertop dump, and the [kernel scheduler] Load balancing tick is down to 1,6 from 64,2 with X, with the same official Ubuntu 10.10 2.6.35-22 kernel!
Logs attached, hope this finally helps to fix it!
Should you need more testing contact me.

I think that the importance of the bug should be high, since it makes a vast difference in the battery runtime for all laptop users!

Peter Sasi (peter-sasi) wrote :
Peter Sasi (peter-sasi) wrote :

I also did the test with mainline kernel 2.6.36 final. It is slightly better than 2.6.35-22 with X and much worse than that without, because in both cases it has more than 100 wakeups from kworker/0:0.

Logs attached.

Stephan Diestelhorst (syon) wrote :

X causing wakeups may coincide with compositing being enabled. On my Intel chipset graphics, this causes approximately 50 additional wakeups per second. If compositing is disabled, everything is back to normal. On another notebook with a Radeon 5470 (using fglrx), there was no such difference.

In addition, it also seems that X increases the load to rather large levels, despite being idle. With Ubuntu 10.10 (and kernels newer than 2.6.31) I have observed loads around 0.7, without anything showing up in top.

2010/11/4 Peter Sasi <email address hidden>:
>
> ** Attachment added: "powertop-dump-5min-2.6.36-020636-generic.txt"
>   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/524281/+attachment/1722665/+files/powertop-dump-5min-2.6.36-020636-generic.txt
>
> ** Also affects: xorg (Ubuntu)
>   Importance: Undecided
>       Status: New
>

Fabio Albieri (chareos) wrote :

On my Intel 945GM, with no compiz or effects, I tried to explicitly disable Composite in xorg.conf as per the previous comment.

It makes NO difference for me: hundreds of wakeups, and my netbook is now at 55 minutes of uptime on battery.
It feels like having an... umbilical cable.

Ubuntu 10.10 with latest updates, 2.6.35-22-generic

Pako (elektrobank01) wrote :

Here is the output from "sudo powertop -t60 -d" on Lenovo 3000 n200 with intel centrino 550 @ 2.00 GHz with nVidia Geforce Go 7300 and nVidia Proprietary driver enabled on Maverick UNE Unity fully updated.
No application was running in background!

Peter Sasi (peter-sasi) wrote :

Please note that, in my tests, X is not causing wakeups itself, but it induces "[kernel scheduler] Load balancing tick"s to occur.
I have compiz disabled.

Péter (pepe-ezkell) wrote :

Linux shadow 2.6.35-22-generic #35-Ubuntu SMP Sat Oct 16 20:45:36 UTC 2010 x86_64 GNU/Linux

kuba (jzalas) wrote :
ginkgo (davy-renaud) on 2010-11-16
Changed in linux (Ubuntu):
status: Triaged → Confirmed
Bryce Harrington (bryce) wrote :

[Looks to be a kernel bug rather than xorg. If there is actual work to be done on xorg in relation to this issue, please file a new bug report about it, since this one has gotten too long to grok.]

Changed in xorg (Ubuntu):
status: New → Invalid
Gurmeet (gurmeet1109) wrote :

Just a me-too report. I am also getting this at the top of the list in powertop.

[Kernel Scheduler] Load Balancing Tick averages around 43 wakeups/sec.

I don't want to disable compiz, so I haven't yet tried that workaround; for me, at least, it's not a solution.

Please let me know if any more details about the system are needed and I will gladly provide them.

# uname -a
Linux 2.6.35-23-server #40-Ubuntu SMP x86_64 GNU/Linux

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
stepping : 11
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dts tpr_shadow vnmi flexpriority
bogomips : 4788.90
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
stepping : 11
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dts tpr_shadow vnmi flexpriority
bogomips : 4791.02
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

# cat /proc/interrupts

          CPU0 CPU1
  0: 2278570 1 IO-APIC-edge timer
  1: 7805 186 IO-APIC-edge i8042
  8: 1 0 IO-APIC-edge rtc0
  9: 34270 0 IO-APIC-fasteoi acpi
 12: 2759224 38989 IO-APIC-edge i8042
 14: 65199 0 IO-APIC-edge ata_piix
 15: 0 0 IO-APIC-edge ata_piix
 16: 13302 0 IO-APIC-fasteoi uhci_hcd:usb3, nvidia
 18: 283 0 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb7
 19: 0 0 IO-APIC-fasteoi uhci_hcd:usb6
 20: 2 1 IO-APIC-fasteoi firewire_ohci
 21: 0 0 IO-APIC-fasteoi uhci_hcd:usb4, r852, mmc0
 23: 23 4 IO-APIC-fasteoi ehci_hcd:usb2, uhci_hcd:usb5
 44: 52 133 PCI-MSI-edge eth0
 45: 127186 0 PCI-MSI-edge ahci
 46: 273211 0 PCI-MSI-edge iwlagn
 47: 514 0 PCI-MSI-edge hda_intel
NMI: 0 0 Non-maskable interrupts
LOC: 184938 1121100 Local timer interrupts
SPU: 0 ...


Gurmeet (gurmeet1109) wrote :

Following up on one of the earlier mails containing a patch from Brian Rogers
Patch Details:
http://launchpadlibrarian.net/52089149/0001-Apply-patch-from-http-lkml.org-lkml-2010-7-8-122.patch

OK, in 2.6.35, linux-source, on line 328 in /kernel/time/tick-sched.c, this is what I see:
         if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
<328> arch_needs_cpu(cpu)) {
                 next_jiffies = last_jiffies + 1;
                 delta_jiffies = 1;

whereas in 2.6.36 linux-source, I see this:
         if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
<328> arch_needs_cpu(cpu)) {
                 next_jiffies = last_jiffies + 1;
                 delta_jiffies = 1;

OK, so this does not help: the code at that spot is identical in both versions. :-(
Still, sharing the results of my findings.

Cristian KLEIN (cristiklein) wrote :

Gurmeet, please try kernel 2.6.36.

I tried the mainline 2.6.36 kernel and it seems to work better, but I do get a lot of kworker wakeups.
I also cannot get the ATI driver to work (downloaded from ati.amd.com) when running this kernel, so it's pretty useless.

Finally got the 2.6.36 mainline kernel to work with both ATI and Nvidia drivers!
Wrote a little howto here: http://www.bjortvedtdata.net/?p=199

Gurmeet (gurmeet1109) wrote :

Just tried the v2.6.37-rc2-maverick from Ubuntu Mainline.

The load balancing ticks are down to 2 (in words, two) from 60 or so per second.
Ran it for a few minutes. The machine is less noisy and temperatures are down a bit (1-2 deg), but that may very well be within the margin of error.

I run VMWare Player extensively and could not find a kernel patch for VMWare Player for this version of the kernel, and hence switched back to the official 2.6.35-23. Once VMWare releases (officially or otherwise) a patch for this version and 2.6.37 enters a stable state officially in Ubuntu, I am going to upgrade to 2.6.37, but I am holding off until it is officially supported.

Gurmeet (gurmeet1109) wrote :

For the learned .....
Adding this as a starting point. No guarantees that this is the saviour; it is just a lead for whoever can make heads or tails of it.

# diff -cp tick-sched.c(2.6.37-rc4) tick-sched.c(2.6.35-23)

...

*** tick.sched-2.6.37-rc4.c 2010-12-06 17:50:03.960025002 +0530
--- tick-sched-2.6.35-23.c 2010-11-18 03:45:19.000000000 +0530
*************** void tick_nohz_stop_sched_tick(int inidl
*** 405,411 ****
     * the scheduler tick in nohz_restart_sched_tick.
     */
    if (!ts->tick_stopped) {
! select_nohz_load_balancer(1);

     ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
     ts->tick_stopped = 1;
--- 405,417 ----
     * the scheduler tick in nohz_restart_sched_tick.
     */
    if (!ts->tick_stopped) {
! if (select_nohz_load_balancer(1)) {
! /*
! * sched tick not stopped!
! */
! cpumask_clear_cpu(cpu, nohz_cpu_mask);
! goto out;
! }

     ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
     ts->tick_stopped = 1;
*************** void tick_setup_sched_timer(void)
*** 774,779 ****
--- 780,786 ----
  {
   struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
   ktime_t now = ktime_get();
+ u64 offset;

   /*
    * Emulate tick processing via per-CPU hrtimers:
*************** void tick_setup_sched_timer(void)
*** 783,788 ****
--- 790,799 ----

   /* Get the next period (per cpu) */
   hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
+ offset = ktime_to_ns(tick_period) >> 1;
+ do_div(offset, num_possible_cpus());
+ offset *= smp_processor_id();
+ hrtimer_add_expires_ns(&ts->sched_timer, offset);

   for (;;) {
    hrtimer_forward(&ts->sched_timer, now, tick_period);

Gurmeet (gurmeet1109) wrote :

Exploring the latest code on github. Comparing the two.
Again, no guarantees. This is just a lead, for the brave at heart ....

--- tick-sched-2.6.35-23.c 2010-12-06 22:44:02.821102001 +0530
+++ tick-sched-2.6.37-github.c 2010-12-06 22:42:40.451102001 +0530
@@ -405,13 +405,7 @@ void tick_nohz_stop_sched_tick(int inidl
    * the scheduler tick in nohz_restart_sched_tick.
    */
   if (!ts->tick_stopped) {
- if (select_nohz_load_balancer(1)) {
- /*
- * sched tick not stopped!
- */
- cpumask_clear_cpu(cpu, nohz_cpu_mask);
- goto out;
- }
+ select_nohz_load_balancer(1);

    ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
    ts->tick_stopped = 1;
@@ -780,7 +774,6 @@ void tick_setup_sched_timer(void)
 {
  struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
  ktime_t now = ktime_get();
- u64 offset;

  /*
   * Emulate tick processing via per-CPU hrtimers:
@@ -790,10 +783,6 @@ void tick_setup_sched_timer(void)

  /* Get the next period (per cpu) */
  hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
- offset = ktime_to_ns(tick_period) >> 1;
- do_div(offset, num_possible_cpus());
- offset *= smp_processor_id();
- hrtimer_add_expires_ns(&ts->sched_timer, offset);

  for (;;) {
   hrtimer_forward(&ts->sched_timer, now, tick_period);
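As a reading aid for the two diffs above: the four `offset` lines that appear in the 2.6.35-23 file (and not in the 2.6.37 one) spread each CPU's first tick across the first half of a tick period. The following is only a userspace sketch of that arithmetic. The function name is made up, plain division stands in for do_div(), and the zero-CPU guard is an addition for safety in userspace.

```c
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the offset arithmetic from the diff:
 *   offset = ktime_to_ns(tick_period) >> 1;
 *   do_div(offset, num_possible_cpus());
 *   offset *= smp_processor_id();
 */
uint64_t tick_stagger_offset_ns(uint64_t tick_period_ns,
                                unsigned int possible_cpus,
                                unsigned int cpu_id)
{
    uint64_t offset;

    if (possible_cpus == 0)          /* guard added for this sketch only */
        return 0;

    offset = tick_period_ns >> 1;    /* half of one tick period */
    offset /= possible_cpus;         /* one slice per possible CPU */
    offset *= cpu_id;                /* CPU n starts n slices later */
    return offset;
}
```

Assuming a 4,000,000 ns tick period (HZ=250, which is an assumption and not stated in this thread) and two possible CPUs, CPU 0 gets an offset of 0 and CPU 1 gets 1,000,000 ns.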

Gurmeet (gurmeet1109) wrote :

Just installed the 2.6.36 mainline kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.36-maverick/.

VMWare Player is not working, but that's another story.

[kernel scheduler] Load balancing tick = 2.3
[extra timer interrupt] = 11.7

Total interrupts under idle conditions = 66.7 (earlier it was ~180).

OK, now we know the story, or at least a part of it.
We will all be a happy breed of people if the fixes are backported to the latest stable release of Maverick in the Ubuntu repository. Then a 'sudo apt-get update && sudo apt-get upgrade' would get us the patches, and we would not have to fiddle with unsupported releases of the core of the OS.

I am keeping my fingers crossed. VMWare is not working as of now, PS/2 interrupts are still very high, and I might discover a thing or two later, but as of now, with 2.6.36, the issue seems to be resolved. I will feel a lot more satisfied once the fixed version lands through the official repo in an officially supported release (2.6.35 at the moment).

oldmankit (oldmankit) wrote :

@Gurmeet

It all sounds very positive. For those of us that don't want to get their fingers dirty playing around with different kernel versions, there will be a lot of satisfaction when this fix finds itself into the official repos!

The commit to backport would be 83cd4fe, which has many more changes than just to kernel/time/tick-sched.c. You can look at the complete diff at https://github.com/mirrors/linux-2.6/commit/83cd4fe

af5ab27 might also help somewhat, but I believe the other one is the major culprit.

Jeremy Davis (jedmeister) wrote :

That would be awesome Alex. I'm really looking forward to resolving this long standing issue with the LTS version of Ubuntu.

WebNuLL (babciastefa) wrote :

On an Intel Celeron 900 MHz I have the same problem: powertop reports "[kernel scheduler] Load balancing tick" at the top (15-30%).

I think fixing this bug will extend my Tablet PC's battery life.

Benjamin Schmid (benbuntu) wrote :

Same problem here on a Thinkpad Edge 11" AMD Neo II K325:
  Wakeups-from-idle per second : 355.6 interval: 15.0s
    49.5% (313.0) [Rescheduling interrupts] <kernel IPI>
    19.7% (124.3) [kernel scheduler] Load balancing tick

Would be great if this problem can be solved with Natty.

Benjamin Schmid:
As long as Natty uses 2.6.36 or later, this should not be a problem - it is a kernel issue, not an issue related to Ubuntu alone.

WebNull:
It will most definitely extend your battery life! Using the mainline kernel has helped a lot :)

Really an annoying bug. Why will it not be fixed in LTS? Really a showstopper on mobile systems.

florinn (florinnaidin) wrote :

powertop on Asus U35JC with Intel Core i3

Top causes for wakeups:
  47.5% (228.6) [kernel scheduler] Load balancing tick
  26.9% (129.6) firefox-bin
   3.5% ( 16.9) thunderbird-bin
   2.7% ( 13.0) USB device 2-1.3 : USB Receiver (Logitech)

Tried the linux-image-2.6.37-020637rc2-generic kernel and Load balancing tick dropped to 2-3%

Cristian KLEIN (cristiklein) wrote :

@florinn: Please close all applications (especially firefox and thunderbird) before posting such measurements. In your case, firefox is probably running some heavy animations or executing some scripts with many timeouts. It is not the kernel's fault that the user-space is generating useless wakeups.

On my laptop, with kernel 2.6.37, I easily get under 20 wakeups/second.

florinn:
that's my experience too, 2.6.37 makes my laptop silent again, and the load balancing ticks are much, much fewer.

I see that you run the RC2 version of 2.6.37, just thought I'd mention that the final version of 2.6.37 is out on http://kernel.ubuntu.com/~kernel-ppa/mainline/. The name says "Natty", but it runs fine on Maverick as well :)

You may want to try the 2.6.38RC2 Natty kernel as well - It seems very fast, but in my experience this kernel made my laptop run hotter and the fans run all the time - but I guess that's a separate issue ;)

I'm seeing this too on Ubuntu Maverick 32-bit (kernel 2.6.35-27-generic-pae) on an Intel Core Duo quad core 3 GHz processor. Around 50% "[kernel scheduler] Load balancing tick" pretty much continuously.

My apologies if this has already been explained somewhere, but exactly what is a "load balancing tick", and is it actually a problem to have a lot of them?

Benjamin Schmid (benbuntu) wrote :

@Captain Chaos: To keep it short: These ticks are imposed by the Linux kernel while it tries to shift and balance the workload over the available CPU cores. The problem here is that these "ticks" occur during idle phases and therefore prevent the CPUs from entering their power-saving states. That is a little annoying for mobile users, as it reduces battery life and increases heat and fan activity. That's all.

This is a Linux kernel (not an Ubuntu) issue, so we just have to wait until Linus integrates the fixes into the mainline kernel to get rid of this annoyance. The Ubuntu & Kernel guys already took great actions to trigger a solution. Many thanks for that!

Peter Sasi (peter-sasi) wrote :

@Benjamin Schmid: I think the fix is done in the Linux tree (practically all versions later than 2.6.35 behave a lot better), it just has not been ported back to the ubuntu 2.6.35 tree...
And it is very annoying...

This problem appeared in 2.6.32 Kernel.
Beginning with 2.6.37 Kernel this is solved.

I observe increased battery life (+15-20%) on my old laptop.
sudo powertop agrees with this impression.

Ubuntu users can use this kernel
http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=M;O=D
but they are going to lose some Ubuntu-specific customizations (mainly ureadahead).

For Lucid there is a much better solution that I currently use. No side effects so far...
https://launchpad.net/~kernel-ppa/+archive/ppa?field.series_filter=lucid

Chad A. Davis (chadadavis) wrote :

This is fixed in the current Natty (beta) with kernel 2.6.38-5-generic (the stock kernel).

On Maverick I was getting several hundred wakeups per second on the stock kernel (2.6.35-something), almost all from the load balancing tick.

Now the load balancing tick is rarely listed in the 'top causes for wakeups' from powertop.

This will improve your battery life, but depending on your system, you may have other things to watch out for (e.g. disable Flash).

ggonlp (oktobermann) wrote :

I remember trying that out a few weeks ago (think it was a 2.6.36 kernel). While the load balancer wakeups were gone, I got just as many from a kworker process, so no real improvement...

Peter Sasi (peter-sasi) wrote :

ggonlp: it might be the case, that 2.6.36 fixes balancing wakeups, but introduces worker wakeups. 2.6.37 and 2.6.38 should be okay on the other hand. Have you tried those?

ggonlp (oktobermann) wrote :

Sorry, not yet - thanks for the hint though. I'm on an HP laptop and new
kernels mean for me every time an odyssey as graphics and wifi will take
a good day's work to get running :-(


gcc (chris+ubuntu-qwirx) wrote :

@Jeremy Foshee, this seems to be a regression between Lucid alpha 2 and alpha 3, and it's affecting LTS users with MUCH lower battery life than expected, hence it's a hardware problem.

Please could you reconsider your rejection for Lucid?

I tried Natty, this bug is finally solved with the 2.6.38 kernel

Peter Sasi (peter-sasi) wrote :

This bug I suppose is the root cause of the second power consumption regression found here:
http://www.phoronix.com/scan.php?page=article&item=linux_kernel_regress2

Basically power usage looks like:
- 2.6.24-2.6.34: average of all tests ~21-22W
- 2.6.35-2.6.37: average of all tests ~25W
- 2.6.38-2.6.39: average of all tests ~26W

This translates into laptop operation times under test workload:
 mAh     V     W       mA     h
6600  10,8    21  1944,44  3,39
6600  10,8    22  2037,04  3,24
6600  10,8    25  2314,81  2,85
6600  10,8    26  2407,41  2,74
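
For anyone who wants to redo the arithmetic behind the table above with their own battery figures, here is a minimal sketch. It assumes, as the table does, a constant average power draw and the nominal pack voltage; the function names are made up for illustration.

```c
/* Current drawn in mA at a given average power (W) and nominal voltage (V). */
double current_ma(double watts, double volts)
{
    return watts / volts * 1000.0;
}

/*
 * Estimated runtime in hours for a pack of capacity_mah at that draw.
 * Returns 0 for non-positive power or voltage (guard added for this sketch).
 */
double runtime_hours(double capacity_mah, double volts, double watts)
{
    if (watts <= 0.0 || volts <= 0.0)
        return 0.0;
    return capacity_mah / current_ma(watts, volts);
}
```

With the table's 6600 mAh and 10,8 V, runtime_hours(6600.0, 10.8, 21.0) gives about 3,39 h and runtime_hours(6600.0, 10.8, 26.0) about 2,74 h, matching the first and last rows.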

Richard Kleeman (kleeman) wrote :

The last two comments on this bug are contradictory. Checking powertop with the 2.6.38 kernel for natty versus earlier kernels shows for me that the *type* of kernel interrupt has changed but the number of them has not and has increased if anything. I think this bug is serious and getting worse and requires the attention of a kernel developer. It reflects badly on linux in general if power consumption is increasing markedly while performance is not markedly increasing at the same time (which appears the case on the phoronix benchmarks).

To solve this requires someone familiar with the kernel to report it on the kernel bugzilla. There has been some discussion of the issue on lkml, but it looks very inconclusive and low priority to me. I don't know why exactly.

Kai Pastor (dg0yt) wrote :

Only a few weeks ago I started looking for the root cause of my laptop's noise and heat. I used Ubuntu Lucid 10.04 on a dual core laptop and eventually found this bug.

I tried newer kernels from https://launchpad.net/~kernel-ppa/+archive/ppa?field.series_filter=lucid
Unfortunately, experience and reports (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/760131) show that 2.6.38 kernels introduce new problems.

I now switched to the latest 2.6.37 from that PPA: This configuration appears to be the most silent in terms of powertop and fan noise in my experience.

I'm really disappointed that the original issue has not yet been solved for LTS release users, more than one year after reporting. Doesn't it affect commercial clients (http://www.h-online.com/open/news/item/LVM-insurance-company-switches-10-000-systems-to-Ubuntu-10-04-LTS-1233194.html), too?

oldmankit (oldmankit) wrote :

My two cents is that I was waiting for Ubuntu Natty for a newer kernel and therefore better performance, but I appear to have more wakeups and less battery life than before.

ggonlp (oktobermann) wrote :

I'm with oldmankit on this - the situation is getting worse rather than better. I wrote a script mostly based on lesswatts.org suggestions which, for my HP Probook 4720s halves wakeups (still at 50/sec) and reduces fan usage. Use at your own discretion :-)

oldmankit (oldmankit) wrote :

That script looks useful, thanks for posting it.

I've completely turned around on the subject. I ran Ubuntu Classic (no effects) and ran powertop with no apps running. It was significantly better than in Ubuntu 10.10. It was actually really good. The red bar in powertop that shows wakeups-from-idle per second, well, I found out it is not always red! It was orange for the first time.

I'll just avoid using Unity, which gives my processor a hard time anyway.

skhawam (s-khawam) wrote :

I have a Dell Mini 10v and managed to get good results with Natty (11.04) after some fiddling around. It's better than any other version I have had until now (I have been following this bug since its beginning!). Without wireless I get 20ms C3 state, and with wireless on, around 3ms C3 state (with no applications running, and not touching the keyboard or touchpad). The machine doesn't get hot (the Mini 10v is fanless, so high temperature is easy to detect).

- I have disabled Unity, although I don't think Unity itself is the cause of wakeups
- Compiz generates a lot of wakeups on the GPU, so I removed all the features that I don't use and kept only the ones I use
- Installed the latest intel GPU driver from ppa (the Mini 10v has the i915)
- Disabled the SD-Card reader (using 'rmmod usb_storage') which was generating extra wakeups (I re-enable it when I want to insert a card).

Even though the situation is better now, I'm sure there are many other things that can be improved in the kernel to reduce the wakeups further, say to get 30ms C3 when the wireless is on, or to reduce the wakeups when the touchpad is used.

Phillip Susi (psusi) wrote :

On 5/17/2011 10:59 PM, skhawam wrote:
> -Compiz generates a lot of wakeups on the GPU, so I removed all the features that I dont use and kept only the ones I use

I once found a DRI configuration utility that could disable vsync and
found that got rid of a lot of wakeups from the GPU.

Peter Sasi (peter-sasi) wrote :

I have updated my tests for both the latest 2.6.38 kernel of natty and the latest 2.6.34 kernel from mainline as suggested, on my Thinkpad T61 laptop.
After logging in and having authenticated with sudo beforehand, I ran:
sudo powertop -d -t 60 > ~/Desktop/powertop_dump-`uname -r`.log
This was running on battery, with nothing started after logon except one terminal window.
Results for both kernels are attached: there is still a visible difference! 15,5W versus 17,9W, meaning 3,4 hours versus 2,9 hours = half an hour of time on battery!

I think both power regressions found by Phoronix at 2.6.35 and 2.6.38 are there.
See http://www.phoronix.com/scan.php?page=article&item=linux_kernel_regress2

2.6.34:
PowerTOP 1.13 (C) 2007 - 2010 Intel Corporation

Collecting data for 60 seconds

Cn Avg residency
C0 (cpu running) ( 1,9%)
C0 0,0ms ( 0,0%)
C1 mwait 0,0ms ( 0,0%)
C2 mwait 0,5ms ( 0,4%)
C6 mwait 6,2ms (97,7%)
P-states (frequencies)
Turbo Mode 0,9%
  2,50 Ghz 0,0%
  1,60 Ghz 0,0%
  1200 Mhz 0,0%
   800 Mhz 99,1%
Wakeups-from-idle per second : 164,1 interval: 60,0s
Power usage (ACPI estimate): 15,5W (3,4 hours)
Top causes for wakeups:
  29,5% ( 61,0) [uhci_hcd:usb5, yenta, nvidia] <interrupt>
  24,2% ( 50,0) [kernel core] hdaps_mousedev_poll (hdaps_mousedev_poll)
  19,3% ( 39,9) [kernel core] hrtimer_start (tick_sched_timer)
  13,4% ( 27,8) [kernel scheduler] Load balancing tick
   4,8% ( 9,9) gwibber-service
   2,4% ( 5,0) [ata_piix] <interrupt>
   1,7% ( 3,6) compiz
   1,1% ( 2,2) nautilus
   0,8% ( 1,7) gnome-terminal

2.6.38:
PowerTOP 1.13 (C) 2007 - 2010 Intel Corporation

Collecting data for 60 seconds

Cn Avg residency
C0 (cpu running) ( 2,2%)
polling 0,1ms ( 0,0%)
C1 mwait 0,0ms ( 0,0%)
C2 mwait 0,6ms ( 0,7%)
C6 mwait 4,5ms (97,1%)
P-states (frequencies)
Turbo Mode 0,9%
  2,50 Ghz 0,0%
  2,00 Ghz 0,0%
  1,60 Ghz 0,1%
   800 Mhz 99,0%
Disk accesses:
The application 'gvfsd-metadata' is writing to file 'home-3c698ca6.log' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home-3c698ca6.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home.SF5MWV' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home.SF5MWV' on /dev/sda5
Wakeups-from-idle per second : 227,9 interval: 60,0s
Power usage (ACPI estimate): 17,9W (2,9 hours)
Top causes for wakeups:
  24,9% ( 61,0) [uhci_hcd:usb5, yenta, nvidia] <interrupt>
  20,4% ( 50,0) [kernel core] hdaps_mousedev_poll (hdaps_mousedev_poll)
  15,1% ( 36,9) [extra timer interrupt]
  14,8% ( 36,4) [kernel core] hrtimer_start (tick_sched_timer)
   7,4% ( 18,2) compiz
   6,2% ( 15,2) [kernel scheduler] Load balancing tick
   4,1% ( 10,0) gwibber-service
   2,2% ( 5,5) kworker/0:0
   1,6% ( 4,0) [ata_piix] <interrupt>
   0,0% ( 0,0)D gvfsd-metadata
   0,0% ( 0,0)D rs:main Q:Reg
   0,9% ( 2,3) nautilus
   0,7% ( 1,7) gnome-term...


Richard Kleeman (kleeman) wrote :

Yes, I have also seen a similar degradation between the two kernels. What is notable in your case and mine is that the kernel interrupts are labelled differently between the two cases. There are actually fewer load balancing ticks in the later kernel, but other categories, e.g. [extra timer interrupt], are chewing up power. So something appears to have changed, but not in a good way, and focussing solely on [kernel scheduler] Load balancing tick is not appropriate.
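
One way to compare dumps by their overall figure rather than by a single category is to pull the "Wakeups-from-idle per second" total out of each powertop dump. The following is a rough sketch, assuming the PowerTOP 1.13 summary line format seen in the dumps in this thread (with either a decimal comma or a decimal point). The function name is hypothetical and not part of powertop.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the total from a summary line such as
 *   "Wakeups-from-idle per second : 164,1 interval: 60,0s"
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
int parse_total_wakeups(const char *line, double *out)
{
    static const char label[] = "Wakeups-from-idle per second";
    const char *p = strstr(line, label);
    char buf[32];
    size_t n = 0;

    if (p == NULL)
        return -1;
    p = strchr(p + sizeof(label) - 1, ':');   /* colon after the label */
    if (p == NULL)
        return -1;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    /* Copy the number, turning a decimal comma into a point for strtod(). */
    while (n < sizeof(buf) - 1 &&
           ((*p >= '0' && *p <= '9') || *p == ',' || *p == '.')) {
        buf[n++] = (*p == ',') ? '.' : *p;
        p++;
    }
    if (n == 0)
        return -1;
    buf[n] = '\0';
    *out = strtod(buf, NULL);   /* default "C" locale expects a point */
    return 0;
}
```

Fed the two summary lines from the dumps above, this yields 164.1 for 2.6.34 and 227.9 for 2.6.38, which is the comparison that matters when wakeups merely move from one category to another.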

kecsap (csaba-kertesz) wrote :

I think I had the same problem on a Dell Latitude E4300 (Core 2/P9300). The interesting thing was that the overload happened mostly while the laptop was being charged; when I removed the AC adapter to run on battery, the laptop was responsive again after a few minutes. In my case, when this bug happened, the whole laptop stopped running smoothly: it was very slow, with high CPU load. There is another problem: the laptop shits on the cpu scaling policy and decides on the current speed without any sane reason. If this bug happened and the cpu scaling dropped the frequency to 800 MHz, the laptop was quite useless while it was being charged.

I think I first experienced the bug under Maverick, and the upgrade to Natty did not help. Now I have upgraded to the latest unofficial natty kernel: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.39-rc4-natty/ and the intel graphics driver from this PPA: https://launchpad.net/~glasen/+archive/intel-driver and after one day of use, the problems seem to be solved except for the cpu scaling craziness, but I do not have much hope that Linux will ever work on this laptop without problems.

kecsap (csaba-kertesz) wrote :

A bit off-topic to my previous comment: The PPA version of the intel driver did not make any difference; however, bluetooth causes the system to hang before or after suspend if it is enabled. I did not mind too much and just disabled bluetooth in the BIOS.

Now, the system seems to be stable.

kecsap (csaba-kertesz) wrote :

Yuppie, now it "seems" again that I have found the good combination:

- Nothing fixes the cpu frequency flickering. The laptop just shits on the cpufreq settings and decides on its own, what is the best frequency for me. Come on... Sucks.
- This "Load balancing tick" bullshit stopped when I downgraded my BIOS back to the original version that the laptop had when it was shipped. I had recently upgraded to the newest version, and that was my last guess at the root cause of these problems. Laptop: Dell Latitude E4300, with BIOS version A06 again.
- But I got an old bug back with the v2.6.39-rc4-natty kernel: the backlight of the LCD did not come back after resume. Nice. I switched back to the original natty kernel and this problem seems to be solved as well.

So the "workaround recipe" for Dell Latitude E4300 owners with this bug:

1. Downgrade the BIOS to an earlier version. A06 or A07 is a good candidate. (I googled a manual to make a pendrive with bootable DOS and I downloaded the A06 BIOS "upgrade" file from the Dell site to the pendrive.)
2. If this bug still happens -> upgrade to Natty.

Other notes:
(3a. In any case, disable bluetooth in the BIOS. It just causes problems with suspend/resume, and for me, an attempt to send a file via bluetooth from my phone froze the laptop with a kernel oops.)
(3b. I do not think that it makes anything better, but I have installed the latest Intel graphics drivers from the PPA mentioned in my previous comments.)

Uff.

The attachment "Patch from mailing list" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-sponsors please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

Brad Figg (brad-figg) wrote :

The desired commit has been applied and released in Lucid (and all other stable kernels). Please update to the latest SRU kernel.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu):
assignee: nobody → Matúš Behun (matus-behun)
Phillip Susi (psusi) on 2012-11-18
Changed in linux (Ubuntu):
assignee: Matúš Behun (matus-behun) → nobody
Changed in linux-2.6 (Debian):
status: Incomplete → Fix Released