Tens of wakes per second in "[kernel scheduler] Load balancing tick" on Core 2 Duo even with only 1 core enabled

Reported by Flávio Etrusco on 2010-02-19
This bug affects 181 people
Affects: Arch Linux. Status: New. Importance: Undecided. Assigned to: Unassigned.
Affects: linux-2.6 (Debian). Status: Fix Released. Importance: Unknown.
Affects: linux (Ubuntu). Importance: Low. Assigned to: Unassigned. Declined for Lucid and for Maverick by Jeremy Foshee.
Affects: xorg (Ubuntu). Importance: Undecided. Assigned to: Unassigned. Declined for Lucid and for Maverick by Jeremy Foshee.

Bug Description

powertop reports many wakeups per second (the number depends on the system) for the "[kernel scheduler] Load balancing tick" task, rising even under light load, on many kinds of (possibly only multi-core) systems. The original report was on a Core 2 Duo processor (T6500) with a single core enabled (multicore disabled in BIOS).

Cause of the problem:
Kernel 2.6.32 brought a scheduler patch that introduced this problem (the patch was backported to some other versions as well). Even though this problem was first seen in Lucid, it is NOT specific to Lucid or Ubuntu at all (Debian bug report at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=521944; reproducible in Arch Linux as well). Work is ongoing to fix this in the kernel, but it will take a long time until the fix reaches Ubuntu (see http://lkml.org/lkml/2010/7/6/172).

Workarounds that DO NOT work (may improve situation but not solve it):
- maxcpus=1
- noapic
- nosmp
- nolapic
- use mainline kernel

Workarounds that DO (probably) work:
- tip version of kernel (git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git, from http://lkml.org/lkml/2010/7/8/75)
- use maverick's kernel with applied patches (https://launchpad.net/~brian-rogers/+archive/power, from comment #80)

ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: etrusco 1606 F.... pulseaudio
                      etrusco 15151 F.... foobar2000.exe
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfddf8000 irq 22'
   Mixer name : 'Realtek ALC269'
   Components : 'HDA:10ec0269,1b0a4009,00100004 HDA:11c11040,1b0a4007,00100200'
   Controls : 19
   Simple ctrls : 11
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xfebec000 irq 17'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100100'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Fri Feb 19 05:25:42 2010
DistroRelease: Ubuntu 10.04
EcryptfsInUse: Yes
MachineType: Philco PHN10XXX.
Package: linux-image-2.6.32-13-generic 2.6.32-13.18
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-13-generic root=UUID=d482e94f-9370-4ad2-9536-986541003db5 ro acpi.power_nocheck=1 acpi_osi=linux radeon.blacklist=yes
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-13.18-generic
Regression: No
RelatedPackageVersions: linux-firmware 1.29
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
TestedUpstream: No
Uname: Linux 2.6.32-13-generic i686
dmi.bios.date: 06/01/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1.01
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.vendor: PEGATRON CORP.
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 10
dmi.chassis.vendor: PEGATRON CORP.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1.01:bd06/01/2009:svnPhilco:pnPHN10XXX.:pvr1.01:rvnPEGATRONCORP.:rn:rvrTobefilledbyO.E.M.:cvnPEGATRONCORP.:ct10:cvrToBeFilledByO.E.M.:
dmi.product.name: PHN10XXX.
dmi.product.version: 1.01
dmi.sys.vendor: Philco

Flávio Etrusco (etrusco) wrote :
description: updated
Jeffrey Baker (jwbaker) wrote :

Confirmed on a ThinkPad X61. This is new in Lucid Alpha 3, wasn't there in Alpha 2.

Changed in linux (Ubuntu):
status: New → Confirmed
Jeffrey Baker (jwbaker) wrote :

I should also mention that I don't have a core disabled in the BIOS, I am using both cores. It shouldn't matter.

Jeffrey Baker (jwbaker) wrote :

# powertop -d
PowerTOP 1.12 (C) 2007, 2008 Intel Corporation

Collecting data for 15 seconds

Your CPU supports the following C-states : C1 C2 C3 C4
Your BIOS reports the following C-states : C1 C2 C3
Cn Avg residency
C0 (cpu running) (31.1%)
C0 0.0ms ( 0.0%)
C1 mwait 0.0ms ( 0.0%)
C2 mwait 0.1ms ( 0.0%)
C3 mwait 4.8ms (68.9%)
P-states (frequencies)
Turbo Mode 15.7%
  2.00 Ghz 0.1%
  1.60 Ghz 0.1%
  1200 Mhz 0.2%
   800 Mhz 84.0%
Disk accesses:
The application 'firefox-bin' is writing to file 'sessionstore-1.js' on /dev/sda1
The application 'firefox-bin' is writing to file 'sessionstore-1.js' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
The application 'firefox-bin' is writing to file '_CACHE_001_' on /dev/sda1
Wakeups-from-idle per second : 145.1 interval: 15.0s
no ACPI power usage estimate available
Top causes for wakeups:
  71.1% (296.3) [kernel scheduler] Load balancing tick
   7.7% ( 32.2) [Rescheduling interrupts] <kernel IPI>
   3.7% ( 15.3)D firefox-bin
   3.9% ( 16.3) [acpi] <interrupt>
   3.3% ( 13.9) [iwlagn] <interrupt>
   2.5% ( 10.5) [i915@pci:0000:00:02.0] <interrupt>
   2.3% ( 9.5) PS/2 keyboard/mouse/touchpad interrupt
   1.0% ( 4.3) Xorg
   1.0% ( 4.0) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
   0.9% ( 3.6) [ahci] <interrupt>
   0.6% ( 2.6) [kernel core] hrtimer_start (tick_sched_timer)
   0.3% ( 1.2) gnome-terminal
   0.2% ( 1.0) gvfs-afc-volume
   0.2% ( 0.7) top
   0.2% ( 0.7) compiz
   0.1% ( 0.5) python
   0.1% ( 0.4) update-notifier
   0.1% ( 0.3) [eth1] <interrupt>
   0.1% ( 0.3) events/0
   0.1% ( 0.3) [kernel core] inc_rt_group (sched_rt_period_timer)
   0.1% ( 0.3) clock-applet
   0.0% ( 0.2) gnome-settings-
   0.0% ( 0.2) indicator-apple
   0.0% ( 0.2) gnome-panel
   0.0% ( 0.2) gnome-power-man
   0.0% ( 0.2) bdi-default
   0.0% ( 0.2) flush-8:0
   0.0% ( 0.2) rtkit-daemon
   0.0% ( 0.1) [kernel core] sk_reset_timer (tcp_delack_timer)
   0.0% ( 0.1) [kernel core] arm_supers_timer (sync_supers_timer_fn)
   0.0% ( 0.1) NetworkManager
   0.0% ( 0.1) rmmod
   0.0% ( 0.1) sshd
   0.0% ( 0.1) [kernel core] neigh_add_timer (neigh_timer_handler)
   0.0% ( 0.1) khungtaskd
   0.0% ( 0.1) [kernel core] add_timer (addrconf_verify)
   0.0% ( 0.1) events/1
   0.0% ( 0.1) ssh-agent
   0.0% ( 0.1) gnome-volume-ma
   0.0% ( 0.1) [kernel core] add_timer (sta_info_cleanup)
   0.0% ( 0.1) kerneloops
   0.0% ( 0.1) [kernel core] fib6_run_gc (fib6_gc_timer_cb)
   0.0% ( 0.1) rsyslogd

A USB device is active 100.0% of the time:
USB device 3-1 : BCM2045B (Broadcom Corp)

Suggestion: Enable USB autosuspend for non-input devices by pressing the U key

Suggestion: increase the VM dirty writeback time from 5.00 to 15 seconds with:
  echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
This wakes the disk up less frequently for background VM activity

Suggestion: Enable SATA ALPM link power management via:
  echo min_power > /sys/class/scsi_...


Jeffrey Baker (jwbaker) wrote :

Notably, events/0 and events/1 are getting tons of CPU time:

# ps -fe | grep events
root 9 2 5 20:58 ? 00:04:46 [events/0]
root 1016 1 0 20:58 ? 00:00:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root 3473 2 49 22:19 ? 00:05:16 [events/1]
root 4370 4289 0 22:29 pts/1 00:00:00 grep events

Rephrasing the summary.
Indeed the problem is much worse with both cores enabled; I reported it this way only because I was expecting no load-balancing wakeups at all with only 1 core.
nosmp, noapic and nolapic made no difference. Actually, with all of these enabled the system was bogged down with no apparent explanation.

summary: Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
- Core 2 Duo with 1 core disabled in BIOS
+ Core 2 Duo even if only 1 core enabled (1 disabled in BIOS)
davidb (davidb-csh) wrote :

I'd just like to report that I am having the same issue. I also have a Thinkpad x61, Core2. I just upgraded to Lucid a few hours ago. I tried disabling a core and saw the same problem. As Flávio wrote, I would have expected that to significantly reduce the number of wakeups. If I can be of any help in debugging this let me know.

Ryan Kavanagh (ryanakca) wrote :

Linked to Debian bug 521944 based on comment 84 . I can confirm this happens under Ubuntu Lucid with an Intel Atom N280, so I don't think this is restricted to Core 2 Duo.

Changed in linux-2.6 (Debian):
status: Unknown → Incomplete
Leif Walsh (leif.walsh) wrote :

What is incomplete about this bug? I am getting 500 wakeups consistently from this load balancing tick, on an x200s, with latest Lucid.

Flávio Etrusco (etrusco) wrote :

Funny how the comments in the Debian tracker say "worksforme" and suggest that powertop is outdated, without any data.
Latest powertop here lists this:

Top causes for wakeups:
  44,1% (199,2) <kernel core> : hrtimer_start_range_ns (tick_sched_timer)
  26,5% (119,6) firefox-bin : hrtimer_start_range_ns (hrtimer_wakeup)
  12,0% ( 54,2) <interrupt> : extra timer interrupt
   7,8% ( 35,2) <interrupt> : ath, HDA Intel
   1,8% ( 8,0) <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)
   1,3% ( 6,0) <interrupt> : ata_piix, ata_piix, uhci_hcd:usb5, uhci_hcd:
usb7 Segmentation fault (core dumped)

And yes, it coredumps.

Flávio Etrusco (etrusco) wrote :

Leif: what was marked incomplete is the Debian bug entry.

I observe this problem on a single-core Intel Atom Z520 CPU with HyperThreading enabled using Ubuntu 10.04 Lucid beta 2. Based on that and on comment #8 here, I edited the bug summary to remove the CPU-specific part.

summary: - Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
- Core 2 Duo even if only 1 core enabled (1 disabled in BIOS)
+ Tens of wakes per second in "[kernel scheduler] Load balancing tick"
Flávio Etrusco (etrusco) wrote :

The problem is that I reported this bug because I don't expect "load-balancing" wake-ups on a single-CPU setup. I don't know what the expected number of wake-ups is with multiple CPUs or HyperThreading. If you can reproduce the problem when running without HyperThreading, then maybe this is the same bug or a related one.

Flávio Etrusco (etrusco) wrote :

It would be nice to know if this (apparent) bug also occurs on non-Intel CPUs...

Flávio Etrusco (etrusco) wrote :

A similar test case on an Athlon64 CPU shows far fewer wake-ups:

Top causes for wakeups:
  38.3% (170.0)D firefox-bin
  22.4% ( 99.4) pulseaudio
  13.8% ( 61.2) [nvidia] <interrupt>
  10.8% ( 48.0) PS/2 keyboard/mouse/touchpad interrupt
   7.8% ( 34.5) [kernel scheduler] Load balancing tick
   2.5% ( 11.3) [pata_via] <interrupt>

summary: - Tens of wakes per second in "[kernel scheduler] Load balancing tick"
+ Tens of wakes per second in "[kernel scheduler] Load balancing tick" on
+ Core 2 Duo even with only 1 core enabled
tags: added: upstream
removed: needs-upstream-testing
lukefeil (lukefeil88) wrote :

Similar on an ASUS 1005PE with an Intel N450

Top causes for wakeups:
  42.9% (263.6) [kernel scheduler] Load balancing tick

Andrew Henry (adhenry) wrote :

I had a HP laptop with Intel Core 2 Duo 7200 and this was never an issue. With Ubuntu 10.04 on a Thinkpad Edge 13 with Intel Core2Duo CULV CPU this is an issue.

Leif Walsh (leif.walsh) wrote :

Is there any way to just turn off load balancing? I'd be eager to sacrifice a little performance for a large (almost 100%) gain in battery life.

permalloy (permalloy) wrote :

Is bug 552020 possibly a duplicate of this one?

mihai007 (mihai-ile) wrote :

dell xps m1330, intel t8300 on ubuntu final 10.04 gives:

Top causes for wakeups:
  50.6% ( 63.5) [kernel scheduler] Load balancing tick
  12.1% ( 15.1) [ata_piix] <interrupt>
  10.7% ( 13.4) [iwlagn] <interrupt>
   7.1% ( 8.9) [extra timer interrupt]
   4.0% ( 5.0) syndaemon
   1.6% ( 2.0) [nvidia] <interrupt>

This problem is huge, 50% of cpu wakeups!?

Andrew Henry (adhenry) wrote :

In post #17 I said I had a HP that I never had an issue with. I didn't...with Ubuntu 9.10. I did a fresh install of 10.04 and have exactly the same issue. Different CPU, different wireless adapter etc. Obviously, this is a kernel issue and not hardware specific.

verwa (laurent-arsonore) wrote :

Similar on an ASUS V1SN with an Intel Core 2 Duo T7700 (Ubuntu 10.04)

Top causes for wakeups:
  47.9% ( 63.5) [kernel scheduler] Load balancing tick
  12.2% ( 15.1) [ata_piix] <interrupt>
  9.7% ( 13.4) [iwlagn] <interrupt>
   8.1% ( 8.9) [extra timer interrupt]

---
+ acpi errors

UBUNTU kernel: [ 0.186131] ACPI Error: ACPI path has too many parent prefixes (^) - reached beyond root node (20090903/nsaccess-429)
..
UBUNTU kernel: [ 0.222562] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

lores (lores000) wrote :

Same problem here: HP 6720s Intel(R) Pentium(R) Dual CPU T2370, Ubuntu 10.04 LTS

On 08/05/10 11:16, Pawel wrote:
> It seems the Phoronix does confirm this issue:
>
> http://www.phoronix.com/scan.php?page=article&item=linux_windows_part2&num=1
>

That's general power consumption, isn't it? They compare graphics card
drivers and power consumption. I do not see any mention of this
particular issue with kernel tick wakeups.

--Andrew

Michael Christensen (conphara) wrote :

I can confirm this bug on a Core 2 Duo (E6500) and a Pentium D (standard D, not Extreme). Tested with Powertop 1.12. Seems to wake up more under load and wake up less when idle.
Both with Lucid installed (clean installs), fully updated.

stephen (hlxshady) wrote :

Same issue for my Lenovo T400 with Core 2 Duo (P8400),
a brief description of my case is posted here
http://ubuntuforums.org/showthread.php?t=1390055#7

There was a daily update today ... my kernel was just updated to 2.6.32-22, but its appetite on power doesn't change at all ... :(

Michael Christensen (conphara) wrote :

I have tested the boot options "idle=halt" and "processor.max_cstate=1" (or 2), adding them one at a time rather than all together. None of the options fixes the wakeups on a Core 2 Duo (E6500) or on a Pentium D (both desktop PCs).

This is what it looks like in Powertop when scrolling a page in Firefox:
Top causes for wakeups:
42,3% (115,9) [kernel scheduler] Load balancing tick

This is what it looks like when idle:
8,2% (7,0) [kernel scheduler] Load balancing tick
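
To compare figures like these across the reports in this thread, the "Top causes for wakeups" lines can be parsed mechanically. The following is a minimal sketch in Python, assuming the PowerTOP 1.x line layout seen in this thread (percentage, wakeups per second in parentheses, an optional "D" flag whose meaning is not explained here, then the cause) and accepting either "." or "," as the decimal separator. The function name and dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Layout assumed from the PowerTOP 1.x dumps quoted in this thread, e.g.
#   "  71.1% (296.3) [kernel scheduler] Load balancing tick"
#   "   3.7% ( 15.3)D firefox-bin"
#   "42,3% (115,9) [kernel scheduler] Load balancing tick"
LINE_RE = re.compile(
    r"^\s*(\d+[.,]\d+)%\s*\(\s*(\d+[.,]\d+)\)(D?)\s*(.+?)\s*$"
)


def parse_wakeup_line(line):
    """Return a dict for one 'Top causes for wakeups' line, or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    share, rate, flag, cause = match.groups()
    return {
        "share_pct": float(share.replace(",", ".")),   # share of all wakeups
        "per_second": float(rate.replace(",", ".")),   # wakeups per second
        "flag": flag,                                  # "D" or empty string
        "cause": cause,
    }


if __name__ == "__main__":
    for sample in (
        "42,3% (115,9) [kernel scheduler] Load balancing tick",
        "   3.7% ( 15.3)D firefox-bin",
    ):
        print(parse_wakeup_line(sample))
```

Lines that do not match the layout (headers, suggestions, blank lines) simply return None, so a whole dump can be fed through the function line by line.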

Andrew Henry (adhenry) wrote :

On 08/05/10 20:42, Michael Christensen wrote:
> This is what is looks like in Powertop when scrolling a page in Firefox:
> Top causes for wakeups:
> 42,3% (115,9) [kernel scheduler] Load balancing tick
>
> This is what it looks like when idle:
> 8,2% (7,0) [kernel scheduler] Load balancing tick
>

I'm getting a minimum of 25% on Load balancing tick even when idle. In fact, it
decreases when I'm actively using the CPU!

Flávio Etrusco (etrusco) wrote :

We're going nowhere with this bug; we didn't even get word on whether this is expected behaviour, a powertop bug (the discussion in Debian doesn't hold up) or whatever.
Since this happens in mainstream kernels, I guess somebody will have to get the balls to post to LKML or the kernel bugzilla 8-)

LKML knows, there's even a patch somewhere:
http://lkml.org/lkml/2010/5/9/20

Luca Aluffi (aluffilu) wrote :

Maybe the problem is wider as it extends to intel atom too: here is my N270 from ASUS 1201NL:

Wakeup-da-idle al secondo: 130,6 intervallo: 5,0s
Utilizzo energetico (stima ACPI): 9,4W (2,9 ore)

Cause principali di wakeup:
  22,0% ( 38,0) [kernel scheduler] Load balancing tick
  13,7% ( 23,6) [ath9k] <interrupt>
  13,0% ( 22,4) firefox-bin
  12,2% ( 21,0) [extra timer interrupt]

Flávio Etrusco (etrusco) wrote :

The post(s) linked by Michael clearly show this is a general problem with multi-core, not specific to any CPU model. See: http://lkml.org/lkml/2010/4/26/249

FWIW, the patch I linked to (which I just got around to actually trying) doesn't seem to help on my netbook at all.

I've had some success with Daniel Hollocher's linux-ck PPA: https://launchpad.net/~chogydan/+archive/ppa

It doesn't fix everything, but the BFS making such a difference does make it clear this is a kernel issue and not just a few of us having strange hardware...

--bornagainpenguin

So in all likelihood, we're going to have to shut up and put up until the next
Ubuntu release, when we get a new kernel? Or is there any chance whatsoever
that this will be backported?

On 14 May 2010 03:21, bornagainpenguin <email address hidden> wrote:

> I've had some success Daniel Hollocher's linux-ck ppa:
> https://launchpad.net/~chogydan/+archive/ppa
>
> It doesn't fix everything, but the BFS making such a difference does
> make it clear this is a kernel issue and not just a few of us having
> strange hardware...
>
> --bornagainpenguin

Bartosz Skowron (getxsick) wrote :

Can't believe this bug has been open for months...

Michael Christensen (conphara) wrote :

I have installed two mainline kernels (http://kernel.ubuntu.com/~kernel-ppa/mainline/), .31 and .33, just to see if those kernels made any change in the number of wakeups. Both kernels had about the same number of wakeups as the Lucid kernel (.32), which led me to wonder whether this could be a userspace bug.

Both kernels made Powertop report :
   28,1% ( 7,2) [kernel scheduler] Load balancing tick
   27,7% ( 7,1) [kernel core] hrtimer_start (tick_sched_timer)

The fact of the matter is that mainline kernels (at least the ones from the PPA) are not making things better.
Luckily kernel developer Suresh Siddha is working on a patch, which simply has to be backported to Lucid.

These kernel wakeups have to be sorted out, since they are one of the reasons why Linux power consumption is much higher than on other platforms.

Kimmo Ahola (kimmo-ahola) wrote :

Is there a workaround? Currently my laptop is burning my legs off..

Here's my output of "sudo powertop -t60 -d"

skhawam (s-khawam) wrote :

Was having this problem as well on my Dell Mini 10v.

Seems like a kernel patch has been posted yesterday:

http://lkml.org/lkml/2010/5/17/350

Changed in linux (Ubuntu):
status: Confirmed → Triaged
tags: added: kernel-core kernel-needs-review
Andy Whitcroft (apw) on 2010-07-12
tags: added: kernel-candidate kernel-reviewed
removed: kernel-needs-review
tags: removed: kernel-candidate
Andy Whitcroft (apw) on 2010-07-15
Changed in linux (Ubuntu):
importance: Undecided → Low
description: updated
tags: added: patch
ginkgo (davy-renaud) on 2010-11-16
Changed in linux (Ubuntu):
status: Triaged → Confirmed
Bryce Harrington (bryce) on 2010-11-22
Changed in xorg (Ubuntu):
status: New → Invalid
123 comments hidden view all 203 comments

Finally got the 2.6.36 mainline kernel to work with both ATI and Nvidia drivers!
Wrote a little howto here: http://www.bjortvedtdata.net/?p=199

Gurmeet (gurmeet1109) wrote :

Just tried the v2.6.37-rc2-maverick from Ubuntu Mainline.

The load balancing ticks are down to 2 (in words, two) from 60 or so per second.
Ran it for a few minutes. M/c is less noisy and temperatures are down by a bit (1-2 deg), but that can be very well be within a margin of error.

I run VMWare Player extensively, and I could not find a kernel patch for VMWare Player for this version of the kernel, hence I switched back to the official 2.6.35-23. Once VMWare releases (officially or otherwise) a patch for this version and 2.6.37 is officially released as stable by Ubuntu, I am going to upgrade to 2.6.37, but I am holding off until it is officially supported.

Gurmeet (gurmeet1109) wrote :

For the learned .....
Adding as a point to start with .. no guarantees that this is the Saviour .... just a lead to who can get the heads and tails out of it

# diff -cp tick-sched.c(2.6.37-rc4) tick-sched.c(2.6.35-23)

...

*** tick.sched-2.6.37-rc4.c 2010-12-06 17:50:03.960025002 +0530
--- tick-sched-2.6.35-23.c 2010-11-18 03:45:19.000000000 +0530
*************** void tick_nohz_stop_sched_tick(int inidl
*** 405,411 ****
     * the scheduler tick in nohz_restart_sched_tick.
     */
    if (!ts->tick_stopped) {
! select_nohz_load_balancer(1);

     ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
     ts->tick_stopped = 1;
--- 405,417 ----
     * the scheduler tick in nohz_restart_sched_tick.
     */
    if (!ts->tick_stopped) {
! if (select_nohz_load_balancer(1)) {
! /*
! * sched tick not stopped!
! */
! cpumask_clear_cpu(cpu, nohz_cpu_mask);
! goto out;
! }

     ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
     ts->tick_stopped = 1;
*************** void tick_setup_sched_timer(void)
*** 774,779 ****
--- 780,786 ----
  {
   struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
   ktime_t now = ktime_get();
+ u64 offset;

   /*
    * Emulate tick processing via per-CPU hrtimers:
*************** void tick_setup_sched_timer(void)
*** 783,788 ****
--- 790,799 ----

   /* Get the next period (per cpu) */
   hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
+ offset = ktime_to_ns(tick_period) >> 1;
+ do_div(offset, num_possible_cpus());
+ offset *= smp_processor_id();
+ hrtimer_add_expires_ns(&ts->sched_timer, offset);

   for (;;) {
    hrtimer_forward(&ts->sched_timer, now, tick_period);
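
For readers following the last hunk: the four lines marked "+" (present in the 2.6.35-23 file, absent from the 2.6.37-rc4 file) appear to shift each CPU's scheduler tick by a per-CPU offset. A rough Python sketch of that integer arithmetic follows, assuming a 4 ms tick period (HZ=250, which is an assumption and not something stated in this thread); the function name is hypothetical.

```python
def tick_offset_ns(tick_period_ns: int, num_possible_cpus: int, cpu_id: int) -> int:
    """Mirror the integer arithmetic of the four '+' lines in the diff."""
    offset = tick_period_ns >> 1      # offset = ktime_to_ns(tick_period) >> 1
    offset //= num_possible_cpus      # do_div(offset, num_possible_cpus())
    return offset * cpu_id            # offset *= smp_processor_id()


if __name__ == "__main__":
    TICK_PERIOD_NS = 4_000_000  # assumed: HZ=250, i.e. a 4 ms tick
    NUM_CPUS = 2                # assumed: a dual-core machine
    for cpu in range(NUM_CPUS):
        print(f"cpu {cpu}: tick shifted by {tick_offset_ns(TICK_PERIOD_NS, NUM_CPUS, cpu)} ns")
```

With these assumed values CPU 0 gets no offset and CPU 1 is shifted by 1 ms, so the two ticks never fire at the same instant. Whether this staggering contributes to the wakeup counts discussed here is not established in this thread.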

Gurmeet (gurmeet1109) wrote :

Exploring the latest code on github. Comparing the two.
Again, no guarantees. This is just a lead, for the brave at heart ....

--- tick-sched-2.6.35-23.c 2010-12-06 22:44:02.821102001 +0530
+++ tick-sched-2.6.37-github.c 2010-12-06 22:42:40.451102001 +0530
@@ -405,13 +405,7 @@ void tick_nohz_stop_sched_tick(int inidl
    * the scheduler tick in nohz_restart_sched_tick.
    */
   if (!ts->tick_stopped) {
- if (select_nohz_load_balancer(1)) {
- /*
- * sched tick not stopped!
- */
- cpumask_clear_cpu(cpu, nohz_cpu_mask);
- goto out;
- }
+ select_nohz_load_balancer(1);

    ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
    ts->tick_stopped = 1;
@@ -780,7 +774,6 @@ void tick_setup_sched_timer(void)
 {
  struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
  ktime_t now = ktime_get();
- u64 offset;

  /*
   * Emulate tick processing via per-CPU hrtimers:
@@ -790,10 +783,6 @@ void tick_setup_sched_timer(void)

  /* Get the next period (per cpu) */
  hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
- offset = ktime_to_ns(tick_period) >> 1;
- do_div(offset, num_possible_cpus());
- offset *= smp_processor_id();
- hrtimer_add_expires_ns(&ts->sched_timer, offset);

  for (;;) {
   hrtimer_forward(&ts->sched_timer, now, tick_period);

Gurmeet (gurmeet1109) wrote :

Just installed the 2.6.36 mainline kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.36-maverick/.

VMWare Player is not working, but that's another story.

[kernel scheduler] Load balancing tick = 2.3
[extra timer interrupt] = 11.7

Total interrupts under idle conditions = 66.7 (earlier it was ~180).

Ok, now we know the story, or at-least a part of it.
We will all be a happy breed of people if the fixes are backported to the latest stable Maverick kernel in the Ubuntu repository. Then a 'sudo apt-get update && sudo apt-get upgrade' should apply the patches, and we won't have to fiddle with unsupported releases of the core of the OS.

I am keeping my fingers crossed. VMWare is not working as of now, PS/2 interrupts are still very high, and I might discover a thing or two later, but as of now, with 2.6.36, the issue seems to be resolved. I will feel a lot more satisfied once the fixed version arrives through the official repo on an officially supported release (2.6.35 at the moment).

oldmankit (oldmankit) wrote :

@Gurmeet

It all sounds very positive. For those of us that don't want to get their fingers dirty playing around with different kernel versions, there will be a lot of satisfaction when this fix finds itself into the official repos!

The commit to backport would be 83cd4fe, which has many more changes than just to kernel/time/tick-sched.c You can look at the complete diff at https://github.com/mirrors/linux-2.6/commit/83cd4fe

af5ab27 might also help somewhat, but I believe the other one is the major culprit.

Jeremy Davis (jedmeister) wrote :

That would be awesome Alex. I'm really looking forward to resolving this long standing issue with the LTS version of Ubuntu.

WebNuLL (babciastefa) wrote :

On an Intel Celeron 900 MHz I have the same problem: powertop reports "[kernel scheduler] Load balancing tick" at the top (15-30%).

I think fixing this bug will extend my Tablet PC's battery life.

Benjamin Schmid (benbuntu) wrote :

Same problem here on a Thinkpad Edge 11" AMD Neo II K325:
  Wakeups-from-idle per second : 355.6 interval: 15.0s
    49.5% (313.0) [Rescheduling interrupts] <kernel IPI>
    19.7% (124.3) [kernel scheduler] Load balancing tick

Would be great if this problem can be solved with Natty.

Benjamin Schmid:
As long as Natty uses the 2.6.36 or later, this should not be a problem - it is a kernel issue, not an issue related to Ubuntu alone.

WebNull:
It will most definitely extend your battery life! Using the mainline kernel has helped a lot :)

Really an annoying bug. Why will it not be fixed in LTS? Really a showstopper on mobile systems.

florinn (florinnaidin) wrote :

powertop on Asus U35JC with Intel Core i3

Top causes for wakeups:
  47.5% (228.6) [kernel scheduler] Load balancing tick
  26.9% (129.6) firefox-bin
   3.5% ( 16.9) thunderbird-bin
   2.7% ( 13.0) USB device 2-1.3 : USB Receiver (Logitech)

Tried the linux-image-2.6.37-020637rc2-generic kernel and Load balancing tick dropped to 2-3%

Cristian KLEIN (cristiklein) wrote :

@florinn: Please close all applications (especially firefox and thunderbird) before posting such measurements. In your case, firefox is probably running some heavy animations or executing some scripts with many timeouts. It is not the kernel's fault that the user-space is generating useless wakeups.

On my laptop, with kernel 2.6.37, I easily get under 20 wakeups/second.

florinn:
that's my experience too, 2.6.37 makes my laptop silent again, and the load balancing ticks are much, much fewer.

I see that you run the RC2 version of 2.6.37, just thought I'd mention that the final version of 2.6.37 is out on http://kernel.ubuntu.com/~kernel-ppa/mainline/. The name says "Natty", but it runs fine on Maverick as well :)

You may want to try the 2.6.38RC2 Natty kernel as well - It seems very fast, but in my experience this kernel made my laptop run hotter and the fans run all the time - but I guess that's a separate issue ;)

I'm seeing this too on Ubuntu Maverick 32-bit (kernel 2.6.35-27-generic-pae) on an Intel Core Duo quad core 3 GHz processor. Around 50% "[kernel scheduler] Load balancing tick" pretty much continuously.

My apologies if this has already been explained somewhere, but exactly what is a "load balancing tick", and is it actually a problem to have a lot of them?

Benjamin Schmid (benbuntu) wrote :

@Captain Chaos: To keep it short: these ticks are imposed by the Linux kernel while trying to shift and balance the workload over the available CPU cores. The problem here is that these "ticks" occur during idle phases and therefore prevent the CPUs from falling into their power-saving states. This is a little annoying for mobile users, as it reduces battery life and increases heat and fan activity. That's all.

This is a Linux kernel (not an Ubuntu) issue, so we just have to wait until Linus integrates the fixes into the mainline kernel to get rid of this annoyance. The Ubuntu & Kernel guys already took great actions to trigger a solution. Many thanks for that!

Peter Sasi (peter-sasi) wrote :

@Benjamin Schmid: I think the fix is done in the Linux tree (practically all versions later than 2.6.35 behave a lot better), it just has not been ported back to the ubuntu 2.6.35 tree...
And it is very annoying...

This problem appeared in 2.6.32 Kernel.
Beginning with 2.6.37 Kernel this is solved.
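
The version range claimed above can be checked against a kernel release string such as the ones quoted in this thread. This is only a sketch of that one claim (2.6.32 up to, but not including, 2.6.37); other comments here report different boundaries, and backported patches mean the version number alone cannot prove anything. The function names are hypothetical.

```python
import re


def kernel_tuple(release):
    """Extract (major, minor, patch) from a string like '2.6.32-13-generic'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def in_reported_range(release):
    """True if the release falls in the range this comment claims is affected."""
    return (2, 6, 32) <= kernel_tuple(release) < (2, 6, 37)


if __name__ == "__main__":
    for rel in ("2.6.32-13-generic", "2.6.35-27-generic-pae", "2.6.38-5-generic"):
        print(rel, in_reported_range(rel))
```

On a running system the release string could be taken from `platform.release()` in the Python standard library.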

I observe increased battery life (+15-20%) on my old laptop.
The output of sudo powertop agrees with this impression.

Ubuntu users can use this kernel
http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=M;O=D
but they are going to lose some Ubuntu-specific customizations (ureadahead
mainly)

For Lucid there is a much better solution that i currently use. No side
effects so far...
https://launchpad.net/~kernel-ppa/+archive/ppa?field.series_filter=lucid

Chad A. Davis (chadadavis) wrote :

This is fixed in the current Natty (beta) with kernel 2.6.38-5-generic (the stock kernel).

On Maverick I was getting several hundred wakeups per second on the stock kernel (2.6.35-something), almost all from the load balancing tick.

Now the load balancing tick is rarely listed in the 'top causes for wakeups' from powertop.

This will improve your battery life, but depending on your system, you may have other things to watch out for (e.g. disable Flash).

ggonlp (oktobermann) wrote :

I remember trying that out a few weeks ago (think it was a 2.6.36 kernel). While the load balancer wakeups were gone, I got just as many from a kworker process, so no real improvement...

Peter Sasi (peter-sasi) wrote :

ggonlp: it might be the case, that 2.6.36 fixes balancing wakeups, but introduces worker wakeups. 2.6.37 and 2.6.38 should be okay on the other hand. Have you tried those?

ggonlp (oktobermann) wrote :

Sorry, not yet - thanks for the hint though. I'm on an HP laptop and new
kernels mean for me every time an odyssey as graphics and wifi will take
a good day's work to get running :-(

On 03/08/2011 10:43 PM, Peter Sasi wrote:
> ggonlp: it might be the case, that 2.6.36 fixes balancing wakeups, but
> introduces worker wakeups. 2.6.37 and 2.6.38 should be okay on the other
> hand. Have you tried those?
>

gcc (chris+ubuntu-qwirx) wrote :

@Jeremy Foshee, this seems to be a regression between Lucid alpha 2 and alpha 3, and it's affecting LTS users with MUCH lower battery life than expected, hence it's a hardware problem.

Please could you reconsider your rejection for Lucid?

I tried Natty, this bug is finally solved with the 2.6.38 kernel

Peter Sasi (peter-sasi) wrote :

This bug I suppose is the root cause of the second power consumption regression found here:
http://www.phoronix.com/scan.php?page=article&item=linux_kernel_regress2

Basically power usage looks like:
- 2.6.24-2.6.34: average of all tests ~21-22W
- 2.6.35-2.6.37: average of all tests ~25W
- 2.6.38-2.6.39: average of all tests ~26W

This translates into laptop operation times under test workload:
Capacity (mAh)  Voltage (V)  Power (W)  Current (mA)  Run time (h)
6600            10,8         21         1944,44       3,39
6600            10,8         22         2037,04       3,24
6600            10,8         25         2314,81       2,85
6600            10,8         26         2407,41       2,74
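
The current and run-time figures in the table above follow from the battery capacity, the nominal voltage and the average power draw: current = power / voltage, run time = capacity / current. A small Python sketch of that arithmetic, using the 6600 mAh and 10.8 V values from the table (the function name is hypothetical, and a constant voltage over the whole discharge is a simplifying assumption):

```python
def runtime_estimate(capacity_mah, voltage_v, power_w):
    """Return (current in mA, run time in hours) for a constant power draw."""
    current_ma = power_w / voltage_v * 1000.0   # I = P / V, converted to mA
    hours = capacity_mah / current_ma           # t = capacity / current
    return current_ma, hours


if __name__ == "__main__":
    for watts in (21, 22, 25, 26):
        current_ma, hours = runtime_estimate(6600, 10.8, watts)
        print(f"{watts} W -> {current_ma:.2f} mA, {hours:.2f} h")
```

Going from 21 W to 26 W therefore costs roughly 0.65 hours of run time on this battery.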

Richard Kleeman (kleeman) wrote :

The last two comments on this bug are contradictory. Checking powertop with the 2.6.38 kernel for natty versus earlier kernels shows for me that the *type* of kernel interrupt has changed but the number of them has not and has increased if anything. I think this bug is serious and getting worse and requires the attention of a kernel developer. It reflects badly on linux in general if power consumption is increasing markedly while performance is not markedly increasing at the same time (which appears the case on the phoronix benchmarks).

To solve this requires someone familiar with the kernel to report it on the kernel bugzilla. There has been some discussion of the issue on LKML, but it looks very inconclusive and low priority to me. I don't know why exactly.

Kai Pastor (dg0yt) wrote :

Only a few weeks ago I started looking for the root cause of my laptop's noise and heat. I used Ubuntu Lucid 10.04 on a dual core laptop and eventually found this bug.

I tried newer kernels from https://launchpad.net/~kernel-ppa/+archive/ppa?field.series_filter=lucid
Unfortunately, experience and reports (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/760131) show that 2.6.38 kernels introduce new problems.

I have now switched to the latest 2.6.37 from that PPA. In my experience, this configuration is the quietest in terms of both powertop wakeups and fan noise.

I'm really disappointed that the original issue has not yet been solved for LTS release users, more than one year after it was reported. Doesn't it affect commercial clients (http://www.h-online.com/open/news/item/LVM-insurance-company-switches-10-000-systems-to-Ubuntu-10-04-LTS-1233194.html), too?

oldmankit (oldmankit) wrote :

My two cents is that I was waiting for Ubuntu Natty for a newer kernel and therefore better performance, but I appear to have more wakeups and less battery life than before.

ggonlp (oktobermann) wrote :

I'm with oldmankit on this - the situation is getting worse rather than better. I wrote a script, mostly based on lesswatts.org suggestions, which on my HP Probook 4720s halves wakeups (still at 50/sec) and reduces fan usage. Use at your own discretion :-)

oldmankit (oldmankit) wrote :

That script looks useful, thanks for posting it.

I've completely changed my mind on the subject. I ran Ubuntu Classic (no effects) and ran powertop with no apps running. It was significantly better than in Ubuntu 10.10. It was actually really good. The red bar in powertop that shows wakeups-from-idle per second, well, I found out it is not always red! It was orange for the first time.

I'll just avoid using Unity, which gives my processor a hard time anyway.

skhawam (s-khawam) wrote :

I have a Dell Mini 10v and managed to get good results with Natty (11.04) after some fiddling around. It's better than any other version I have had until now (I have been following this bug since its beginning!). Without wireless I get 20 ms in the C3 state, and with wireless on around 3 ms (with no applications running, and not touching the keyboard or touchpad). The machine doesn't get hot (the Mini 10v is fanless, so high temperature is easy to detect).

- I have disabled Unity, although I don't think Unity itself is the cause of the wakeups
- Compiz generates a lot of wakeups on the GPU, so I removed all the features that I don't use and kept only the ones I use
- Installed the latest Intel GPU driver from a PPA (the Mini 10v has the i915)
- Disabled the SD card reader (using 'rmmod usb_storage'), which was generating extra wakeups (I re-enable it when I want to insert a card).

Even though the situation is better now, I'm sure there are many other things that can be improved in the kernel to reduce the wakeups further, say to get 30 ms in C3 when the wireless is on, or to reduce the wakeups when the touchpad is used.

Phillip Susi (psusi) wrote :

On 5/17/2011 10:59 PM, skhawam wrote:
> -Compiz generates a lot of wakeups on the GPU, so I removed all the features that I dont use and kept only the ones I use

I once found a DRI configuration utility that could disable vsync and
found that got rid of a lot of wakeups from the GPU.

Peter Sasi (peter-sasi) wrote :

I have updated my tests for both the latest 2.6.38 kernel of natty and the latest 2.6.34 kernel from mainline, as suggested, on my Thinkpad T61 laptop.
After logging in and authenticating with sudo beforehand, I ran:
sudo powertop -d -t 60 > ~/Desktop/powertop_dump-`uname -r`.log
This was on battery, with nothing started after logon except one terminal window.
Results for both kernels are attached, and there is still a visible difference: 15,5W versus 17,9W, meaning 3,4 hours versus 2,9 hours, i.e. half an hour less time on battery!

I think both power regressions found by Phoronix at 2.6.35 and 2.6.38 are there.
See http://www.phoronix.com/scan.php?page=article&item=linux_kernel_regress2

2.6.34:
PowerTOP 1.13 (C) 2007 - 2010 Intel Corporation

Collecting data for 60 seconds

Cn Avg residency
C0 (cpu running) ( 1,9%)
C0 0,0ms ( 0,0%)
C1 mwait 0,0ms ( 0,0%)
C2 mwait 0,5ms ( 0,4%)
C6 mwait 6,2ms (97,7%)
P-states (frequencies)
Turbo Mode 0,9%
  2,50 Ghz 0,0%
  1,60 Ghz 0,0%
  1200 Mhz 0,0%
   800 Mhz 99,1%
Wakeups-from-idle per second : 164,1 interval: 60,0s
Power usage (ACPI estimate): 15,5W (3,4 hours)
Top causes for wakeups:
  29,5% ( 61,0) [uhci_hcd:usb5, yenta, nvidia] <interrupt>
  24,2% ( 50,0) [kernel core] hdaps_mousedev_poll (hdaps_mousedev_poll)
  19,3% ( 39,9) [kernel core] hrtimer_start (tick_sched_timer)
  13,4% ( 27,8) [kernel scheduler] Load balancing tick
   4,8% ( 9,9) gwibber-service
   2,4% ( 5,0) [ata_piix] <interrupt>
   1,7% ( 3,6) compiz
   1,1% ( 2,2) nautilus
   0,8% ( 1,7) gnome-terminal

2.6.38:
PowerTOP 1.13 (C) 2007 - 2010 Intel Corporation

Collecting data for 60 seconds

Cn Avg residency
C0 (cpu running) ( 2,2%)
polling 0,1ms ( 0,0%)
C1 mwait 0,0ms ( 0,0%)
C2 mwait 0,6ms ( 0,7%)
C6 mwait 4,5ms (97,1%)
P-states (frequencies)
Turbo Mode 0,9%
  2,50 Ghz 0,0%
  2,00 Ghz 0,0%
  1,60 Ghz 0,1%
   800 Mhz 99,0%
Disk accesses:
The application 'gvfsd-metadata' is writing to file 'home-3c698ca6.log' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home-3c698ca6.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'rs:main Q:Reg' is writing to file 'auth.log' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home.SF5MWV' on /dev/sda5
The application 'gvfsd-metadata' is writing to file 'home.SF5MWV' on /dev/sda5
Wakeups-from-idle per second : 227,9 interval: 60,0s
Power usage (ACPI estimate): 17,9W (2,9 hours)
Top causes for wakeups:
  24,9% ( 61,0) [uhci_hcd:usb5, yenta, nvidia] <interrupt>
  20,4% ( 50,0) [kernel core] hdaps_mousedev_poll (hdaps_mousedev_poll)
  15,1% ( 36,9) [extra timer interrupt]
  14,8% ( 36,4) [kernel core] hrtimer_start (tick_sched_timer)
   7,4% ( 18,2) compiz
   6,2% ( 15,2) [kernel scheduler] Load balancing tick
   4,1% ( 10,0) gwibber-service
   2,2% ( 5,5) kworker/0:0
   1,6% ( 4,0) [ata_piix] <interrupt>
   0,0% ( 0,0)D gvfsd-metadata
   0,0% ( 0,0)D rs:main Q:Reg
   0,9% ( 2,3) nautilus
   0,7% ( 1,7) gnome-term...


Richard Kleeman (kleeman) wrote :

Yes, I have also seen a similar degradation between the two kernels. What is notable in your case and mine is that the kernel interrupts are labelled differently between the two cases. There are actually fewer load balancing ticks in the later kernel, but other categories, e.g. [extra timer interrupt], are chewing up power. So something appears to have changed, but not in a good way, and focussing solely on [kernel scheduler] Load balancing tick is not appropriate.
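
To make this kind of per-category comparison less error-prone, here is a rough sketch that parses two `powertop -d` dumps and prints the change in wakeups per cause. It assumes the PowerTOP 1.13 text layout and the comma decimal separator seen in the dumps above; the function names and regular expressions are my own, and the embedded samples are short excerpts of those dumps rather than the full logs.

```python
import re

# Excerpts from the two dumps posted above (comma as decimal separator).
DUMP_2_6_34 = """
Wakeups-from-idle per second : 164,1 interval: 60,0s
  19,3% ( 39,9) [kernel core] hrtimer_start (tick_sched_timer)
  13,4% ( 27,8) [kernel scheduler] Load balancing tick
   1,7% ( 3,6) compiz
"""
DUMP_2_6_38 = """
Wakeups-from-idle per second : 227,9 interval: 60,0s
  15,1% ( 36,9) [extra timer interrupt]
  14,8% ( 36,4) [kernel core] hrtimer_start (tick_sched_timer)
   7,4% ( 18,2) compiz
   6,2% ( 15,2) [kernel scheduler] Load balancing tick
"""

TOTAL_RE = re.compile(r"Wakeups-from-idle per second\s*:\s*([\d,.]+)")
# "<percent>% ( <wakeups/s>)[D] <cause>"; the optional D marks disk activity.
CAUSE_RE = re.compile(r"^\s*[\d,.]+%\s*\(\s*([\d,.]+)\)D?\s+(.+?)\s*$")


def to_float(text):
    """Convert '27,8' or '27.8' to a float."""
    return float(text.replace(",", "."))


def parse_dump(text):
    """Return (total wakeups per second, {cause: wakeups per second})."""
    total, causes = None, {}
    for line in text.splitlines():
        match = TOTAL_RE.search(line)
        if match:
            total = to_float(match.group(1))
            continue
        match = CAUSE_RE.match(line)
        if match:
            causes[match.group(2)] = to_float(match.group(1))
    return total, causes


def compare(old, new):
    """Per-cause change in wakeups per second; missing causes count as 0."""
    return {name: new.get(name, 0.0) - old.get(name, 0.0)
            for name in set(old) | set(new)}


old_total, old_causes = parse_dump(DUMP_2_6_34)
new_total, new_causes = parse_dump(DUMP_2_6_38)
print(f"total: {old_total} -> {new_total} wakeups/s")
deltas = compare(old_causes, new_causes)
for name, delta in sorted(deltas.items(), key=lambda item: -item[1]):
    print(f"{delta:+6.1f}  {name}")
```

Run on the full log files instead of the excerpts, this would show at a glance which categories grew and which shrank between kernels, rather than tracking only the load balancing tick.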

kecsap (csaba-kertesz) wrote :

I think I had the same problem on a Dell Latitude E4300 (Core 2/P9300). Interestingly, the overload happened mostly while the laptop was being charged; when I removed the AC adapter to run on battery, the laptop became responsive again after some minutes. In my case, when this bug happened, the whole laptop stopped running smoothly: it was very slow, with high CPU load. Another problem is that the laptop shits on the CPU scaling policy and decides on the current speed without any sane reason. If this bug happened and CPU scaling dropped the frequency to 800 MHz, the laptop was quite useless while it was being charged.

I think I first experienced the bug under Maverick, and the upgrade to Natty did not help. Now I have upgraded to the latest unofficial natty kernel: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.39-rc4-natty/ and the Intel graphics driver from this PPA: https://launchpad.net/~glasen/+archive/intel-driver and after one day of use the problems seem to be solved, except for the CPU scaling craziness, but I do not have much hope that Linux will ever work on this laptop without problems.

kecsap (csaba-kertesz) wrote :

A bit off-topic from my previous comment: the PPA version of the Intel driver did not make any difference. However, Bluetooth causes the system to hang before or after suspend if it is enabled. I did not mind too much and just disabled Bluetooth in the BIOS.

Now, the system seems to be stable.

kecsap (csaba-kertesz) wrote :

Yuppie, now it "seems" again that I have found the good combination:

- Nothing fixes the CPU frequency flickering. The laptop just shits on the cpufreq settings and decides on its own what the best frequency is for me. Come on... Sucks.
- This "Load balancing tick" bullshit stopped when I downgraded my BIOS back to the original version that the laptop had when it shipped. I had recently upgraded to the newest version, and that was my last guess at the root cause of these problems. Laptop: Dell Latitude E4300, on BIOS version A06 again.
- But I got back an old bug with the v2.6.39-rc4-natty kernel: the LCD backlight did not come back after resume. Nice. I switched back to the original natty kernel, and this problem seems to be solved as well.

So the "workaround recipe" for Dell Latitude E4300 owners with this bug:

1. Downgrade the BIOS to an earlier version. A06 or A07 is a good candidate. (I googled instructions for making a pendrive with bootable DOS, then downloaded the A06 BIOS "upgrade" file from the Dell site to the pendrive.)
2. If this bug still happens -> upgrade to Natty.

Other notes:
(3a. In any case, disable Bluetooth in the BIOS. It just causes problems with suspend/resume, and for me an attempt to send a file via Bluetooth from my phone froze the laptop with a kernel oops.)
(3b. I do not think it makes anything better, but I have installed the latest Intel graphics drivers from the PPA mentioned in my previous comments.)

Uff.

The attachment "Patch from mailing list" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of ubuntu-sponsors, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

Brad Figg (brad-figg) wrote :

The desired commit has been applied and released in Lucid (and all other stable kernels). Please update to the latest SRU kernel.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu):
assignee: nobody → Matúš Behun (matus-behun)
Phillip Susi (psusi) on 2012-11-18
Changed in linux (Ubuntu):
assignee: Matúš Behun (matus-behun) → nobody
Changed in linux-2.6 (Debian):
status: Incomplete → Fix Released
Displaying first 40 and last 40 comments. View all 203 comments or add a comment.