Kernel 2.6.24-2 causing ~1000 wakeups by "Rescheduling Interrupts"

Bug #177895 reported by robe
58
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Won't Fix
Medium
Colin Ian King

Bug Description

Binary package hint: linux-image-2.6.24-2-generic

Today I installed a new Kernel linux-image-2.6.24-2-generic_2.6.24-2.4_i386.deb. Now powertop shows about 200 - 2000 wakeups/sec caused by "Rescheduling Interrupts". The System is completly Idle, for example Opera adds about 1000 wakeups.
Rescheduling Interrupts should balance Tasks between two CPUs, maybe this is a Scheduler Problem?

http://www.bughost.org/pipermail/power/2007-December/001187.html

It seems to be a kernel problem, disabling SMP in the BIOS workarounds the problem.
System is a Dell XPS M1330 Santa Rosa Core 2 Duo Notebook.

This is a critical Bug for Notebooks as it takes about 5W more on my machine.

If you need further information please tell me.

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi Robe,

Thank you for taking the time to report this bug and helping to make Ubuntu better. Per the kernel team's bug policy, can you please attach the following information:

* uname -a > uname-a.log
* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

Please be sure to attach each file as a separate attachment. For more information regarding the kernel team bug policy, please refer to https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks again and we appreciate your help and feedback.

Changed in linux:
status: New → Incomplete
Revision history for this message
robe (r-evert) wrote :

uname -a

Revision history for this message
robe (r-evert) wrote :

cat /proc/version_signature

Revision history for this message
robe (r-evert) wrote :

dmesg

Revision history for this message
robe (r-evert) wrote :

sudo lspci -vvnn

Revision history for this message
robe (r-evert) wrote :

And a powertop "Screenshot". The system is idle, only Opera and kopete are running. The 500wakeups caused by the touchpad are another bug, I guess.

     PowerTOP version 1.9 (C) 2007 Intel Corporation

Cn Verweildauer P-States (Frequenzen)
C0 (Prozessor läuft) (24,8%) 1,81 GHz 0,0%
C1 0,0ms ( 0,0%) 1,80 GHz 0,0%
C2 0,1ms ( 1,5%) 1200 MHz 0,0%
C3 0,8ms (73,8%) 800 MHz 100,0%

Aufwachen pro Sekunde : 1090,2 Intervall: 2,0s
Stromverbrauch (nach ACPI): 19,8W (1,1 Std.)

Häufigste Ursachen für das Aufwachen:
  48,6% (917,0) <kernel IPI> : Rescheduling interrupts
  26,7% (504,0) <interrupt> : PS/2 keyboard/mouse/touchpad
   5,6% (106,0) USB Gerät 1-2.1 : BCM2045 (Broadcom Corp)
   5,3% (100,0) <interrupt> : uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb7
   3,2% ( 60,5) <interrupt> : extra timer interrupt
   2,5% ( 47,0) <interrupt> : uhci_hcd:usb2, uhci_hcd:usb4, HDA Intel
   1,9% ( 35,0) amarokapp : schedule_timeout (process_timeout)

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged
Revision history for this message
Václav Šmilauer (eudoxos) wrote :

I confirm this on Lenovo Thinkpad x61s with Core2Duo. I have now 530 wakeups/sec, of which 34% rescheduling interrupts and 17% PS/2 keyboard/mouse/touchpad. Attaching required files as well.

Revision history for this message
Václav Šmilauer (eudoxos) wrote :
Revision history for this message
Václav Šmilauer (eudoxos) wrote :
Revision history for this message
Václav Šmilauer (eudoxos) wrote :
Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote :

Confirming on Dell Latitude D830

Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote :
Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote :
Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote :
Revision history for this message
robe (r-evert) wrote :

Anyone tried Kernel linux-image-2.6.24-3-generic?

Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote : Re: [Bug 177895] Re: Kernel 2.6.24-2 causing ~1000 wakeups by "Rescheduling Interrupts"

On Sunday 13 January 2008 21:43:41 robe wrote:
> Anyone tried Kernel linux-image-2.6.24-3-generic?

Yes, I have not quite as many (300-400 now) as before, but then again, I never
quite reached 1000 wake ups before (more like 500-600), either.

Revision history for this message
Erik Andrén (erik-andren) wrote :

This [1] might have some relation to this bug.
[1] http://www.bughost.org/pipermail/power/2007-December/001188.html

Revision history for this message
Matt Price (matt-price) wrote :

confirming on dell latitude d820, with a coreduo chip.

Revision history for this message
Matt Price (matt-price) wrote :
Revision history for this message
Matt Price (matt-price) wrote :
Revision history for this message
Matt Price (matt-price) wrote :
Revision history for this message
Matt Price (matt-price) wrote :
Revision history for this message
Matt Price (matt-price) wrote :

output of powertop. forefox-3 & nvidia are the only competitors to Rescheduling interrupts for Worst Power Hog Award. battery life on this machine is down to a shocking 1.3 hours, with wireless, bluetooth and even usb turned off.

Cn Avg residency P-states (frequencies)
C0 (cpu running) (28.2%) 1.84 Ghz 0.0%
C1 0.0ms ( 0.0%) 1333 Mhz 0.0%
C2 0.1ms ( 0.3%) 1000 Mhz 100.0%
C3 1.2ms (71.4%)

Wakeups-from-idle per second : 598.2 interval: 10.0s
Power usage (ACPI estimate): 22.4W (0.9 hours)

Top causes for wakeups:
  47.8% (378.3) <kernel IPI> : Rescheduling interrupts
  11.0% ( 86.8) firefox-3.0 : schedule_timeout (process_timeout)
  10.6% ( 83.8) <interrupt> : extra timer interrupt
   8.0% ( 63.2) <interrupt> : nvidia
   5.0% ( 39.3) <interrupt> : PS/2 keyboard/mouse/touchpad
   4.6% ( 36.7) trackerd : do_nanosleep (hrtimer_wakeup)

Revision history for this message
Guillermo Pérez (bisho) wrote :

On Ubuntu hardy
Kernel: Linux fenix 2.6.24-4-generic #1 SMP Mon Jan 14 17:30:39 UTC 2008 i686 GNU/Linux

powertop output (after following all sugerences):

Cn Avg residency P-states (frequencies)
C0 (cpu running) (25.7%) 2.21 Ghz 15.4%
C1 0.0ms ( 0.0%) 2.21 Ghz 0.0%
C2 0.2ms ( 0.4%) 1200 Mhz 1.2%
C3 1.5ms (73.9%) 800 Mhz 83.4%

Wakeups-from-idle per second : 512.4 interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  46.8% (368.7) <kernel IPI> : Rescheduling interrupts
  10.6% ( 83.3) <interrupt> : i915@pci:0000:00:02.0
   9.3% ( 73.6) pidgin : schedule_timeout (process_timeout)
   7.2% ( 57.0) <interrupt> : uhci_hcd:usb2, uhci_hcd:usb4, HDA Intel
   6.5% ( 51.0) firefox-3.0 : schedule_timeout (process_timeout)
   5.2% ( 41.3) firefox-3.0 : futex_wait (hrtimer_wakeup)
   3.6% ( 28.0) Xorg : do_setitimer (it_real_fn)
   2.7% ( 20.9) gnome-panel : schedule_timeout (process_timeout)
   2.2% ( 17.4) <interrupt> : extra timer interrupt
   1.3% ( 10.2) USB device 4-2 : USB-PS/2 Optical Mouse (B16_b_02)
   1.0% ( 8.0) <kernel module> : usb_hcd_poll_rh_status (rh_timer_func)
   0.9% ( 7.0) compiz.real : schedule_timeout (process_timeout)
   0.4% ( 2.8) trackerd : schedule_timeout (process_timeout)
   0.2% ( 1.6) xfsbufd : schedule_timeout (process_timeout)
   0.2% ( 1.2) /usr/share/reve : schedule_timeout (process_timeout)
   0.2% ( 1.2) nm-applet : schedule_timeout (process_timeout)
   0.1% ( 1.1) netspeed_applet : schedule_timeout (process_timeout)
   0.1% ( 1.0) <kernel IPI> : TLB shootdowns
   0.1% ( 1.0) ntpd : do_setitimer (it_real_fn)
   0.1% ( 1.0) dhcdbd : schedule_timeout (process_timeout)
   0.1% ( 1.0) NetworkManager : input_open_polled_device (delayed_work_timer_fn)
   0.1% ( 1.0) NetworkManager : tg3_open (tg3_timer)

Revision history for this message
robe (r-evert) wrote :

Problem persists with 2.6.24-4 Kernel.

But somehow it uses less Battery... Screenshot is with 3G Data Card and Music playing, with the "old" 2.6.22 it took about 20-22W in this state... interesting... I wonder how much Battery it will take with less wakeups.

"Screenshot:"

PowerTOP version 1.9 (C) 2007 Intel Corporation

Cn Verweildauer P-States (Frequenzen)
C0 (Prozessor läuft) (11,0%) 1,81 GHz 0,0%
C1 0,0ms ( 0,0%) 1,80 GHz 0,0%
C2 0,0ms ( 1,0%) 1200 MHz 0,0%
C3 1,4ms (88,0%) 800 MHz 100,0%

Aufwachen pro Sekunde : 893,4 Intervall: 5,0s
Stromverbrauch (nach ACPI): 17,1W (2,6 Std.)

Häufigste Ursachen für das Aufwachen:
  87,4% (1464,0) <kernel IPI> : Rescheduling interrupts
   8,7% (146,0) <interrupt> : extra timer interrupt
   2,8% ( 47,0) <interrupt> : uhci_hcd:usb2, uhci_hcd:usb4, HDA Intel
   0,3% ( 4,8) <interrupt> : uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb7
   0,3% ( 4,8) USB Gerät 7-2.1 : Microsoft Wireless Optical Mouse .00 (Microsoft)
   0,1% ( 2,0) <interrupt> : nvidia
   0,1% ( 1,6) amarokapp : schedule_timeout (process_timeout)
   0,0% ( 0,8) <interrupt> : ahci
   0,0% ( 0,6) <kernel core> : ehci_work (ehci_watchdog)
   0,0% ( 0,6) opera : schedule_timeout (process_timeout)
   0,0% ( 0,6) Xorg : do_setitimer (it_real_fn)
   0,0% ( 0,4) wvdial : schedule_timeout (process_timeout)
   0,0% ( 0,2) <interrupt> : PS/2 keyboard/mouse/touchpad

Changed in linux:
assignee: ubuntu-kernel-team → colin-king
Revision history for this message
Colin Ian King (colin-king) wrote :

Enabling multi core power savings could help:

echo 1 > /sys/devices/system/cpu/sched_mc_power_savings

This minimize the number of cores being loaded and hopefully enable idle cores to drop into lower power states. However my
tests show that the savings may be marginal.

The sched_mc_power_savings setting depends on the CONFIG_SCHED_MC being enabled. Currently only the x86_64 kernels have this as default.

Revision history for this message
robe (r-evert) wrote :

Hi,

System is completly idle:

root@robxps:/# grep CONFIG_SCHED_MC /boot/config-2.6.24-5-generic
CONFIG_SCHED_MC=y
root@robxps:~# echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
root@robxps:/sys/devices/system/cpu# cat sched_mc_power_savings
1

Powertop Screenshot:

     PowerTOP version 1.9 (C) 2007 Intel Corporation

Cn Verweildauer P-States (Frequenzen)
C0 (Prozessor läuft) ( 7,4%) 1,81 GHz 0,0%
C1 0,0ms ( 0,0%) 1,80 GHz 0,0%
C2 0,0ms ( 0,4%) 1200 MHz 0,0%
C3 1,7ms (92,2%) 800 MHz 100,0%

Aufwachen pro Sekunde : 671,5 Intervall: 10,0s
Stromverbrauch (nach ACPI): 17,3W (0,6 Std.)

Häufigste Ursachen für das Aufwachen:
  91,4% (1190,8) <kernel IPI> : Rescheduling interrupts
   1,8% ( 23,1) USB Gerät 2-2.1 : Microsoft Wireless Optical Mouse .00 (Microsof
   1,5% ( 18,9) <interrupt> : extra timer interrupt
   0,8% ( 10,3) wvdial : schedule_timeout (process_timeout)
   0,8% ( 10,3) opera : schedule_timeout (process_timeout)
   0,8% ( 10,0) <kernel module> : ehci_work (ehci_watchdog)

Seems like it does not help.
Binding Opera and Xorg to the same core helps somehow for the rescheduling, but as Opera spawns own operapluginwrapper processes this is senseless.

Thanks for investigating.

Revision history for this message
Colin Ian King (colin-king) wrote :

Enabled the CONFIG_SCHED_MC option to allow multi-core scheduling for i386, ia64 and sparc (it is already enabled for x86_64) to allow one to turn on multi core power savings by:

echo 1 > /sys/devices/system/cpu/sched_mc_power_savings

Although grep CONFIG_SCHED_MC /boot/config-2.6.24-5-generic does yields CONFIG_SCHED_MC=y, it seems that it isn't actually enabled.

Rebuilt (2.6.24-7.13) with CONFIG_SCHED_MC enabled. When tested on i386 platform, on an "idle" dual core Centrino powertop shows a drop of "Rescheduling Interrupts" from 11% (not enabled) to ~4% (enabled), shows minor power saving improvement.

Changed in linux:
status: Triaged → Fix Committed
Revision history for this message
Colin Ian King (colin-king) wrote :

If you want to totally turn off extraneous cpu cores, one can actually do the following:

echo 0 > /sys/devices/system/cpu/cpu1/online

This only applies for cores 1..N-1 on a N core machine, i.e. you cannot turn core 0 off. This will then turn off any "Rescheduling Interrupts" since there is only 1 core running.

Let me know if it reduces your power consumption.

Revision history for this message
Guillermo Pérez (bisho) wrote :

This is very strange.

After disabling cpu1 the wakeups-from-idle per second drops from ~31000 to ~250, and the <kernel IPI> : Rescheduling interrupts disappears from the output of the powertop.

But the power consumption actually increases. It goes from 22.8W while idling with 2 cores activated to 24.8W with only one. It should be the other way, so I dunno what is really happening. Maybe the powertop utility it's not really acurate with two cores.

I forgot to say that this is on a Dell D830. And the battery runtime has improved a lot anyway from gutsy -> hardy (from 3.5 hours to almost 5 hours!)

Revision history for this message
Guillermo Pérez (bisho) wrote :

I tried to put 3100 not 31000 wakeups :)

And I have noticed that the counter is clearly broken. The total says 3081 but summing-up the wakeups from the list should be closer to 430. Kernel IPI is the top cause with 166 wakeups, 38% of the total. Why the hell says 3081 is the total?
------------------------------------------
Wakeups-from-idle per second : 3081.2 interval: 5.0s
no ACPI power usage estimate available

Top causes for wakeups:
  38.0% (166.8) <kernel IPI> : Rescheduling interrupts
  20.0% ( 87.6) firefox : schedule_timeout (process_timeout)
  12.7% ( 55.8) <interrupt> : extra timer interrupt
  11.9% ( 52.4) firefox : futex_wait (hrtimer_wakeup)
   4.6% ( 20.0) <interrupt> : uhci_hcd:usb2, uhci_hcd:usb4, HDA Intel
   4.6% ( 20.0) USB device 4-2 : USB-PS/2 Optical Mouse (B16_b_02)
   1.8% ( 8.0) <kernel module> : usb_hcd_poll_rh_status (rh_timer_func)
   0.9% ( 3.8) <interrupt> : libata, b43
   0.7% ( 3.0) trackerd : schedule_timeout (process_timeout)
   0.5% ( 2.4) <interrupt> : eth0
   0.5% ( 2.2) gnome-terminal : schedule_timeout (process_timeout)
   0.4% ( 1.6) xfsbufd : schedule_timeout (process_timeout)
   0.3% ( 1.2) /usr/share/reve : schedule_timeout (process_timeout)
   0.3% ( 1.2) Xorg : schedule_timeout (process_timeout)
   0.3% ( 1.2) netspeed_applet : schedule_timeout (process_timeout)
   0.2% ( 1.0) NetworkManager : tg3_open (tg3_timer)
   0.2% ( 1.0) dhcdbd : schedule_timeout (process_timeout)
   0.2% ( 1.0) NetworkManager : input_open_polled_device (delayed_work_timer
   0.2% ( 1.0) nm-applet : schedule_timeout (process_timeout)
   0.2% ( 1.0) ntpd : do_setitimer (it_real_fn)
   0.2% ( 0.8) <kernel core> : neigh_table_init_no_netlink (neigh_periodic_
   0.1% ( 0.6) pulseaudio : schedule_timeout (process_timeout)
   0.1% ( 0.6) <kernel module> : neigh_table_init_no_netlink (neigh_periodic_

Revision history for this message
Gabriel Ambuehl (gabriel-ambuehl) wrote :

On Thursday 14 February 2008 20:58:21 Guillermo Pérez wrote:
> And I have noticed that the counter is clearly broken. The total says 3081
> but summing-up the wakeups from the list should be closer to 430. Kernel
> IPI is the top cause with 166 wakeups, 38% of the total. Why the hell says
> 3081 is the total? ------------------------------------------

I once read that powertop does not count all wake ups so the sums may diverge.
However, I've also seen total wakeups be LOWER than if summed up individual
ones so the numbers probably are to be taken with a big grain of salt.

Revision history for this message
robe (r-evert) wrote :

Its possible that the CPU does not sleep properly with one CPU offline ( Just a thought ).

Revision history for this message
Tim Gardner (timg-tpi) wrote :

Ubuntu-2.6.24-8.13

Changed in linux:
assignee: colin-king → nobody
status: Fix Committed → Fix Released
Revision history for this message
Colin Ian King (colin-king) wrote :

>Guillermo Perez wrote:
>
>I forgot to say that this is on a Dell D830. And the battery runtime has improved a lot anyway from gutsy -> hardy >(from 3.5 hours to almost 5 hours!)

With the tickless kernel setting (CONFIG_NO_HZ) being set in 2.4.24.8, one should also see (small) power savings because of less wakeup ticks per second.

Revision history for this message
Guillermo Pérez (bisho) wrote :

With 2.4.24.8 the situation improves again:

  36,5% ( 69,0) <interrupt> : i915@pci:0000:00:02.0
  16,0% ( 30,2) <kernel IPI> : Rescheduling interrupts
  15,9% ( 30,0) Xorg : schedule_timeout (process_timeout)
   5,8% ( 11,0) /usr/bin/condui : schedule_timeout (process_timeout)
   4,3% ( 8,2) <kernel module> : usb_hcd_poll_rh_status (rh_timer_func)
   3,7% ( 7,0) compiz.real : schedule_timeout (process_timeout)
   3,3% ( 6,3) <interrupt> : extra timer interrupt
   2,6% ( 4,9) conduit.real : schedule_timeout (process_timeout)
   1,5% ( 2,8) trackerd : schedule_timeout (process_timeout)
   0,9% ( 1,7) gnome-terminal : schedule_timeout (process_timeout)
   0,8% ( 1,6) xfsbufd : schedule_timeout (process_timeout)
   0,6% ( 1,1) /usr/share/reve : schedule_timeout (process_timeout)
   0,5% ( 1,0) <interrupt> : libata, b43
   0,5% ( 1,0) nm-applet : schedule_timeout (process_timeout)
   0,5% ( 1,0) netspeed_applet : schedule_timeout (process_timeout)
   0,5% ( 1,0) NetworkManager : tg3_open (tg3_timer)
   0,5% ( 1,0) dhcdbd : schedule_timeout (process_timeout)

The total count is still crazy, but looking into the list looks much better, ~160 wakeups per second while idle (with GNOME open with system monitor and other applets at the gnome-panel).

Revision history for this message
Davide Bertola (dade) wrote :

I have the same problem. I use a macbook core duo first generation.
The kernel 2.6.24-8.14 uses a lot of power,
powertop shows " <kernel IPI> : Rescheduling interrupts " doing ~300/500 wakeups on idle.
When not idle it goes up to 6000 - 10000 wakeups

There is also a usb device 4-1 that makes 100 wakeups without reason
(the gutsy kernel does not have these problems)

Just tell me what I need to produce, it won't be a problem.

Revision history for this message
Bruce Cowan (bruce89-deactivatedaccount) wrote :

This is still/is again a problem with 2.6.24-11.17. The system is essentially idle, but I get 400-600 wakeups per second thanks to "Rescheduling interrupts".

See http://lkml.org/lkml/2008/1/21/339 for further analysis.

Revision history for this message
Cleber Santz (clebersantz) wrote :

Same here

using :
Intel core 2 duo T5200
Linux clbr 2.6.24-11-generic #1 SMP Fri Feb 29 22:08:31 UTC 2008 i686 GNU/Linux

     PowerTOP version 1.9 (C) 2007 Intel Corporation

Cn Avg residency P-states (frequencies)
C0 (cpu running) (12.4%) 1.60 Ghz 1.7%
C1 0.0ms ( 0.0%) 1333 Mhz 0.0%
C2 0.4ms ( 1.1%) 1067 Mhz 0.0%
C3 1.5ms (86.6%) 800 Mhz 98.3%

Wakeups-from-idle per second : 609.8 interval: 5.0s
no ACPI power usage estimate available

Top causes for wakeups:
  48.5% (448.8) <kernel IPI> : Rescheduling interrupts
  28.3% (261.6) <interrupt> : PS/2 keyboard/mouse/touchpad
   4.8% ( 44.0) <interrupt> : HDA Intel
   4.2% ( 38.8) <interrupt> : extra timer interrupt
   3.3% ( 30.8) <interrupt> : iwl3945
   1.8% ( 16.8) totem : schedule_timeout (process_timeout)

Revision history for this message
Cleber Santz (clebersantz) wrote :

I have only running at this time firefox with one Tab and totem playing some MP4 audiobook`s

Changed in linux:
status: Fix Released → New
Changed in linux:
assignee: nobody → ubuntu-kernel-team
status: New → Triaged
Changed in linux:
assignee: ubuntu-kernel-team → colin-king
status: Triaged → In Progress
Changed in linux:
assignee: colin-king → abogani
Changed in linux:
assignee: abogani → colin-king
24 comments hidden view all 104 comments
Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq,

A couple of things:

1. can you please attach the following information so that we can look at this problem for your machine and configuration in more detail:

* uname -a > uname-a.log
* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

2. Also, can you boot your machine into single user mode by editing the boot line in grub - press e to edit at boot time in grub - and change

kernel /boot/vmlinuz-2.6.24-15-generic root=UUID=..... ro quiet splash

to

kernel /boot/vmlinuz-2.6.24-15-generic root=UUID=..... ro S

and at the recovery menu select 'drop to root shell prompt'

and run powertop in single user mode to see if the system behaves differently than in multi-user mode when idle and let us know the results.

thank you.

Revision history for this message
Barteq (barteqpl) wrote :

Hi Colin. Thanks for reply. Here you are things you've asked me about.

There was a little glitch on single user mode - there are no C or P states in powertop...

Please also notice that these 50k wakeups occurs randomly, usually after some time.

Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq,

Thanks for the information - after my initial look, I suggest a couple more bits of information to help me corner this:

Can you again reboot into single user mode and do the following:

cat /proc/interrupts; sleep 60; cat /proc/interrupts >> interrupts.log

and repeat this about 5 times so I get an idea of interrupt activity on your system and of course attach the log file into this bug report.

Many thanks

Colin.

Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq

IRQ 0 is really being hit excessively by something not behaving correctly. Can you try adding the boot parameter acpi=noirq and seeing if this helps reduce the excessive IRQ 0 interrupts?

Thanks. Colin.

Revision history for this message
Colin Ian King (colin-king) wrote :

Barteq, ..just to add, here are the deltas of IRQ's on your system from the last set of /proc/interrupts capture from your data.. showing a worryingly high IRQ 0 hit.

              CPU0 CPU1
  0: 23573 23612 IO-APIC-edge timer
  1: 558 562 IO-APIC-edge i8042
  8: 0 0 IO-APIC-edge rtc
  9: 94 95 IO-APIC-fasteoi acpi
 12: 0 0 IO-APIC-edge i8042
 14: 0 0 IO-APIC-edge libata
 15: 0 0 IO-APIC-edge libata
 16: 0 0 IO-APIC-fasteoi uhci_hcd:usb3
 17: 0 0 IO-APIC-fasteoi uhci_hcd:usb1
 18: 0 0 IO-APIC-fasteoi uhci_hcd:usb2
 19: 0 0 IO-APIC-fasteoi ehci_hcd:usb6
 22: 0 0 IO-APIC-fasteoi uhci_hcd:usb5
 23: 0 2 IO-APIC-fasteoi ehci_hcd:usb7
217: 2090 2047 PCI-MSI-edge ahci
NMI: 0 0 Non-maskable interrupts
LOC: 13271 12598 Local timer interrupts
RES: 12201 11276 Rescheduling interrupts
CAL: 14 6 function call interrupts
TLB: 65 61 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
MIS: 0

and lspci shows that the following devices share IRQ 0:

00:01.0 PCI bridge [0604]: Intel Corporation Mobile PM965/GM965/GL960 PCI Express Root Port [8086:2a01] (rev 0c) (prog-if 00 [Normal decode])
00:1b.0 Audio device [0403]: Intel Corporation 82801H (ICH8 Family) HD Audio Controller [8086:284b] (rev 03)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 [8086:283f] (rev 03) (prog-if 00 [Normal decode])
00:1c.1 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 [8086:2841] (rev 03) (prog-if 00 [Normal decode])
00:1c.2 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 3 [8086:2843] (rev 03) (prog-if 00 [Normal decode])
00:1c.3 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 [8086:2845] (rev 03) (prog-if 00 [Normal decode])
00:1c.4 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 [8086:2847] (rev 03) (prog-if 00 [Normal decode])
01:00.0 VGA compatible controller [0300]: nVidia Corporation Quadro NVS 140M [10de:0429] (rev a1) (prog-if 00 [VGA controller])
03:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 4965 AG or AGN Network Connection [8086:4230] (rev 61)

..one of this may be the culprit. But first I'd like to see of acpi=noirq helps first.

Thanks, Colin.

Revision history for this message
Barteq (barteqpl) wrote :

Hi colin.
After adding acpi=noirq results are quite suprising. 1 digit wakeups from idle in powertop is very good result, but there is no way to get into Xorg with this settings. Nvidia drivers shows strange errors.
Please also look on dmesg. There is some kernel debug info..

Thanks,
Barteq

Revision history for this message
Barteq (barteqpl) wrote :

Hi Colin.
After adding acpi=noirq results are quite suprising. 1 digit wakeups from idle in powertop is very good result, but there is no way to get into Xorg with this settings. Nvidia drivers shows strange errors.
Please also look on dmesg. There is some kernel debug info..

Thanks,
Barteq

Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq,

It appears that there may be some ACPI problems here with the IRQ setup - booting with acpi=noirq definitely shows a far quieter system with only a few hundred IRQs occurring over the 10 second sample periods. I suggest now fulling booting your system using "acpi=noirq" and see if this improves the powertop interrupt and wakeup events.

We may need to start looking at the ACPI DSTD - c.f. https://wiki.ubuntu.com/DebuggingACPI.

In these cases one generally needs to dump the DSTD, decompile it and see if it's incorrect. Then fix it, recompile it and load it at boot time.

Let me know what the fully booted system's activity looks like using powertop and the acpi=noirq boot options.

Colin

Revision history for this message
Colin Ian King (colin-king) wrote :

Also, it may be worth consulting the following page for BIOS upgrades:

http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-67989#changes

Revision history for this message
Barteq (barteqpl) wrote :

Hi Colin.

BIOS upgrade was the very first thing I've made just after buing new machine, so I've got a newest one.

As was said - I'm not able to fully boot up computer. Xorg is crashing with strange nvidia driver errors. With nv or vesa driver screen is full of random green bars from top up to the center, and then there is 1/4 of screen filled of actual image, but don't have to say that it is unusable at all.
Without Xorg running on nvidia binary drivers i've got no ability to check power results. This drivers have 'powermizer' feature to downclock GPU, so on opensource drivers power consumption is even 5 or 6 Watts more..

Of course I'll try once more..

ACPI debugging info will come in next few minutes.

Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq,

This hardware issue is most curious. I am afraid I am going to ask some more questions. I've seen reports of Ubuntu working fine on ThinkPad T61's before, so I am a little perplexed by this not working well on your laptop - perhaps it's the specific model of T61 that is causing problems.

What is the exact model of your machine?

Perhaps you may find users of that same machine on the Ubuntu forums where their machine is working correctly just to help verify the issue is perhaps specific to your machine and not generally T61's of your variety.

It seems very strange to me that with ACPI irq configuration you see so many interrupts, yet without ACPI irq configuration the machine is behaves more "normally" in a single user state. I am afraid I am not an expert on specific ACPI issues. Is it possible to revert back to the original BIOS settings to see if this makes the laptop work correctly in the normal ACPI boot mode? Was there any specific reason for you upgrading the BIOS when you first got the laptop?

Do you see the same problem on older kernels or installations of Ubuntu?

Colin

Revision history for this message
Barteq (barteqpl) wrote :

It's quite new machine, manufactured in february this year. Maybe there were some minor changes in architeture without noticing anynone..
My model is 7664-RWU - (its specs - http://www-307.ibm.com/pc/support/site.wss/quickPath.do?quickPathEntry=7664RWU + 2g ram more)

Bios update was made due to.. big amount of wakeups, and fan working on and on with ac plugged in.. This fan issue still occurs but after 3 weeks i'm used to it..

Let's try to reset bios settings, but AFAIK the only changed thing was support for VT couse I'm using it quite often, and it is not enabled by default (strange.. )

I didn't installed any older ubuntus here.. No idea about that, but will try to set up 7.10 on external usb disk for some testing..

Little bit similar hardware (but older) - http://www.klabs.be/~fpiat/linux/debian/Etch_on_Thinkpad_T61.html with 1M interrupts on one core vs 7M I have right now..

Revision history for this message
Barteq (barteqpl) wrote :

Some further investigation. Resetting bios to it's defaults was a good idea.. Don't know what exacly was changed but it improved many things..
Amount of wakeups without any special kernel parameters is quite low after that. I've even managed to get under 16W boundary and made 15,4W with full working desktop, wifi on but hdd off (hdparm -B 120 /dev/sdb - which is not good, but why not for testing).

Here are some results - with acpi=noirq and without it (after bios settings reset).

Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :
Revision history for this message
Barteq (barteqpl) wrote :

Attached Interrupts on desktop full on with wifi active..
After some minutes there is still 1,5k interrupts without a reason, so problem is definitely not gone.
Tomorrow will make some experiments with 7.10.. installer crashes on my laptop with blank screen after xorg starts.. Have to download alternate one.

Revision history for this message
Barteq (barteqpl) wrote :

Powertop on normal laptop session with xorg on and so

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi,

It's good to hear that things are more promising now.

The overall IRQ delta over 10 seconds (from your last results of interrupts_normal_xorg.log) look more promising - IRQ 0 is definitely not being saturated now, but it does appear that you system is busy somewhere causing ~7K rescheduling interrupts per second.

           CPU0 CPU1
  0: 2597 2582 IO-APIC-edge timer
  1: 0 0 IO-APIC-edge i8042
  8: 4 3 IO-APIC-edge rtc
  9: 18 16 IO-APIC-fasteoi acpi
 12: 0 0 IO-APIC-edge i8042
 14: 0 0 IO-APIC-edge libata
 15: 0 0 IO-APIC-edge libata
 16: 35 25 IO-APIC-fasteoi uhci_hcd:usb5, nvidia
 17: 0 0 IO-APIC-fasteoi uhci_hcd:usb1
 18: 0 0 IO-APIC-fasteoi uhci_hcd:usb2
 19: 4 4 IO-APIC-fasteoi ehci_hcd:usb3
 21: 1 1 IO-APIC-fasteoi ehci_hcd:usb4
 22: 0 0 IO-APIC-fasteoi uhci_hcd:usb6, ohci1394, HDA Intel
 23: 0 0 IO-APIC-fasteoi uhci_hcd:usb7
215: 384 410 PCI-MSI-edge iwl4965
216: 14 16 PCI-MSI-edge eth0
217: 3 2 PCI-MSI-edge ahci
NMI: 0 0 Non-maskable interrupts
LOC: 2110 1800 Local timer interrupts
RES: 68628 75310 Rescheduling interrupts
CAL: 0 0 function call interrupts
TLB: 0 6 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
MIS: 0

Running vmstat 1 for ~15-30 seconds will give an idea of how busy the system is - primarily the context switches/second will be interesting to see.

Colin

Revision history for this message
Barteq (barteqpl) wrote :

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 0 0 2320476 38056 366092 0 0 0 0 149 2951 2 2 95 0
 0 0 0 2320272 38056 366092 0 0 0 0 188 2889 3 4 92 0
 0 0 0 2320512 38056 366092 0 0 0 0 175 3198 4 3 93 0
 0 0 0 2320656 38056 366092 0 0 0 0 177 3289 3 2 95 0
 0 0 0 2320376 38056 366092 0 0 0 4 119 3260 5 3 92 0
 1 0 0 2320536 38056 366092 0 0 0 0 122 3304 3 3 94 0
 0 0 0 2320664 38056 366092 0 0 0 0 128 3288 4 2 94 0
 0 0 0 2320484 38056 366092 0 0 0 0 173 3457 5 2 92 0
 0 0 0 2320608 38056 366092 0 0 0 0 132 3301 3 3 94 0
 0 0 0 2320344 38056 366092 0 0 0 0 126 3159 5 3 92 0
 0 0 0 2320472 38056 366092 0 0 0 0 118 3267 3 3 94 0
 0 0 0 2320608 38056 366092 0 0 0 0 141 3294 3 3 94 0
 0 0 0 2320320 38056 366092 0 0 0 12 138 3074 3 3 94 0
 0 0 0 2320628 38056 366092 0 0 0 0 157 3468 4 2 93 0
 0 0 0 2320812 38056 366092 0 0 0 0 139 3180 3 3 94 0
 0 0 0 2320540 38056 366092 0 0 0 0 126 3223 3 3 94 0
 0 0 0 2320628 38056 366092 0 0 0 0 122 3171 3 2 94 0
 0 0 0 2320348 38056 366092 0 0 0 0 159 3263 4 2 94 0
 0 0 0 2320516 38056 366092 0 0 0 0 133 3237 3 2 95 0
 0 0 0 2320724 38056 366092 0 0 0 0 170 3334 4 4 92 0
 0 0 0 2320652 38056 366092 0 0 0 0 187 3330 5 1 94 0
 0 0 0 2320748 38056 366092 0 0 0 0 142 3328 4 4 92 0
 0 0 0 2320856 38056 366092 0 0 0 0 133 3269 3 3 94 0

About 3k context switches.. Quite much.. I've asked my two friends to show their vmstat and voila.. the have about 300~500 CS/s.

http://wklejto.pl/2171
http://phpfi.com/308388

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi again,

With that many context switched it may be worth running top in batch mode to see which process(es) are causing this much activity:

top -b -d 10 -n 6

Colin

Revision history for this message
Barteq (barteqpl) wrote :
Download full text (3.6 KiB)

It is Opera.. With no pages opened it looks like this;

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 0 0 1088276 93032 1545680 0 0 0 0 79 129 3 0 97 0
 0 0 0 1088168 93032 1545680 0 0 0 0 104 305 4 1 95 0
 0 0 0 1088344 93032 1545680 0 0 0 0 87 130 3 1 96 0
 0 0 0 1088544 93032 1545680 0 0 0 0 75 119 3 0 97 0
 0 0 0 1088232 93032 1545680 0 0 0 0 112 176 2 0 97 0
 0 0 0 1088488 93032 1545680 0 0 0 0 72 125 3 0 97 0
 0 0 0 1088036 93032 1545680 0 0 0 0 74 131 2 0 97 0
 0 0 0 1088460 93032 1545680 0 0 0 In attachment - first two top results - no pages opened0
second two - this page opened.. 0 91 319 3 0 97 0
 0 0 0 1088460 93032 1545680 0 0 0 0 128 254 6 1 93 0
 0 0 0 1088124 93032 1545680 0 0 0 0 75 127 3 0 96 0
 0 0 0 1088340 93032 1545680 0 0 0 0 81 130 3 0 97 0
 0 0 0 1088116 93032 1545680 0 0 0 0 85 153 3 0 97 0
 0 0 0 1088204 93032 1545680 0 0 0 0 82 122 4 0 96 0
 0 0 0 1088064 93032 1545680 0 0 0 0 98 311 4 1 96 0
 0 0 0 1088180 93032 1545680 0 0 0 0 78 114 3 0 97 0
 0 0 0 1088364 93032 1545680 0 0 0 0 73 122 3 0 97 0
 0 0 0 1088084 93032 1545680 0 0 0 0 74 129 3 0 97 0
 0 0 0 1088228 93032 1545680 0 0 0 0 71 119 2 0 98 0
 0 0 0 1088060 93032 1545680 0 0 0 0 80 137 3 1 97 0
 0 0 0 1088312 93032 1545680 0 0 0 0 91 349 4 0 96 0
 0 0 0 1088372 93032 1545680 0 0 0 0 76 143 2 0 97 0
 0 0 0 1088100 93032 1545680 0 0 0 0 72 128 3 1 96 0
 0 0 0 1088316 93032 1545680 0 0 0 0 86 140 3 1 96 0
 0 0 0 1088020 93032 1545680 0 0 0 0 79 129 3 0 97 0
 0 0 0 1088244 93032 1545680 0 0 0 0 77 140 3 1 97 0

With this page opened:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 0 0 1092884 93112 1541572 0 0 23 125 200 1202 5 6 87 2
 0 0 0 1092436 93112 1541572 0 0 0 0 134 4304 2 4 94 0
 0 0 0 1092360 93112 1541572 0 0 0 0 113 4534 3 4 93 0
 1 0 0 1092176 93112 1541572 0 0 0 0 136 4364 7 5 88 0
 0 0 0 1092328 93112 1541572 0 0 0 0 105 4532 3 2 95 0
 0 0 0 1092380 93112 1541572 0 0 0 0 109 4456 2 5 93 0
 0 0 0 1092148 93112 1541572 0 0 0 0 97 4510 2 3 95 0
 1 0 0 1092368 93112 1541572 0 0 0 0 183 3997 5 3 92 0
 2 0 0 1092428 93112 1541572 0 0 0 0 122 3681 4 4 92 ...

Read more...

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Barteq,

I suspected it would be some kind of killer app(!).

I think we can conclusively say that this is probably wrapped up now with respect to a kernel issue. I suggest if you think Opera is misbehaving by sucking all your power by over active wakeup calls then you put a bug report against it and possibly reference this bug for some history.

Thanks for all you input so that we could corner this one. If you are OK with this, I'd like to mark this bug as "won't fix" as there is no further kernel issue to resolve.

Regards, Colin

Revision history for this message
Brandon Dubois (tu13es) wrote :

For what it's worth, I upgraded to the 2.6.24-15 kernel that was released in the Hardy Beta and I'm still getting about 50% of my wakeups from <kernel IPI> : Rescheduling interrupts.

I can provide logs etc. if it would help. I've been trying to follow the recent exchange between barteq and colin but got a bit lost.

Thanks,

Brandon

Revision history for this message
Colin Ian King (colin-king) wrote :

Brandon,

Just to re-iterate things:

1. Rescheduling Interrupts are not necessarily a problem - they are just show that the scheduler is working hard to distribute load across multiple cores to keep the processors evenly load and keep overall CPU power down. The main issue is if are they excessive (how many per second) and if so, what is causing them.
2. To find out what is causing them, it is good to look at the interrupt activity over a period of time, by looking at /proc/interrupts. For example, capturing /proc/interrupts in a log, sleeping for 10 seconds and capturing /proc/interrupts again. Take the 1st results from the 2nd and you will then see which interrupts may be causing the wakeups that cause the rescheduling interrupts activity. This may show a mis-configured BIOS (very unlikely, but did happen in Barteq's case).
3. The next thing to look at is user space program activity. Some apps do excessing sleep/wakeups on many threads which cause the scheduler to try to load balance, which shows up a "rescheduling interrupts". A good way to see system activity is use vmstat and look for context switch (cs) activity. If this is excessively high, then one needs to find out which application(s) are contributing to this. The top command is a quick way of seeing which processes are using a lot of CPU and possibly causing this activity.

Hope this helps.

Revision history for this message
Yves Glodt (yglodt) wrote :

Hello,

I use 2.6.24 on my T2300 dualcore laptop, with kubuntu hardy (without opera), and also 2.6.24 on a PIII-3GHz with debian unstable. Note that I do see "Rescheduling Interrupts" as well on my debian, but less often and less intensive. Isn't this a pointer to that it's however a kernel issue?

In gutsy I did not see that many wakeups neither on my laptop, I do not remember having seen "Rescheduling Interrupts" at all in powertop... Also, I dist-upgraded from gutsy to hardy, so my installed software is basically the same.

regards and thanks for your detailed look into this

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Yves,

The rescheduling interrupts information was added into the kernel in the main kernel in July 2007 ( http://lkml.org/lkml/2007/7/30/437 ) and I believe it was not in the Gutsy kernel tree - so this activity was not visible in user space and hence PowerTop would not be able to report it in Gutsy.

Also, even if the software components are the same, the software has changed due to upgrades (e.g. some software may now have more/less threads running), so like for like comparison is difficult.

Colin.

Revision history for this message
Yves Glodt (yglodt) wrote :

Hi Colin, I did not know this, thanks for the information. I will try to track down which program causes the wakeups

Revision history for this message
Brandon Dubois (tu13es) wrote :

Colin,

Thanks for your explanation. I ran cat /proc/interrupts >> interrupts.log;sleep 60;cat /proc/interrupts >> interrupts.log a few times. The system was idle with nothing but a few terminals open.

Let me know if this log helps.

Also, I tried running vmstat but am not exactly sure what I am looking at or if I'm running it right.

Revision history for this message
Colin Ian King (colin-king) wrote :

Brandon,

attached is my awk script to parse the interrupts.log file, to be run as follows:

awk -f dualcore.awk < interrupts.log

it calculates the number of interrupts a second you are getting over a 10 sample second period. From what I can see, you system looks fine and is not suffering from the problem described in this bug. Your first sample of 10 seconds looks as follows:

 IRQ CPU0 IRQ/s CPU1 IRQ/s Description
  0: 330.1 0.0 IO-APIC-edge timer
  1: 0.1 0.0 IO-APIC-edge i8042
  5: 0.0 0.0 IO-APIC-edge serial
  8: 0.0 0.0 IO-APIC-edge rtc
  9: 9.7 0.0 IO-APIC-fasteoi acpi
 12: 0.0 0.0 IO-APIC-edge i8042
 15: 0.0 0.0 IO-APIC-edge libata
 20: 176.2 0.0 IO-APIC-fasteoi i915@pci:0000:00:02.0
 21: 0.0 0.0 IO-APIC-fasteoi ehci_hcd:usb1
 22: 0.0 0.0 IO-APIC-fasteoi ohci1394, HDA Intel
 23: 0.0 0.0 IO-APIC-fasteoi sdhci:slot0
217: 3.0 0.0 PCI-MSI-edge eth0
218: 32.1 0.0 PCI-MSI-edge iwl3945
219: 2.0 0.0 PCI-MSI-edge ahci
NMI: 0.0 0.0 Non-maskable interrupts
LOC: 33.5 197.8 Local timer interrupts
RES: 271.6 290.2 Rescheduling interrupts
CAL: 0.0 0.1 function call interrupts
TLB: 0.1 0.3 TLB shootdowns
TRM: 0.0 0.0 Thermal event interrupts
SPU: 0.0 0.0 Spurious interrupts
ERR: 0.0 0.0
MIS: 0.0 0.0

This is fine. No need to worry.

Revision history for this message
Colin Ian King (colin-king) wrote :

Marking this as won't fix as the more significant outstanding issues were resolved, namely:

1. Some issues causes by incorrect BIOS settings (see the case of Barteq's issues)
2. Opera causes a lot of wakeups that causes a lot of legitimate rescheduling interrupts - this is not a scheduler problem but an application issue
3. Given details on how to identify which rogue apps are causing a lot of wakeups (debug using top, vmstat, etc)
4. Any improvements upstream are significant patches back to 2.6.24, and the risk of upsetting the scheduler at this point in the release cycle is too high. Some less intrusive patches have been tried by the power consumption increases when reducing rescheduling interrupts.
5. Rescheduling interrupts shown to be a manifestation of other wakeup or interrupt issues (e.g. bad ACPI DSTD, bad apps, etc) and not necessarily a problem with the scheduler.
6. Hardy+1 will pick up a lot of the more subtle scheduler improvements in 2.6.25+

Changed in linux:
status: In Progress → Won't Fix
Revision history for this message
Colin Ian King (colin-king) wrote :

Now added a Wiki page to describe troubleshooting of this problem.

https://help.ubuntu.com/community/ReschedulingInterrupts

Displaying first 40 and last 40 comments. View all 104 comments or add a comment.