T400 laptop. Freezes or mouse stutters. Kworker process using too much CPU.

Bug #779753 reported by Wiggy on 2011-05-09
This bug affects 22 people
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

This occurred with a clean install of Ubuntu 10.10; reverting to kernel 2.6.34-02063407 resolved the issue. Now that I've upgraded to 11.04, the problem has recurred.

Laptop is a Lenovo T400 with the latest BIOS update. Periodically the system will freeze for approx 10-15 seconds; then control can be regained, but mouse movement is jerky for another 10-15 seconds. Multiple kworker processes use approx 50-90% CPU when viewed using top. Approx 30 seconds after the start of the issue, control is back to normal.

Issue seems to occur randomly, sometimes hours after boot, sometimes minutes.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-2.6.38-8-generic-pae 2.6.38-8.42
ProcVersionSignature: Ubuntu 2.6.38-8.42-generic-pae 2.6.38.2
Uname: Linux 2.6.38-8-generic-pae i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: wiggy 1479 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfc020000 irq 48'
   Mixer name : 'Conexant CX20561 (Hermosa)'
   Components : 'HDA:14f15051,17aa211c,00100000'
   Controls : 16
   Simple ctrls : 8
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 7VHT16WW-1.06'
   Mixer name : 'ThinkPad EC 7VHT16WW-1.06'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Mon May 9 08:05:42 2011
HibernationDevice: RESUME=UUID=b23044b8-b130-476a-a926-a3872fe01aaa
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
MachineType: LENOVO 6475FM4
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 LANGUAGE=en_GB:en
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-8-generic-pae root=UUID=756f4157-72df-4327-b193-6739cb49661d ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-2.6.38-8-generic-pae N/A
 linux-backports-modules-2.6.38-8-generic-pae N/A
 linux-firmware 1.52
SourcePackage: linux
UpgradeStatus: Upgraded to natty on 2011-05-07 (1 days ago)
WifiSyslog:

dmi.bios.date: 03/11/2011
dmi.bios.vendor: LENOVO
dmi.bios.version: 7UET92WW (3.22 )
dmi.board.name: 6475FM4
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7UET92WW(3.22):bd03/11/2011:svnLENOVO:pn6475FM4:pvrThinkPadT400:rvnLENOVO:rn6475FM4:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 6475FM4
dmi.product.version: ThinkPad T400
dmi.sys.vendor: LENOVO

Wiggy (brian-wigginshouse) wrote :

I forgot to mention that when this occurred in 10.10 the process was kslowd, not kworker. It happened on all kernel versions after 2.6.34.

Vincent Vatelot (vvatelot) wrote :

Hello, I have exactly the same behaviour... (Thinkpad T400) When I tried to install 10.10 I had freezes, so I decided to roll back to 10.04 (no problem). But this weekend I upgraded to 11.04, and the system sometimes freezes and then everything comes back to normal. It seems that kworker uses a lot of CPU during the freeze...

What information can I provide to help you with this bug?

Brad Figg (brad-figg) on 2011-05-10
Changed in linux (Ubuntu):
status: New → Confirmed
Matthieu Riou (matthieu-riou) wrote :

Same here on a Thinkpad T500 and exactly the same behavior (including kslowd in 10.10 and kworker on 11.04).

xhanka00 (xhanka00) wrote :

Exactly the same problem here in 11.04 on a ThinkPad T400. It's making the whole system unusable.

Jeffrey Ballagh (jballagh) wrote :

Same here. 11.04 on a ThinkPad T400. Behaves almost exactly like the kslowd issue from maverick. Only way I found to fix the problem in maverick was to freeze to the 2.6.32 kernel. Not an option for Natty.

Kernel hacker I am not, but happy to help however possible to sort this. Most annoying aspect is that it's inconsistent, but when activity spikes, it's definitely a kworker thread, 0:0 for me. Worst shortly after booting, after resuming from sleep, and when running from battery.

Seems to be endemic among T400s. Did all my fellow T400 owners have the same issue with kslowd for 2.6.34 to 2.6.36?

Possible duplicates:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/717919
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/746084
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/744154

Jeffrey Ballagh (jballagh) wrote :

Spoke too soon. When left idle during a 60-second powertop run:
----------------------------------------------------------------------------------
Top causes for wakeups:
  32.7% ( 58.5) [ 0] [kernel scheduler] Load balancing tick
  28.8% ( 51.6) [ ] [iwlagn] <interrupt>
  23.1% ( 41.3) [ ] kworker/0:0
----------------------------------------------------------------------------------

Wiggy (brian-wigginshouse) wrote :

11.04 is completely unusable on the T400. I have now reverted back to 10.10 with 2.6.34-02063407-generic kernel. Any newer kernel has the freezing/stuttering problem with kslowd or kworker.

I'm happy to run any tests needed to try to diagnose this fault.

Jeffrey Ballagh (jballagh) wrote :

Believe this to be a duplicate of bug #746084 with a likely fix in upstream kernel - anything 2.6.39rc2 or later.

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/746084/comments/11

No freezing or stuttering with 2.6.39rc4 and based on what I see in the system monitor panel applet, idle processes seem to be consuming less CPU time in general.

Thanks to https://launchpad.net/~ikt for cause and fix from upstream.

Jeffrey Ballagh (jballagh) wrote :

Having my own mixed results with rc4 as well. Similar... but different. More on that in a moment.

There are a number of kernel or driver related bugs with various freeze/hang symptoms (and likely various root causes). Since this bug has T400 in the name I'm removing my earlier duplicate flag. That way my fellow T400 owners have a simple place to track whether the issue we are all experiencing has truly been resolved or not.

Given the number of hang/freeze related bugs I have looked at, I'd like to know if we are all experiencing the same symptoms. Also gives us a way to separate this issue from similar bugs with causes and fixes irrelevant here.

With 2.6.38 - problem manifests intermittently. When it surfaces, every moment of fluid activity is followed by a moment frozen. Seems exacerbated by resuming from sleep or running from battery. Worst in the first 10 minutes after booting/resuming. Found a bug that causes the system to freeze every 30 seconds or so - symptoms here are distinctly different.

With 2.6.39rc4 - So far, I have not experienced the same quick cycling freeze, but in the first few minutes after boot and resume the mouse and keyboard will stop taking input for longer periods, tens of seconds to minutes at a time. However, I have only done a handful of boots at this point.

Wiggy (brian-wigginshouse) wrote :

Jeffrey,

I see the exact same symptoms as yourself. With 2.6.39rc4 a few times it froze completely, I waited about 3 minutes and then rebooted.

I'm unable to use Natty; it's too frustrating. I now have Maverick and Natty installed on separate partitions. As stated earlier, 2.6.34 is the latest kernel that does not cause issues with Maverick.

August Black (august-alien) wrote :

I have an IdeaPad U350 and have the same symptoms.
uname -a
Linux slack 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

It makes the machine virtually useless.

Tor Håkon Haugen (torh) wrote :

While my machine doesn't "freeze" completely, kworker uses 60-100% of one of the CPU cores at all times when running Linux 2.6.38-8-generic on a ThinkPad X61.

Reverting back to Linux 2.6.35-28-generic for now.

olaf (olaf-x100) wrote :

Same problem on Thinkpad R400 @ 11.04. Upgrading to kernel 2.6.39 did not resolve the problem.

Kworker threads (4 of them) suddenly consume all CPU resources. Sometimes the system recovers, sometimes it blocks for > 5 minutes, i.e. until I flip the switch.

This happens on a fresh, out-of-the-box installation. However, the first few reboot / hibernation / suspend cycles do not show the issue; the problem appears after a few hours to days of usage, perhaps in conjunction with inserting a USB device (mobile broadband, anyone?) and / or configuring network / WiFi connections. I will re-install once more and test, recording perf data using the "ridiculous" method Linus suggested:

   perf record -ag sleep 10
   perf report

 (see https://lkml.org/lkml/2011/3/30/836)
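
Since the spikes are short and unpredictable, one option is to poll top and start the recording automatically once a kworker thread gets busy. The following is only a rough sketch and not something run by anyone in this thread: the function names, the THRESHOLD variable and the output file name are made up for illustration, and it assumes procps top with its default column layout (%CPU in column 9, COMMAND in column 12). The perf call is the one quoted above, plus -o to name the output file.

```shell
#!/bin/sh
# Sketch: wait for a kworker CPU spike, then capture a system-wide profile.
# THRESHOLD (percent of one core) is an illustrative name and value.
THRESHOLD=${THRESHOLD:-50}

# Read `top -b` output on stdin and print "<%CPU> <command>" for the busiest
# kworker thread in the LAST sample. Assumes top's default column layout.
busiest_kworker() {
    awk '
        $1 == "PID" { max = -1; name = "" }   # header line: a new sample starts
        ($12 ~ /^kworker/) && ($9 + 0 > max) { max = $9 + 0; name = $12 }
        END { if (name != "") print max, name }
    '
}

# Poll until some kworker thread reaches THRESHOLD, then record for 10 seconds.
wait_and_record() {
    while :; do
        # Two iterations: the first top sample is not an interval measurement.
        set -- $(LC_ALL=C top -b -n 2 -d 1 | busiest_kworker)
        if [ -n "$1" ] && [ "${1%.*}" -ge "$THRESHOLD" ]; then
            echo "spike: $2 at $1% CPU, recording for 10 s" >&2
            sudo perf record -a -g -o "perf-$(date +%Y%m%d-%H%M%S).data" sleep 10
            return
        fi
        sleep 1
    done
}
```

Source the file and call wait_and_record, then inspect the result with perf report -i on the file it wrote. A spike shorter than the polling interval can still be missed, so lowering THRESHOLD may be needed.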

Seems this is completely kernel / driver related, since multiple distros are affected.

olaf (olaf-x100) wrote :

All right, re-installed 11.04 yesterday, standard kernel:

2.6.38-8-generic-pae #42-Ubuntu SMP Mon Apr 11 05:17:09 UTC 2011 i686 i686 i386 GNU/Linux

Just managed to observe the issue for more than 4 seconds and capture it with perf. Seems it's the Intel gfx drivers in my case:

# Overhead Command Shared Object Symbol

  45.29% kworker/1:1 [i915] [k] get_clock
                |
                --- get_clock
                   |
                   |--99.73%-- sclhi
                   | |
                   | |--85.30%-- i2c_outb.clone.1
                   | | try_address.clone.4
                   | | bit_doAddress.clone.5
                   | | bit_xfer
                   | | intel_i2c_quirk_xfer
                   | | gmbus_xfer
                   | | i2c_transfer
                   | | drm_class_suspend
                   | | drm_class_suspend
                   | | intel_hdmi_detect
                   | | drm_fb_helper_initial_config
                   | | process_one_work
                   | | |
                   | | |--69.63%-- worker_thread
                   | | | kthread
                   | | | kernel_thread_helper
                   | | |
                   | | --30.37%-- process_scheduled_works
                   | | worker_thread
                   | | kthread
                   | | kernel_thread_helper
                   | |
                   | --14.70%-- i2c_stop
                   | |
                   | |--78.80%-- try_address.clone.4
                   | | bit_doAddress.clone.5
                   | | bit_xfer
                   | | intel_i2c_quirk_xfer
                   | | gmbus_xfer
                   | | i2c_transfer
                   | | drm_class_suspend
                   | | drm_class_suspend
                   | | intel_hdmi_detect

....

Attaching data captured with perf. Is this a known issue? Can I provide additional information? Please advise.

chen wei (chenwei-sh) wrote :

Same problem with my R400. itop shows that the i915 and i804 sometimes have interrupt storms, at times 20000 interrupts per second. Kworker is consuming 90% of the CPU.
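
For anyone without itop, roughly the same numbers can be derived from two snapshots of /proc/interrupts. This is a minimal sketch under stated assumptions rather than a tested diagnostic: the function names are invented here, the per-CPU counters are simply summed, and each line is labelled with its first and last field only, so shared IRQs show just the last device name.

```shell
#!/bin/sh
# Sketch: interrupts per second per IRQ line, from two /proc/interrupts snapshots.
# Usage: irq_rates BEFORE_FILE AFTER_FILE SECONDS
irq_rates() {
    awk -v secs="$3" '
        FNR == 1 { ncpu = NF; next }          # header row: one column per CPU
        {
            total = 0                         # sum the per-CPU counters
            for (i = 2; i <= ncpu + 1 && i <= NF; i++) total += $i
            name = $1 " " $NF                 # e.g. "48: i915"
        }
        NR == FNR { before[name] = total; next }   # first file: remember counts
        (name in before) && total > before[name] {
            printf "%d %s\n", (total - before[name]) / secs, name
        }
    ' "$1" "$2" | sort -rn
}

# Take two snapshots SECONDS apart (default 10) and print the busiest lines.
sample_irqs() {
    secs=${1:-10}
    a=$(mktemp) && b=$(mktemp) || return 1
    cat /proc/interrupts > "$a"
    sleep "$secs"
    cat /proc/interrupts > "$b"
    irq_rates "$a" "$b" "$secs" | head -n 5
    rm -f "$a" "$b"
}
```

Running sample_irqs 10 while the laptop is stuttering should show whether one IRQ line is firing thousands of times per second.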

olaf (olaf-x100) wrote :

@chen wei I believe our issues are in fact addressed by #746084 (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/746084).

@all it would be great if more people affected by this could provide perf records (sudo perf record -ag sleep 10) to determine the source; this issue might indeed be a duplicate.

Perhaps this is an issue with drm? It's a rather long shot but I'll try building the libdrm from source (http://intellinuxgraphics.org/download.html) and see if this helps. Will report back with results.

cmcginty (casey-mcginty) wrote :

Here is my perf record. Running on a T400. System hangs every 3 seconds for about 0.5 to 1 second, continuously. I can usually reproduce this bug when docking/redocking or on suspend/resume.

Wiggy (brian-wigginshouse) wrote :

Here is my perf data taken during high kworker CPU.

Wiggy (brian-wigginshouse) wrote :

File did not attach to my last post.

Wiggy (brian-wigginshouse) wrote :

OK, third time lucky, I've given myself rights to read the file this time ;)

olaf (olaf-x100) wrote :

So I'm back with the results of updating libdrm, since it's one of my suspects.
So far I've built & installed the latest version of libdrm (Release Q1 2011). The system is much more stable after that. I've only had one kworker spike of about 5 seconds since, which in my case is an improvement.

Here is what I did:

1.) Download & unpack the latest *stable* libdrm release:

http://dri.freedesktop.org/libdrm/libdrm-2.4.25.tar.bz2

2.) Install libraries / headers required to compile libdrm

sudo apt-get install libpthread-stubs0-dev

3.) In a terminal within the unpacked libdrm directory, run
./configure
make
sudo make install

4.) reboot.

Luckily I was able to capture that one last spike, too. Again it's the Intel driver, but a different call than before.

# Events: 8K cycles
#
# Overhead Command Shared Object Symbol
# ........ ............... ............................... ...........................................
#
    26.11% kworker/0:1 [i915] [k] i915_gem_execbuffer_relocate_slow
                |
                --- i915_gem_execbuffer_relocate_slow
                   |
                   |--99.79%-- sclhi
                   | |
                   | |--84.92%-- i2c_bit_add_numbered_bus
                   | | i2c_bit_add_numbered_bus
                   | | i2c_bit_add_numbered_bus
                   | | i2c_bit_add_numbered_bus
                   | | i915_gem_execbuffer_relocate_slow
                   | | i915_gem_execbuffer_relocate_slow
                   | | i2c_transfer
                   | | drm_class_suspend
                   | | drm_class_suspend
                   | | i915_gem_execbuffer_relocate_slow
                   | | drm_fb_helper_initial_config
                   | | process_one_work
                   | | |
                   | | |--78.40%-- worker_thread
                   | | | kthread
                   | | | kernel_thread_helper
                   | | |
                   | | --21.60%-- process_scheduled_works
                   | | worker_thread
                   | | kthread
                   | | kernel_thread_helper
                   | |
                   | --15.08%-- i2c_stop
                   | i2c_bit_add_numbered_bus
                   | |
                   | |--83.99%-- i2c_bit_add_numbered_bus
                   | | i2c_bit_add_numbered_bus
                   | | i915_gem_execbuffer_relocate_slow
                   | | i915_gem_execbuffer_relocate_slow
...


olaf (olaf-x100) wrote :

And here's the perfdata for my last post

Poul Møller Hansen (pmhansen) wrote :

I'm facing the same issue, also on a T400. perf.data is attached.
I'm not sure what triggers the problem. Once it started a few minutes after system start, when I had only been using Firefox.
In every case it seems to be related to I2C. Does a laptop even have an I2C bus?

@olaf can you still confirm that compiling the newest drm source is a usable workaround?
I'm not too fond of having a lot of software that is not handled by the package management.

olaf (olaf-x100) wrote :

@Poul Møller Hansen

Hi Poul,
It seemed like an improvement at first, but the issue has recurred over the last few days, so my previous attempt to manually install the 2011 Q1 release of libdrm does *not* resolve / improve the issue. The perf data still shows a problem with kworker / intel....

Poul Møller Hansen (pmhansen) wrote :

I thought so. I would imagine that it's kernel related.
So far this workaround has worked here
http://souriguha.wordpress.com/2011/03/08/how-to-solve-problem-with-thinkpadkslowd-kworker-on-linux-kernel-2-35-2-36/

At least my T400 is usable now :)

chen wei (chenwei-sh) wrote :

Poul,

Thanks for sharing. Now my R400 is usable again!!! :)
BTW, I'm not sure that drm is the root cause, since one of my tricks to get the "unusable" laptop working again is to switch off the WLAN, unplug the USB mouse, etc. As I reported earlier, the interrupt rates from graphics, Ethernet and other peripherals are extremely high when my R400 is unusable. I wonder why these two symptoms are correlated.

xhanka00 (xhanka00) wrote :

The workaround from Poul's link solved the problem, and 2.6.38-8-generic runs flawlessly on the Lenovo T400. Thanks.

olaf (olaf-x100) wrote :

I'm afraid I have to report the issue persists with Ubuntu 11.04 and kernel 3.0.3-030003-generic. To make matters worse, Poul's workaround has less effect with this kernel, i.e. I am experiencing random peaks of kworker load making the system almost unusable even with the workaround in place...

Ignacio Huerta (iox8) wrote :

I'm afraid this issue is still alive in 11.10 Beta 1, kernel 3.0.0-9-generic 64 bits on a T400.

Before finding this issue I had posted this other bug: https://bugs.launchpad.net/ubuntu/+bug/840949. I just marked it as a duplicate of this bug.

I'll try the workaround and a newer kernel and keep you posted.

Ignacio Huerta (iox8) wrote :

All right, after updating the kernel to 3.0.0-10 (from the updates) and applying Poul's workaround it works much better. Thank you very much.

Let's hope the kernel people find a permanent solution for this :)

Gard Spreemann (gspreemann) wrote :

In case anybody with the same hardware as me comes across this: I'm seeing the same symptoms on a Dell E6410 powered by an nVidia card using the nouveau driver. The workaround described in comment #26 also fixes the problem for me.

David Young (dove-young) wrote :

There are some kworker processes which consume tons of CPU time at each wakeup. It was very bad on 10.10 but has become not so bad in 11.04 and 11.10.

Joseph Salisbury (jsalisbury) wrote :

There is no single cause for kworker issues:
https://lkml.org/lkml/2011/3/30/836

Getting additional information:
https://lkml.org/lkml/2011/3/31/68

Would it be possible for you to test the latest upstream kernel? It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . If possible, please test the latest v3.2-rcN kernel (Not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag(Only that one tag, please leave the others). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed by the mainline kernel, please add the following tag 'kernel-fixed-upstream-KERNEL-VERSION'. For example, if kernel version 3.2-rc1 fixed the issue, the tag would be: 'kernel-fixed-upstream-v3.2-rc1'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Thanks in advance.

tags: added: needs-upstream-testing

Occurring here on a brand new Dell Latitude (E5420). Making it unusable -- should be a top priority bug.

revdjenk (revdjenk) wrote :

T400 with the kworker freeze, especially noticeable after long period of input inactivity (!?) or returning from suspend (activated by lid closure).
running kernel 3.0.0-12-generic #20-Ubuntu SMP (LinuxMint 12)

followed suggested fix in post #26.

Seems to correct it; at least, I shut the lid to suspend and opened it to find the cursor free-flowing. Will continue to watch ...

I can also report that I have an identical setup on a T61 with no kworker slowdown troubles at all.

David Young (dove-young) wrote :

The kworker process problem is still there when returning from suspend on my T400, even with kernel 3.0.0.16.

Daniel Wheeler (daniel-d-man) wrote :

I believe I have found a fix for this! This worked perfectly for me.
    Reboot your system with the new kernel.
    Open a shell and become root.
    Type: echo N > /sys/module/drm_kms_helper/parameters/poll
    This should drastically reduce the system load and put your system back to normal.
    Note that this only lasts until the system reboots. After a reboot the value reverts to poll=Y.
    For a lasting effect: echo "options drm_kms_helper poll=N" > /etc/modprobe.d/local.conf

Hope this will help your woes. Enjoy! Please post any problems you may experience.
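
For reference, the two steps above could be wrapped in a small script along these lines. It is a sketch only: the function and variable names are invented, the two paths are the ones given in the steps, and it needs to run as root. One deliberate difference is that it appends to local.conf instead of overwriting it, on the assumption that the file may already hold other module options.

```shell
#!/bin/sh
# Paths as given in the steps above; both can be overridden for a dry run.
POLL_PARAM=${POLL_PARAM:-/sys/module/drm_kms_helper/parameters/poll}
MODPROBE_CONF=${MODPROBE_CONF:-/etc/modprobe.d/local.conf}

# Turn output polling off for the running kernel (lost again on reboot).
disable_kms_poll() {
    if [ ! -w "$POLL_PARAM" ]; then
        echo "cannot write $POLL_PARAM (not root, or module not loaded?)" >&2
        return 1
    fi
    echo "before: $(cat "$POLL_PARAM")"
    echo N > "$POLL_PARAM"
    echo "after:  $(cat "$POLL_PARAM")"
}

# Make the setting survive reboots. Appends rather than overwrites, and
# does nothing if the exact line is already present.
persist_kms_poll() {
    line='options drm_kms_helper poll=N'
    grep -qxF "$line" "$MODPROBE_CONF" 2>/dev/null || echo "$line" >> "$MODPROBE_CONF"
}
```

Source it as root and call disable_kms_poll, then persist_kms_poll once the stuttering is confirmed gone. To undo, write Y back to the parameter and remove the line from local.conf.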

Having this problem on 11.10, on a Dell Latitude laptop. It occurs only and always when the machine has been plugged in to mains power, is disconnected, then plugged back in again. When I need to use the mouse I can temporarily unplug the power cable and it works again! (Then it dies again when plugged back in.)

Xavier Claessens (zdra) wrote :

On my Lenovo X200s, I have had this problem since natty and it's still there on precise. Pretty serious bug; if I weren't geek enough to find the workaround (same as comment #38), my laptop would be just unusable.

Wiggy, thank you for reporting this and helping make Ubuntu better. As you noted that this issue occurs in Natty but that reverting to kernel 2.6.34-02063407 resolved it, the next step is to perform a bisect to narrow down the offending commit. Could you please do this by following https://wiki.ubuntu.com/Kernel/KernelBisection?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired