Ubuntu 12.04, system hangs, message "task kworker/0:0:4 blocked for more than 120 seconds"

Reported by Igor on 2012-04-06
This bug affects 7 people
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   High
linux (Ubuntu)                  High         Unassigned

Bug Description

Ubuntu 12.04, amd64 platform.
After the upgrade, the system hangs right after boot. Messages on the console:

[ 242.628031] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[ 242.634660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.642493] kworker/0:0 D 0000000000000000 0 4 2 0x00000000
[ 242.649584] ffff8801b50e5af0 0000000000000046 ffff8801b50e5a90 ffffffff81c93280
[ 242.657060] ffff8801b50e5fd8 ffff8801b50e5fd8 ffff8801b50e5fd8 0000000000013780
[ 242.664529] ffff88019dc7dbc0 ffff8801b50cc4d0 ffff8801b0a52e40 7fffffffffffffff
[ 242.671990] Call Trace:
[ 242.674454] [<ffffffff8165a56f>] schedule+0x3f/0x60
[ 242.679430] [<ffffffff8165abb5>] schedule_timeout+0x2a5/0x320
[ 242.685274] [<ffffffff81491b8a>] ? usb_rh_urb_dequeue+0x7a/0x100
[ 242.691380] [<ffffffff8165a3af>] wait_for_common+0xdf/0x180
[ 242.697048] [<ffffffff81055fdd>] ? set_next_entity+0xad/0xd0
[ 242.702802] [<ffffffff8105f990>] ? try_to_wake_up+0x200/0x200
[ 242.708642] [<ffffffff8165a52d>] wait_for_completion+0x1d/0x20
[ 242.714569] [<ffffffff81084b43>] wait_on_cpu_work+0xd3/0xe0
[ 242.720235] [<ffffffff810822c0>] ? do_work_for_cpu+0x30/0x30
[ 242.725989] [<ffffffff81084c04>] wait_on_work+0xb4/0x120
[ 242.731396] [<ffffffff81085633>] __cancel_work_timer+0x73/0x80
[ 242.737322] [<ffffffff81085670>] cancel_work_sync+0x10/0x20
[ 242.743019] [<ffffffffa0033875>] e1000_down_and_stop+0x25/0x50 [e1000]
[ 242.749647] [<ffffffffa0037e4f>] e1000_down+0x14f/0x200 [e1000]
[ 242.755669] [<ffffffffa0039ef0>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]
[ 242.762385] [<ffffffffa0039f5e>] e1000_reset_task+0x6e/0x90 [e1000]
[ 242.762392] [<ffffffff81084e1a>] process_one_work+0x11a/0x480
[ 242.762397] [<ffffffff81085bc4>] worker_thread+0x164/0x370
[ 242.762403] [<ffffffff81085a60>] ? manage_workers.isra.29+0x130/0x130
[ 242.762408] [<ffffffff8108a41c>] kthread+0x8c/0xa0
[ 242.762413] [<ffffffff81666bf4>] kernel_thread_helper+0x4/0x10
[ 242.762417] [<ffffffff8108a390>] ? flush_kthread_worker+0xa0/0xa0
[ 242.762421] [<ffffffff81666bf0>] ? gs_change+0x13/0x13

It looks like there is a deadlock in the e1000 driver. The hang happens when eth1, which uses the e1000 driver, is configured to obtain its IP address dynamically from a DHCP server. No hangs happen when the interface is configured with a static IP.
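
The trace above shows e1000_reset_task, which itself runs as a work item, reaching cancel_work_sync through e1000_down. One way such a hang can arise is a work item waiting for its own completion. The sketch below is only a user-space Python analogy of that pattern, not the driver code: the names `demo_self_wait` and `reset_task` are made up for illustration, and a timeout stands in for what would be an indefinite wait in the kernel.

```python
import concurrent.futures
import threading


def demo_self_wait(timeout=0.2):
    """Run a task that waits for its own completion on a one-thread pool."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    own_future = {}
    handle_ready = threading.Event()

    def reset_task():
        # Hypothetical stand-in for a work item that cancels-and-waits on itself.
        handle_ready.wait()
        try:
            # The task cannot complete while it is waiting for itself to complete.
            own_future["f"].result(timeout=timeout)
            return "finished"
        except concurrent.futures.TimeoutError:
            # Without the timeout this wait would never return.
            return "blocked on itself"

    own_future["f"] = pool.submit(reset_task)
    handle_ready.set()
    outcome = own_future["f"].result()
    pool.shutdown()
    return outcome
```

Calling `demo_self_wait()` reports that the task timed out waiting on itself; the hung-task watchdog message in the log plays a similar role for the kernel worker.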

The same bug is reported in Debian as Bug#665693:
http://lists.debian.org/debian-kernel/2012/03/msg00811.html

Relevant discussion in LKML:
https://lkml.org/lkml/2011/11/17/434

The patch from the vanilla kernel did NOT solve the problem:
https://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=3a3847e007aae732d64d8fd1374126393e9879a3;hp=1032c736e81cdf490ae62f86da7efe67c3c3e61d

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-22-generic 3.2.0-22.35
ProcVersionSignature: Ubuntu 3.2.0-22.35-generic 3.2.14
Uname: Linux 3.2.0-22-generic x86_64
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0-0ubuntu4
Architecture: amd64
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
CurrentDmesg:

Date: Fri Apr 6 22:49:55 2012
HibernationDevice: RESUME=UUID=43e769d7-fc7f-4c86-a921-46fc14530600
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MachineType: Supermicro X6DA8
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-22-generic root=UUID=ebd665c1-03f4-4648-b567-a9ceb678e82e ro console=tty0 console=ttyS0,115200n8 crashkernel=384M-2G:64M,2G-:128M debug
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to precise on 2012-04-06 (0 days ago)
dmi.bios.date: 01/24/2006
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: X6DA8
dmi.board.vendor: Supermicro
dmi.board.version: PCB Version
dmi.chassis.type: 1
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd01/24/2006:svnSupermicro:pnX6DA8:pvr0123456789:rvnSupermicro:rnX6DA8:rvrPCBVersion:cvnSupermicro:ct1:cvr0123456789:
dmi.product.name: X6DA8
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro

Igor (xrevolver) on 2012-04-06
description: updated
Brad Figg (brad-figg) on 2012-04-06
Changed in linux (Ubuntu):
status: New → Confirmed
Igor (xrevolver) on 2012-04-06
description: updated
Igor (xrevolver) wrote :

No hang is experienced when 12.04 is booted with the old kernel 3.0.0-17, which uses the same version of the e1000 driver, 7.3.21-k8-NAPI.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc1-precise/

Changed in linux (Ubuntu):
importance: Undecided → Medium
importance: Medium → High
tags: added: kernel-da-key kernel-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: regression-update
removed: kernel-key
Igor (xrevolver) on 2012-04-06
tags: added: kernel-unable-to-test-upstream
Igor (xrevolver) wrote :

Cannot test the mainline kernel: the 3.4.0-rc1 kernel crashes at an early boot stage.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Does the mainline kernel have the same errors on the console as the Precise kernel?

Igor (xrevolver) wrote :

No, the mainline 3.4.0-rc1 kernel crashes with a different error at a very early stage of the boot process.

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-23.36
Igor (xrevolver) wrote :

I have tested the requested 3.2.0-23.36 kernel with the same result. The bug still exists.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Can you test an earlier version of the upstream kernel:

v3.3:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Igor (xrevolver) wrote :

Same error with kernel 3.3.0-030300-generic

[ 242.548023] INFO: task kworker/1:1:24 blocked for more than 120 seconds.
[ 242.554737] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.562573] kworker/1:1 D 0000000000000001 0 24 2 0x00000000
[ 242.562587] ffff8801b6abdad0 0000000000000046 ffff8801b6abdfd8 0000000000013600
[ 242.562606] ffff8801b6abc010 0000000000013600 0000000000013600 0000000000013600
[ 242.562625] ffff8801b6abdfd8 0000000000013600 ffff8801b1dd5bc0 ffff8801b69edbc0
[ 242.562644] Call Trace:
[ 242.562660] [<ffffffff8165f8ff>] schedule+0x3f/0x60
[ 242.562672] [<ffffffff8165db4d>] schedule_timeout+0x1fd/0x2e0
[ 242.562682] [<ffffffff81085e10>] ? try_to_wake_up+0x2d0/0x2d0
[ 242.562692] [<ffffffff8108c78a>] ? update_curr+0x14a/0x1e0
[ 242.562702] [<ffffffff8108eb6d>] ? dequeue_entity+0x11d/0x2e0
[ 242.562710] [<ffffffff8165f73b>] wait_for_common+0xdb/0x180
[ 242.562719] [<ffffffff81085e10>] ? try_to_wake_up+0x2d0/0x2d0
[ 242.562727] [<ffffffff8166065f>] ? _raw_spin_lock_irqsave+0x2f/0x40
[ 242.562736] [<ffffffff8165f8bd>] wait_for_completion+0x1d/0x20
[ 242.562746] [<ffffffff8106f021>] wait_on_work+0x1a1/0x1b0
[ 242.562754] [<ffffffff8106d010>] ? do_work_for_cpu+0x30/0x30
[ 242.562763] [<ffffffff8106f14d>] __cancel_work_timer+0x4d/0x170
[ 242.562773] [<ffffffff810dfd41>] ? synchronize_irq+0x51/0xf0
[ 242.562782] [<ffffffff8106f2a0>] cancel_work_sync+0x10/0x20
[ 242.562819] [<ffffffffa004e9b5>] e1000_down_and_stop+0x25/0x50 [e1000]
[ 242.562836] [<ffffffffa0053e2f>] e1000_down+0x14f/0x230 [e1000]
[ 242.562854] [<ffffffffa0054430>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]
[ 242.562873] [<ffffffffa005448c>] e1000_reset_task+0x5c/0x80 [e1000]
[ 242.562883] [<ffffffff8106de0b>] process_one_work+0x12b/0x470
[ 242.562892] [<ffffffff81070976>] worker_thread+0x176/0x420
[ 242.562901] [<ffffffff81070800>] ? manage_workers+0x120/0x120
[ 242.562909] [<ffffffff810754be>] kthread+0x9e/0xb0
[ 242.562919] [<ffffffff8166a0a4>] kernel_thread_helper+0x4/0x10
[ 242.562927] [<ffffffff81075420>] ? kthread_freezable_should_stop+0x70/0x70
[ 242.562936] [<ffffffff8166a0a0>] ? gs_change+0x13/0x13
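
When comparing traces like the two posted so far, it can help to pull out the blocked task and the modules that own frames in the call trace. The following is a minimal Python sketch, assuming the log line layout shown in this report; the regular expressions and the helper names `parse_hung_task` and `modules_in_trace` are illustrative, not part of any existing tool. Frames prefixed with `?` are ones the kernel's stack unwinder is unsure about, so they are recorded but not counted when listing modules.

```python
import re

# "INFO: task <comm>:<pid> blocked for more than <N> seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# "[<address>] [? ]symbol+0xoff/0xsize [module]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\? )?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_hung_task(lines):
    """Collect the blocked task and its stack frames from hung-task log lines."""
    report = {"comm": None, "pid": None, "secs": None, "frames": []}
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            report.update(comm=m["comm"], pid=int(m["pid"]), secs=int(m["secs"]))
            continue
        m = FRAME_RE.search(line)
        if m:
            # Each frame: (symbol, owning module or None, True if not marked "?")
            report["frames"].append((m["symbol"], m["module"], m["guess"] is None))
    return report


def modules_in_trace(report):
    """Return the modules that own at least one reliable frame."""
    return sorted({mod for _, mod, reliable in report["frames"] if mod and reliable})


# A few lines copied from the 3.3.0 trace in this report.
SAMPLE = [
    "[ 242.548023] INFO: task kworker/1:1:24 blocked for more than 120 seconds.",
    "[ 242.562782] [<ffffffff8106f2a0>] cancel_work_sync+0x10/0x20",
    "[ 242.562819] [<ffffffffa004e9b5>] e1000_down_and_stop+0x25/0x50 [e1000]",
    "[ 242.562854] [<ffffffffa0054430>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]",
]
```

Running `modules_in_trace(parse_hung_task(SAMPLE))` on these lines lists `e1000`. A trace in which no module-owned frames appear would suggest a different code path is blocked.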

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you reproduced it with the upstream v3.3 kernel. As you mention in the bug description, there was a commit applied to v3.3 to fix a similar issue:

git describe --contains 3a3847e007aae732d64d8fd1374126393e9879a3
v3.3-rc1~182^2~20

However, you still seem to hit this issue in v3.3 final.
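
For readers unfamiliar with the `git describe --contains` output above: the part before the first `~` or `^` is a tag from which the commit is reachable, and the suffix describes the path walked back from that tag to the commit. A minimal Python sketch that extracts the tag (the helper name `containing_tag` is made up for illustration):

```python
import re


def containing_tag(describe_output):
    """Return the tag portion of `git describe --contains` output."""
    # Git tag names cannot contain '~' or '^', so the first one ends the tag.
    return re.split(r"[~^]", describe_output.strip(), maxsplit=1)[0]
```

For the output quoted above this gives `v3.3-rc1`, meaning the commit is already part of v3.3-rc1 and of every release built on top of it, including v3.3 final.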

Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Igor (xrevolver) wrote :
tags: removed: kernel-unable-to-test-upstream
Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
67GTA (shawnr-wildblue) wrote :

I have been getting these system freezes since 3.0. Until 3.2 in 12.04, I hadn't been able to get any message logs, and sysrq doesn't work. Last night I managed to get info from the syslog. This is a Dell Studio 1569 laptop.

67GTA (shawnr-wildblue) wrote :

This laptop has a Broadcom card and doesn't use the e1000 module. I wonder if this could be in DHCP or something else. It seems to be something low level. I will test with a static IP and see if it still happens.

AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu6
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: shawnr 2154 F.... pulseaudio
 /dev/snd/controlC0: shawnr 2154 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf0700000 irq 44'
   Mixer name : 'Realtek ALC269'
   Components : 'HDA:10ec0269,10280417,00100004'
   Controls : 17
   Simple ctrls : 9
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xcfeec000 irq 45'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100100'
   Controls : 6
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=2ede7e41-cc1b-4613-9fe1-ac429a70d99a
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Beta i386 (20120421)
MachineType: Dell Inc. Studio 1569
NonfreeKernelModules: fglrx
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic-pae root=UUID=ac7c5eeb-370b-4d05-8a48-d61e58222343 ro ipv6.disable=1 rootflags=data=writeback quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic-pae 3.2.14
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-23-generic-pae N/A
 linux-backports-modules-3.2.0-23-generic-pae N/A
 linux-firmware 1.79
SourcePackage: linux
StagingDrivers: mei
Tags: precise staging precise staging
Uname: Linux 3.2.0-23-generic-pae i686
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 03/31/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A09
dmi.board.name: 0R231F
dmi.board.vendor: Dell Inc.
dmi.board.version: A09
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: A09
dmi.modalias: dmi:bvnDellInc.:bvrA09:bd03/31/2011:svnDellInc.:pnStudio1569:pvrA09:rvnDellInc.:rn0R231F:rvrA09:cvnDellInc.:ct8:cvrA09:
dmi.product.name: Studio 1569
dmi.product.version: A09
dmi.sys.vendor: Dell Inc.

tags: added: apport-collected staging

apport information


Floris Mouwen (florismouwen) wrote :

I have the same problem as Igor. The only difference is that he uses an Intel Server adapter and I have the simple GT Desktop adapter. I also updated bug report 43132 on the kernel bug tracker. I also tried kernel 3.4.0-rc4 (without patches), but the problem is still there.

The workaround at the moment is to use the 3.0.0-17 kernel from Ubuntu 11.10. The only problem with that is that the iscsitarget module does not compile (even after installing the header files).

Igor (xrevolver) wrote :

Hi,

there is a patch from Intel developers that was tested against mainline kernel.
Would it be possible to apply this patch on Ubuntu kernel so we can receive fixed driver in upcoming update?

https://bugzilla.kernel.org/show_bug.cgi?id=43132#c23

Igor (xrevolver) wrote :

Patch tested against the latest 12.04 kernel, 3.2.0-24-generic: the patched e1000 driver works as expected and is stable.

Igor (xrevolver) on 2012-05-07
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Changed in linux:
status: Confirmed → Fix Released
Luis Henriques (henrix) wrote :

Igor, this patch is already queued in upstream's stable tree and it should be in Ubuntu pretty soon.

Changed in linux (Ubuntu):
status: Fix Committed → Triaged
CharlesA (charlesa) wrote :

Thanks for the info, Luis.

I noticed this started happening on 5-22-2012 after a kernel upgrade to 3.2.0-24-generic.

The box in question is a VM with the "Intel Pro/1000 MT Desktop" adaptor.

Is there any ETA when it will hit the Ubuntu repos?

Pachy (darkamster2004) wrote :

I believe I am experiencing this bug too. Dell Studio 1569.
Found this in my syslog.log:

Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142243] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:3:566]
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142250] Modules linked in: snd_seq_dummy joydev snd_hda_codec_hdmi snd_hda_codec_realtek parport_pc ppdev rfcomm bnep btusb bluetooth binfmt_misc arc4 dell_wmi sparse_keymap dell_laptop dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device iwlwifi i915 mac80211 drm_kms_helper wmi uvcvideo psmouse videodev v4l2_compat_ioctl32 cfg80211 mac_hid intel_ips snd serio_raw drm mei(C) soundcore i2c_algo_bit video snd_page_alloc lp parport usbhid hid r8169
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142316] CPU 0
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142318] Modules linked in: snd_seq_dummy joydev snd_hda_codec_hdmi snd_hda_codec_realtek parport_pc ppdev rfcomm bnep btusb bluetooth binfmt_misc arc4 dell_wmi sparse_keymap dell_laptop dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device iwlwifi i915 mac80211 drm_kms_helper wmi uvcvideo psmouse videodev v4l2_compat_ioctl32 cfg80211 mac_hid intel_ips snd serio_raw drm mei(C) soundcore i2c_algo_bit video snd_page_alloc lp parport usbhid hid r8169
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142369]
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142373] Pid: 566, comm: kworker/0:3 Tainted: G C 3.2.0-24-generic #39-Ubuntu Dell Inc. Studio 1569/0R225F
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142381] RIP: 0010:[<ffffffffa00c9d73>] [<ffffffffa00c9d73>] mei_timer+0xc3/0x260 [mei]
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142394] RSP: 0018:ffff880115149de0 EFLAGS: 00000283
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142398] RAX: ffff88012cef17f0 RBX: ffff880133409f00 RCX: ffff88012cef17f0
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142401] RDX: 0000000000000000 RSI: ffff88012cef1270 RDI: ffff88012cef1298
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142405] RBP: ffff880115149e00 R08: 0000000000000004 R09: 0000000000000004
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142408] R10: 0000000000000004 R11: 0000000000000064 R12: 0000000000000000
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142412] R13: ffffffff81500e30 R14: ffff880115149dd0 R15: 0000000000000282
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142417] FS: 0000000000000000(0000) GS:ffff880137c00000(0000) knlGS:0000000000000000
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142421] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142424] CR2: 000000007e3e1000 CR3: 0000000001c05000 CR4: 00000000000006f0
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142428] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142432] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 11 16:41:19 amu-Studio-1569 kernel: [11126.142436] Process kworker/0:3 (pid: 566, threadinfo ffff880115148000, task ffff8801...

Louis (louis-van-belle) wrote :

Same here; I solved it by going back to Debian.

Louis (louis-van-belle) wrote :

Oh, and I'm using a bnx2 NIC, not an Intel e1000.

James Burke (jburke) wrote :

I have an Ubuntu file server that is also experiencing this bug.

uname -a
Linux spartan 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

The server is running on ESXi 5.1, with 8 CPUs and 10 GB of memory.

This locks up the whole server and I need to reboot to fix it.

traces from the syslog:

Mar 6 14:59:45 spartan kernel: [373442.140153] INFO: task kworker/3:1:16763 blocked for more than 120 seconds.
Mar 6 14:59:45 spartan kernel: [373442.140293] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 6 14:59:45 spartan kernel: [373442.140413] kworker/3:1 D 0000000000000003 0 16763 2 0x00000000
Mar 6 14:59:45 spartan kernel: [373442.140419] ffff88009f71db90 0000000000000046 0000000000000000 ffffffff81c87360
Mar 6 14:59:45 spartan kernel: [373442.140425] ffff88009f71dfd8 ffff88009f71dfd8 ffff88009f71dfd8 00000000000137c0
Mar 6 14:59:45 spartan kernel: [373442.140430] ffff8802ab9dae00 ffff8800a8a94500 ffff88009f71dbc0 7fffffffffffffff
Mar 6 14:59:45 spartan kernel: [373442.140434] Call Trace:
Mar 6 14:59:45 spartan kernel: [373442.140537] [<ffffffff8165b48f>] schedule+0x3f/0x60
Mar 6 14:59:45 spartan kernel: [373442.140541] [<ffffffff8165bad5>] schedule_timeout+0x2a5/0x320
Mar 6 14:59:45 spartan kernel: [373442.140595] [<ffffffff8101bde3>] ? native_sched_clock+0x13/0x80
Mar 6 14:59:45 spartan kernel: [373442.140599] [<ffffffff8101be59>] ? sched_clock+0x9/0x10
Mar 6 14:59:45 spartan kernel: [373442.140604] [<ffffffff81091e45>] ? sched_clock_local+0x25/0x90
Mar 6 14:59:45 spartan kernel: [373442.140608] [<ffffffff8165b2cf>] wait_for_common+0xdf/0x180
Mar 6 14:59:45 spartan kernel: [373442.140653] [<ffffffff81060600>] ? try_to_wake_up+0x200/0x200
Mar 6 14:59:45 spartan kernel: [373442.140657] [<ffffffff8165b44d>] wait_for_completion+0x1d/0x20
Mar 6 14:59:45 spartan kernel: [373442.140662] [<ffffffff8108b4ef>] kthread_create_on_node+0xaf/0x130
Mar 6 14:59:45 spartan kernel: [373442.140667] [<ffffffff81086930>] ? manage_workers.isra.31+0x130/0x130
Mar 6 14:59:45 spartan kernel: [373442.140671] [<ffffffff810845b4>] ? alloc_worker+0x24/0x70
Mar 6 14:59:45 spartan kernel: [373442.140675] [<ffffffff81086686>] create_worker+0x136/0x1b0
Mar 6 14:59:45 spartan kernel: [373442.140679] [<ffffffff81086765>] maybe_create_worker+0x65/0x100
Mar 6 14:59:45 spartan kernel: [373442.140682] [<ffffffff81086848>] manage_workers.isra.31+0x48/0x130
Mar 6 14:59:45 spartan kernel: [373442.140686] [<ffffffff81086c1b>] worker_thread+0x2eb/0x370
Mar 6 14:59:45 spartan kernel: [373442.140689] [<ffffffff81086930>] ? manage_workers.isra.31+0x130/0x130
Mar 6 14:59:45 spartan kernel: [373442.140693] [<ffffffff8108b31c>] kthread+0x8c/0xa0
Mar 6 14:59:45 spartan kernel: [373442.140748] [<ffffffff81667af4>] kernel_thread_helper+0x4/0x10
Mar 6 14:59:45 spartan kernel: [373442.140752] [<ffffffff8108b290>] ? flush_kthread_worker+0xa0/0xa0
Mar 6 14:59:45 spartan kernel: [373442.140755] [<ffffffff81667af0>] ? gs_change+0x13/0x13
Mar 6 15:01:45 spartan kernel: [373562.140109] INFO: task kworker/3:1:16763 blocked for more than 120 seconds.
Mar 6 15:01:45 spartan kernel: [373562...

Julian Wiedmann (jwiedmann) wrote :

The patch requested in comment #41 landed with 3.2.0-25.40.
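
To check whether a given Ubuntu kernel package is at or past the version named above, the version strings can be compared numerically. A minimal Python sketch, assuming versions in the `3.2.0-25.40` form used in this report; the helper names `version_key` and `has_fix` are hypothetical, and the parsing simply ignores any suffix such as `-generic`:

```python
import re

# Version in which the patch landed, as stated in this report.
FIXED_IN = "3.2.0-25.40"


def version_key(version):
    """Turn '3.2.0-25.40[-flavour]' into a tuple of ints for comparison."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)", version)
    if m is None:
        raise ValueError(f"unrecognised kernel package version: {version!r}")
    return tuple(int(part) for part in m.groups())


def has_fix(version, fixed_in=FIXED_IN):
    """True if `version` is the same as or newer than `fixed_in`."""
    return version_key(version) >= version_key(fixed_in)
```

With this, the versions from the ProcVersionSignature lines quoted earlier (3.2.0-22.35 and 3.2.0-23.36) compare as older than the fixed version.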

Changed in linux (Ubuntu):
status: Triaged → Fix Released