BUG: soft lockup - CPU#13 stuck for 61s! [kswapd1:188]

Bug #731637 reported by Rich Wohlstadter
This bug affects 3 people
Affects: linux-lts-backport-maverick (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: linux-image-2.6.35-23-server

I've noticed this error in the kernel logs on my 12-core systems. The entire error is:

Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707346] BUG: soft lockup - CPU#13 stuck for 61s! [kswapd1:188]
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707375] Modules linked in: binfmt_misc mptctl mptbase ipmi_devintf ipmi_si ipmi_msghandler dell_rbu cachefiles autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc dm_crypt i7core_edac power_meter joydev dcdbas bnx2 hed edac_core lp parport usbhid hid usb_storage megaraid_sas
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707388] CPU 13
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707389] Modules linked in: binfmt_misc mptctl mptbase ipmi_devintf ipmi_si ipmi_msghandler dell_rbu cachefiles autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc dm_crypt i7core_edac power_meter joydev dcdbas bnx2 hed edac_core lp parport usbhid hid usb_storage megaraid_sas
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707399]
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707401] Pid: 188, comm: kswapd1 Not tainted 2.6.35-23-server #41~lucid1-Ubuntu 02Y41P/PowerEdge M610
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707403] RIP: 0010:[<ffffffff812b6ae6>] [<ffffffff812b6ae6>] find_next_bit+0x16/0xb0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707409] RSP: 0018:ffff881807ec3cd0 EFLAGS: 00000287
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707410] RAX: 0000000000000040 RBX: ffff881807ec3cd0 RCX: 0000000000000009
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707411] RDX: 0000000000000009 RSI: 0000000000000040 RDI: ffffffff81ae6848
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707412] RBP: ffffffff8100aa8e R08: ffffffff81ae6848 R09: 0000000000000009
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707414] R10: 0000000000000001 R11: 0000000000000001 R12: ffff881807ec3d5c
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707415] R13: ffffffff81055d7b R14: ffff881807ec3ce0 R15: ffff880001e10640
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707417] FS: 0000000000000000(0000) GS:ffff880001ec0000(0000) knlGS:0000000000000000
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707418] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707419] CR2: 00007fbf12366000 CR3: 0000000c064d5000 CR4: 00000000000006e0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707421] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707422] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707424] Process kswapd1 (pid: 188, threadinfo ffff881807ec2000, task ffff881807e22dc0)
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707425] Stack:
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707445] ffff881807ec3d00 ffffffff81117bf3 ffffffffffffff10 ffff880100000e00
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707447] <0> 0000000000000000 0000000000000000 ffff881807ec3d40 ffffffff81105c1a
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707449] <0> 0000000000000000 0000000000000e10 0000000000000002 ffff880100000000
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707451] Call Trace:
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707476] [<ffffffff81117bf3>] ? zone_nr_free_pages+0xa3/0xc0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707479] [<ffffffff81105c1a>] ? zone_watermark_ok+0x2a/0xf0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707481] [<ffffffff81110354>] ? balance_pgdat+0x1c4/0x740
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707483] [<ffffffff811109e5>] ? kswapd+0x115/0x310
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707486] [<ffffffff8107f0b0>] ? autoremove_wake_function+0x0/0x40
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707488] [<ffffffff811108d0>] ? kswapd+0x0/0x310
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707490] [<ffffffff8107eb56>] ? kthread+0x96/0xa0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707493] [<ffffffff8100aee4>] ? kernel_thread_helper+0x4/0x10
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707495] [<ffffffff8107eac0>] ? kthread+0x0/0xa0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707496] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707497] Code: e1 b3 00 00 c9 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 39 f2 48 89 f0 48 89 e5 0f 83 8d 00 00 00 48 89 d1 49 89 d1 <48> c1 e9 06 49 83 e1 c0 4c 8d 04 cf 48 89 f7 4c 29 cf 83 e2 3f
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707700] Call Trace:
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707702] [<ffffffff81117bf3>] ? zone_nr_free_pages+0xa3/0xc0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707704] [<ffffffff81105c1a>] ? zone_watermark_ok+0x2a/0xf0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707705] [<ffffffff81110354>] ? balance_pgdat+0x1c4/0x740
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707707] [<ffffffff811109e5>] ? kswapd+0x115/0x310
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707709] [<ffffffff8107f0b0>] ? autoremove_wake_function+0x0/0x40
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707711] [<ffffffff811108d0>] ? kswapd+0x0/0x310
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707713] [<ffffffff8107eb56>] ? kthread+0x96/0xa0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707714] [<ffffffff8100aee4>] ? kernel_thread_helper+0x4/0x10
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707716] [<ffffffff8107eac0>] ? kthread+0x0/0xa0
Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707718] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
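
For anyone who wants to check their own logs for the same signature, here is a minimal Python sketch that pulls the CPU number, stall duration, task name and pid out of soft lockup lines like the first one above. The regular expression and the helper name `find_soft_lockups` are my own assumptions based only on this log excerpt, not on the kernel source, so other kernel versions may format the message differently.

```python
import re

# Pattern modelled on the single watchdog line in this report (an assumption,
# not taken from the kernel source).
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup line found in an iterable of log lines."""
    events = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            events.append({
                "cpu": int(m.group("cpu")),
                "seconds": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


# Two lines copied from the log above: only the first is a lockup report.
sample = [
    "Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707346] BUG: soft lockup - CPU#13 stuck for 61s! [kswapd1:188]",
    "Mar 7 07:15:28 blade13-3-16 kernel: [2055372.707388] CPU 13",
]
print(find_soft_lockups(sample))
```

The same function can be fed an open log file object directly, since it only iterates over lines.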

I found a similar error, and a patch that is supposed to fix it, in a Fedora bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=649694

and the corresponding kerneltrap discussion:

http://kerneltrap.org/mailarchive/linux-kernel/2010/10/27/4637977

Can this be investigated and potentially patched to address the issue? Thanks.

ubuntu-bug output:

ProblemType: Bug
Architecture: amd64
Date: Tue Mar 8 16:16:10 2011
Dependencies:
 adduser 3.112ubuntu1
 base-files 5.0.0ubuntu20.10.04.3
 base-passwd 3.5.22
 busybox-initramfs 1:1.13.3-1ubuntu11
 coreutils 7.4-2ubuntu3
 cpio 2.10-1ubuntu2
 debconf 1.5.28ubuntu4
 debconf-i18n 1.5.28ubuntu4
 debianutils 3.2.2
 dpkg 1.15.5.6ubuntu4.5
 findutils 4.4.2-1ubuntu1
 gcc-4.4-base 4.4.3-4ubuntu5
 initramfs-tools 0.92bubuntu78
 initramfs-tools-bin 0.92bubuntu78
 klibc-utils 1.5.17-4ubuntu1
 libacl1 2.2.49-2
 libattr1 1:2.4.44-1
 libblkid1 2.17.2-0ubuntu1.10.04.2
 libc-bin 2.11.1-0ubuntu7.8
 libc6 2.11.1-0ubuntu7.8
 libdb4.8 4.8.24-1ubuntu1
 libgcc1 1:4.4.3-4ubuntu5
 libglib2.0-0 2.24.1-0ubuntu1
 libklibc 1.5.17-4ubuntu1
 liblocale-gettext-perl 1.05-6
 libncurses5 5.7+20090803-2ubuntu3
 libpam-modules 1.1.1-2ubuntu5
 libpam0g 1.1.1-2ubuntu5
 libpcre3 7.8-3build1
 libselinux1 2.0.89-4
 libslang2 2.2.2-2ubuntu1
 libstdc++6 4.4.3-4ubuntu5
 libtext-charwidth-perl 0.04-6
 libtext-iconv-perl 1.7-2
 libtext-wrapi18n-perl 0.06-7
 libudev0 151-12.3
 libusb-0.1-4 2:0.1.12-14ubuntu0.2
 libuuid1 2.17.2-0ubuntu1.10.04.2
 lsb-base 4.0-0ubuntu8
 lzma 4.43-14ubuntu2
 module-init-tools 3.11.1-2ubuntu1
 ncurses-bin 5.7+20090803-2ubuntu3
 passwd 1:4.1.4.2-1ubuntu2
 perl-base 5.10.1-8ubuntu2
 procps 1:3.2.8-1ubuntu4
 sed 4.2.1-6
 sensible-utils 0.0.1ubuntu3
 tzdata 2010o-0ubuntu0.10.04
 udev 151-12.3
 util-linux 2.17.2-0ubuntu1.10.04.2
 wireless-crda 1.12
 zlib1g 1:1.2.3.3.dfsg-15ubuntu1
DistroRelease: Ubuntu 10.04
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
Package: linux-image-2.6.35-23-server 2.6.35-23.41~lucid1
PackageArchitecture: amd64
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.35-23.41~lucid1-server 2.6.35.7
SourcePackage: linux-lts-backport-maverick
Tags: lucid
Uname: Linux 2.6.35-23-server x86_64

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-backport-maverick (Ubuntu):
status: New → Confirmed