Bug: soft lockup - CPU #2 stuck for 61s! [kswapd0:75]

Bug #745342 reported by Nick Belnap
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Expired | Undecided | Unassigned |

Bug Description

Ubuntu 10.10
8-core machine with 16 GB RAM, Intel board with two 4-core Xeons. RAID 6 array with 6 disks in software RAID (mdraid). Swap space is on the RAID 6 array.

After a large copy job to a VirtualBox virtual machine on this host, I found this on the console:

"[314082.372497] Bug: soft lockup - CPU #2 stuck for 61s! [kswapd0: 75]"
followed by some hex code.

This error was preceded by several errors like this:

Info: task kdmflush: 572 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Info: task jbd2/md2-8:589 blocked for more than 120 seconds.
Info: task flush-9:2:639 blocked for more than 120 seconds.
Info: task jbd2/dm-0-8:1371 blocked for more than 120 seconds.
Info: task rsyslogd:7393 blocked for more than 120 seconds.
Info: task master:1569 blocked for more than 120 seconds.
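
For triage, warnings like the ones above can be pulled out of a saved log with a short script. This is only a minimal sketch: the function name `find_hung_tasks` and the regular expression are illustrative assumptions, not part of any kernel or Ubuntu tool. The pattern is deliberately loose so that it accepts both the exact kernel wording ("INFO: task Xorg:1433 ...") and the hand-copied variants in this report ("Info: task kdmflush: 572 ...").

```python
import re

# Hypothetical helper for illustration only.
# Matches hung-task warnings such as
#   "INFO: task jbd2/md2-8:589 blocked for more than 120 seconds."
# Case-insensitive, and tolerant of a space before the PID.
HUNG_TASK_RE = re.compile(
    r"info: task (?P<comm>.+?):\s*(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds",
    re.IGNORECASE,
)


def find_hung_tasks(log_text):
    """Return a list of (task name, pid, seconds) for each warning found."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Info: task kdmflush: 572 blocked for more than 120 seconds.\n"
        "Info: task flush-9:2:639 blocked for more than 120 seconds.\n"
    )
    for comm, pid, secs in find_hung_tasks(sample):
        print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

Task names can themselves contain colons (as in flush-9:2), which is why the pattern takes the last colon before the PID rather than the first.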

The machine was not locked up, and I have not yet attempted a reboot, but this is very concerning.

Here's the soft lockup error from dmesg:

[314082.372497] BUG: soft lockup - CPU#2 stuck for 61s! [kswapd0:75]
[314082.372550] Modules linked in: vboxnetadp vboxnetflt vboxdrv lp ioatdma parport joydev i7core_edac hed edac_core raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usbhid hid igb dca raid6_pq async_tx raid1 raid0 multipath linear
[314082.372576] CPU 2
[314082.372578] Modules linked in: vboxnetadp vboxnetflt vboxdrv lp ioatdma parport joydev i7core_edac hed edac_core raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usbhid hid igb dca raid6_pq async_tx raid1 raid0 multipath linear
[314082.372605]
[314082.372609] Pid: 75, comm: kswapd0 Not tainted 2.6.35-28-server #49-Ubuntu S5520HC/S5520HC
[314082.372612] RIP: 0010:[<ffffffff81118f70>] [<ffffffff81118f70>] zone_nr_free_pages+0x0/0xc0
[314082.372620] RSP: 0018:ffff880265dade08 EFLAGS: 00000282
[314082.372624] RAX: 0000000000000020 RBX: ffff880265dade40 RCX: 0000000000000000
[314082.372627] RDX: 0000000000000895 RSI: 0000000000000000 RDI: ffff880100000e00
[314082.372631] RBP: ffffffff8100aa8e R08: 0000000000000000 R09: 0000000000000100
[314082.372634] R10: 0000000000000000 R11: 0000000000000003 R12: 0000000000000000
[314082.372637] R13: ffff880265dade04 R14: ffff8802668144d0 R15: ffff880265daddb0
[314082.372642] FS: 0000000000000000(0000) GS:ffff880001e20000(0000) knlGS:0000000000000000
[314082.372646] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[314082.372649] CR2: 0000000001e003c0 CR3: 0000000001a2a000 CR4: 00000000000026e0
[314082.372652] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[314082.372656] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[314082.372660] Process kswapd0 (pid: 75, threadinfo ffff880265dac000, task ffff8802668144d0)
[314082.372662] Stack:
[314082.372700] ffffffff8110700a ffff880265dade40 ffff880100000000 0000000000000002
[314082.372705] <0> ffff8802668144d0 ffff880265dade70 ffff8801000040a8 ffff880265dadee0
[314082.372711] <0> ffffffff81111f6c ffff880265dadfd8 0000000000000000 0000000065dadfd8
[314082.372717] Call Trace:
[314082.372760] [<ffffffff8110700a>] ? zone_watermark_ok+0x2a/0xf0
[314082.372765] [<ffffffff81111f6c>] ? kswapd+0x25c/0x300
[314082.372770] [<ffffffff8107fb10>] ? autoremove_wake_function+0x0/0x40
[314082.372775] [<ffffffff81111d10>] ? kswapd+0x0/0x300
[314082.372780] [<ffffffff8107f596>] ? kthread+0x96/0xa0
[314082.372785] [<ffffffff8100aee4>] ? kernel_thread_helper+0x4/0x10
[314082.372790] [<ffffffff8107f500>] ? kthread+0x0/0xa0
[314082.372794] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
[314082.372797] Code: 48 89 c2 e8 13 36 fe ff 89 df e8 6c e5 fd ff 48 8b 5d d8 4c 8b 65 e0 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 90 90 90 90 90 90 <55> 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 0f
[314082.373196] Call Trace:
[314082.373201] [<ffffffff8110700a>] ? zone_watermark_ok+0x2a/0xf0
[314082.373205] [<ffffffff81111f6c>] ? kswapd+0x25c/0x300
[314082.373210] [<ffffffff8107fb10>] ? autoremove_wake_function+0x0/0x40
[314082.373215] [<ffffffff81111d10>] ? kswapd+0x0/0x300
[314082.373219] [<ffffffff8107f596>] ? kthread+0x96/0xa0
[314082.373224] [<ffffffff8100aee4>] ? kernel_thread_helper+0x4/0x10
[314082.373229] [<ffffffff8107f500>] ? kthread+0x0/0xa0
[314082.373234] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
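
When comparing reports like this one, it can help to reduce a soft lockup dump to the stuck CPU, the task, and the symbols in the first call trace. The following is a rough sketch under stated assumptions: `parse_soft_lockup` and both regular expressions are hypothetical names written for this illustration, and the input is assumed to be dmesg text with one log record per line (not hard-wrapped by a terminal).

```python
import re

# Hypothetical helper for illustration only.
# Header line, e.g. "BUG: soft lockup - CPU#2 stuck for 61s! [kswapd0:75]"
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
# Call-trace frame, e.g. "[<ffffffff81111f6c>] ? kswapd+0x25c/0x300"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_soft_lockup(dmesg_text):
    """Return a dict describing the first soft lockup, or None if absent."""
    header = SOFT_LOCKUP_RE.search(dmesg_text)
    if header is None:
        return None
    trace = []
    in_trace = False
    for line in dmesg_text[header.end():].splitlines():
        if not in_trace:
            # Skip register and stack dumps until the first "Call Trace:".
            in_trace = "Call Trace:" in line
            continue
        frame = FRAME_RE.search(line)
        if frame is None:
            break  # first non-frame line ends the trace
        trace.append(frame.group("sym"))
    return {
        "cpu": int(header.group("cpu")),
        "seconds": int(header.group("secs")),
        "comm": header.group("comm"),
        "pid": int(header.group("pid")),
        "trace": trace,
    }
```

Only the first "Call Trace:" block is collected, since the dump above repeats the same trace a second time.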

Tags: maverick
Daniel Asarnow (dasarnow) wrote :

I experienced a full system lockup with a similar error in syslog. My system runs 10.10 on an AMD board with one quad-core CPU and 8 GB RAM. Kernel: 2.6.35-28-generic. There is an md RAID 1 array alongside several single drives. kern.log and syslog excerpts are below.

kern.log (minus the part identical to syslog):

May 3 14:42:18 theFourthTower kernel: [1642564.161333] INFO: task Xorg:1433 blocked for more than 120 seconds.
May 3 14:42:18 theFourthTower kernel: [1642564.161341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 3 14:42:18 theFourthTower kernel: [1642564.161348] Xorg D 0000000109c92f32 0 1433 1364 0x00400004
May 3 14:42:18 theFourthTower kernel: [1642564.161360] ffff880204ba3758 0000000000000086 ffff880200000000 0000000000015980
May 3 14:42:18 theFourthTower kernel: [1642564.161370] ffff880204ba3fd8 0000000000015980 ffff880204ba3fd8 ffff880204ba8000
May 3 14:42:18 theFourthTower kernel: [1642564.161379] 0000000000015980 0000000000015980 ffff880204ba3fd8 0000000000015980

syslog:

May 3 14:38:43 theFourthTower kernel: [1642349.323750] BUG: soft lockup - CPU#3 stuck for 61s! [kswapd0:76]
May 3 14:38:43 theFourthTower kernel: [1642349.323750] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs usb_storage xt_multiport binfmt_misc vboxnetadp vboxnetflt ipt_MASQUERADE xt_state ipt_REJECT xt_tcpudp iptable_filter nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables vboxdrv snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq fglrx(P) snd_timer snd_seq_device ppdev parport_pc snd it87 hwmon_vid k10temp i2c_piix4 edac_core edac_mce_amd soundcore snd_page_alloc shpchp lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usbhid hid raid6_pq async_tx raid1 raid0 multipath linear btrfs zlib_deflate 8139too floppy crc32
May 3 14:38:43 theFourthTower kernel: c 8139cp libcrc32c firewire_ohci firewire_core r8169 crc_itu_t ahci pata_atiixp libahci mii
May 3 14:38:43 theFourthTower kernel: [1642349.323750] CPU 3
May 3 14:38:43 theFourthTower kernel: [1642349.323750] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs usb_storage xt_multiport binfmt_misc vboxnetadp vboxnetflt ipt_MASQUERADE xt_state ipt_REJECT xt_tcpudp iptable_filter nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables vboxdrv snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq fglrx(P) snd_timer snd_seq_device ppdev parport_pc snd it87 hwmon_vid k10temp i2c_piix4 edac_core...

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 745342

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: maverick
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Ernestas (ernetas)
Changed in linux (Ubuntu):
status: Expired → New
Brad Figg (brad-figg) wrote :

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 745342

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired