Soft lockup when running bonnie++ only at 1600 MT/s
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Medium | Unassigned |
Bug Description
SRU Justification:
Impact: running a test such as bonnie++ makes the system unstable and prone to hangs.
Fix: apply the attached patches and rebuild the kernel.
Test case: leave bonnie++ running in a loop for 24 hours.
--
When bonnie++ is run in a loop, the system eventually hangs with the message
"rcu_sched: self-detected stall on CPU"
The time to failure is inconsistent: one run took 7 hours, the next more than 2 days.
Commands to reproduce the failure:
$ sudo apt-get install bonnie++
$ mkdir bonnie
$ while true; do bonnie++ -d bonnie; done &>>bonnie0.log &
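Because the time to failure varies so much, it can help to bound the run and check the kernel log afterwards instead of watching the console. The following is only a minimal sketch, not part of the original reproduction steps: the helper names `count_stalls` and `run_soak` and the file name `dmesg-after.log` are hypothetical, and it assumes GNU coreutils `timeout` is available. If the machine hangs hard, nothing is saved and a serial console log is still needed.

```shell
#!/bin/sh
# Sketch only: count_stalls and run_soak are hypothetical helper names.

# Print the number of RCU stall reports found in a saved kernel log file.
# The optional colon covers both spellings that appear in this report
# ("rcu_sched: self-detected ..." and "rcu_sched self-detected ...").
count_stalls() {
    grep -c -E 'rcu_sched:? self-detected stall on CPU' "$1" || true
}

# Run the bonnie++ loop for a bounded time (default 24h), then save the
# kernel log and print how many stall reports it contains.
# Assumes GNU coreutils `timeout` and an installed bonnie++.
run_soak() {
    duration="${1:-24h}"
    mkdir -p bonnie
    timeout "$duration" sh -c 'while true; do bonnie++ -d bonnie; done' >>bonnie0.log 2>&1
    dmesg >dmesg-after.log
    count_stalls dmesg-after.log
}
```

Calling `run_soak 24h` matches the 24-hour test case in the SRU justification. A printed count of 0 after the run suggests no stall was logged during that window, though given the observed spread (7 hours to more than 2 days) a single clean run is weak evidence.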
Stack trace:
[237019.072290] INFO: rcu_sched self-detected stall on CPU { 1} (t=19305216 jiffies g=580389 c=580388 q=84)
[237019.080901] CPU: 1 PID: 44 Comm: kswapd0 Tainted: GF 3.11.0-
[237019.088879] [<c002bc00>] (unwind_
[237019.096700] [<c0026f1c>] (show_stack+
[237019.104051] [<c05cbe50>] (dump_stack+
[237019.112262] [<c00bf37c>] (rcu_check_
[237019.121254] [<c00492a0>] (update_
[237019.129933] [<c008cdbc>] (tick_sched_
[237019.138300] [<c008d00c>] (tick_sched_
[237019.146433] [<c005db50>] (__run_
[237019.154800] [<c005e6f8>] (hrtimer_
[237019.163871] [<c0492e44>] (arch_timer_
[237019.173332] [<c00b8c2c>] (handle_
[237019.182402] [<c00b54ec>] (generic_
[237019.190378] [<c0023ff4>] (handle_
[237019.198041] [<c0008508>] (gic_handle_
[237019.205624] Exception stack(0xee2c1c18 to 0xee2c1c60)
[237019.210238] 1c00: 00000004 00000004
[237019.217666] 1c20: 00000008 00000001 ee2c1c8c ca208700 ca208700 0996b000 ca208708 00000001
[237019.225093] 1c40: 00000002 edb31300 00000003 ee2c1c60 c02f54fc c00923c8 200f0013 ffffffff
[237019.232523] [<c05d1c00>] (__irq_
[237019.240500] [<c00923c8>] (generic_
[237019.249805] [<c00924f4>] (smp_call_
[237019.259812] [<c0029920>] (broadcast_
[237019.268882] [<c0029adc>] (flush_
[237019.277484] [<c011fc8c>] (ptep_clear_
[237019.286554] [<c011a60c>] (page_reference
[237019.295155] [<c011c034>] (page_reference
[237019.303756] [<c00fc410>] (shrink_
[237019.312279] [<c00fdadc>] (shrink_
[237019.320176] [<c00fddb0>] (shrink_
[237019.327527] [<c00fe430>] (kswapd+
[237019.334487] [<c005aae0>] (kthread+0xa4/0xb0) from [<c0023198>] (ret_from_
Setup details:
Quad-core A15 server nodes on Calxeda Midway hardware.
The failure has been seen twice with the DDR setting DDR3 @ 1600 MT/s.
cat /proc/version_
Ubuntu 3.11.0-
The issue was first seen on Ubuntu 3.11.0-
cat /etc/issue
Ubuntu 13.04 \n \l
Additional debug information attached
---
Architecture: armhf
DistroRelease: Ubuntu 13.04
MarkForUpload: True
Package: linux (not installed)
ProcEnviron:
LANGUAGE=en_US:
TERM=vt102
PATH=(custom, no user)
LANG=en_US
SHELL=/bin/bash
Uname: Linux 3.11.0-
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1239800
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.