ISST-KVM: R3-0: Firestone: PowerNV : Call traces w.r.t filesystem while running stress test

Bug #1495863 reported by bugproxy
This bug affects 1 person
Affects          Status    Importance   Assigned to        Milestone
linux (Ubuntu)   Invalid   Undecided    Taco Screen team

Bug Description

== Comment: #0 - Krishnaja Balachandran <email address hidden> - 2015-09-02 05:01:44 ==
---Problem Description---
While running stress tests (IO, BASE, TCP, NFS) on the Firestone system "amp", I see the following call traces in "dmesg". Also, a few commands are hanging on amp.

Contact Information = <email address hidden>

---uname output---
Linux amp 3.19.0-26-generic #28~14.04.1-Ubuntu SMP Wed Aug 12 14:10:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = PowerNV 8335-GTA

---Debugger---
"xmon" was configured; however, the system did not enter the debugger.

Stack trace output:
----------------------------
 [69633.095738] INFO: rcu_sched detected stalls on CPUs/tasks: { 11 59} (detected by 171, t=10356287 jiffies, g=2131264, c=2131263, q=51507717)
[69633.096031] Task dump for CPU 11:
[69633.096058] kswapd8 R running task 0 989 2 0x00000804
[69633.096110] Call Trace:
[69633.096138] [c000003c9f93f070] [c000003c9f93f0b0] 0xc000003c9f93f0b0 (unreliable)
[69633.096347] [c000003c9f93f240] [c000003c9f93f370] 0xc000003c9f93f370
[69633.096399] Task dump for CPU 59:
[69633.096426] kworker/u386:14 D 0000000000000000 0 77488 2 0x00000804
[69633.096484] Workqueue: writeback bdi_writeback_workfn (flush-8:32)
[69633.096536] Call Trace:
[69633.096556] [c000003c9f936f00] [c000001ff4ee76f0] 0xc000001ff4ee76f0 (unreliable)
[69633.096618] [c000003c9f9370d0] [c000003c9f937130] 0xc000003c9f937130
[69633.096671] [c000003c9f937130] [c000000000a11370] __schedule+0x370/0x8d0
[69633.096726] [c000003c9f937350] [c000000000a153e8] rwsem_down_write_failed+0x288/0x400
[69633.096788] [c000003c9f9373e0] [c000000000a147f8] down_write+0x88/0x90
[69633.096864] [c000003c9f937410] [d000000029bc5254] xfs_ilock+0xf4/0x160 [xfs]
[69633.096933] [c000003c9f937450] [d000000029bc1bc8] xfs_iomap_write_allocate+0x238/0x3f0 [xfs]
[69633.097011] [c000003c9f937580] [d000000029ba60bc] xfs_map_blocks+0x1cc/0x2f0 [xfs]
[69633.097087] [c000003c9f9375f0] [d000000029ba7b24] xfs_vm_writepage+0x194/0x630 [xfs]
[69633.097148] [c000003c9f9376d0] [c00000000021c66c] __writepage+0x4c/0xb0
[69633.097373] [c000003c9f937710] [c00000000021cda4] write_cache_pages+0x1e4/0x4c0
[69633.097437] [c000003c9f937850] [c00000000021d0e4] generic_writepages+0x64/0x90
[69633.097513] [c000003c9f9378b0] [d000000029ba5e70] xfs_vm_writepages+0x70/0xa0 [xfs]
[69633.097575] [c000003c9f9378f0] [c00000000021e720] do_writepages+0x60/0xc0
[69633.097627] [c000003c9f937920] [c0000000002f1db8] __writeback_single_inode+0x68/0x370
[69633.097687] [c000003c9f937970] [c0000000002f2478] writeback_sb_inodes+0x2c8/0x4f0
[69633.097747] [c000003c9f937a40] [c0000000002f2784] __writeback_inodes_wb+0xe4/0x150
[69633.097807] [c000003c9f937aa0] [c0000000002f354c] wb_writeback+0x30c/0x3e0
[69633.097859] [c000003c9f937b40] [c0000000002f405c] bdi_writeback_workfn+0x14c/0x550
[69633.097919] [c000003c9f937c60] [c0000000000d28bc] process_one_work+0x19c/0x480
[69633.097980] [c000003c9f937cf0] [c0000000000d3160] worker_thread+0x190/0x5b0
[69633.098032] [c000003c9f937d80] [c0000000000da494] kthread+0x114/0x140
[69633.098085] [c000003c9f937e30] [c00000000000956c] ret_from_kernel_thread+0x5c/0x70
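
When triaging a trace like the one above, it can help to pull the stalled CPUs and the stall age out of the rcu_sched summary line. The following is a minimal sketch, not part of the original report. The function name `parse_rcu_stall` is hypothetical, the pattern is written only against the line format shown here, and the conversion to seconds assumes a tick rate of 250 Hz (verify against CONFIG_HZ for the running kernel).

```python
import re

# Matches the stall summary line in the format shown in this report.
STALL_RE = re.compile(
    r"rcu_sched detected stalls on CPUs/tasks: \{(?P<cpus>[^}]*)\}"
    r".*?detected by (?P<by>\d+), t=(?P<jiffies>\d+) jiffies"
)


def parse_rcu_stall(line, hz=250):
    """Extract stalled CPUs, the detecting CPU, and the stall age.

    hz is an ASSUMPTION (250); check CONFIG_HZ on the affected kernel.
    Returns None if the line is not a stall summary.
    """
    match = STALL_RE.search(line)
    if match is None:
        return None
    jiffies = int(match.group("jiffies"))
    return {
        "cpus": [int(cpu) for cpu in match.group("cpus").split()],
        "detected_by": int(match.group("by")),
        "jiffies": jiffies,
        # Jiffies elapsed since the stalled grace period began, in seconds.
        "seconds": jiffies / hz,
    }


line = (
    "[69633.095738] INFO: rcu_sched detected stalls on CPUs/tasks: "
    "{ 11 59} (detected by 171, t=10356287 jiffies, g=2131264, "
    "c=2131263, q=51507717)"
)
info = parse_rcu_stall(line)
print(info["cpus"], info["detected_by"], info["seconds"])
```

Under the 250 Hz assumption, the `t=` value corresponds to a stall lasting several hours, which is consistent with commands hanging on the host rather than a brief hiccup.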

 NOTE:

System is on a private network. Access the private network via SSH to "banner.isst.aus.stglabs.ibm.com" using your GSA ID and password.
(Banner itself is behind a BSO, so must authenticate through that first.)

Login details :
ssh banner.isst.aus.stglabs.ibm.com [debug/don2rry ]

Host login:-

amp.isst.aus.stglabs.ibm.com [10.33.31.106 ]
[root/don2rry]

login via GUI:
bmc-amp.isst.aus.stglabs.ibm.com [10.33.31.106 ]

IPMI Login to host amp console :-
-------------------------------------------------
From banner machine run the following :
ssh banner.isst.aus.stglabs.ibm.com [debug/don2rry ]

ipmitool -I lanplus -H bmc-amp -U ADMIN -P admin sol deactivate
ipmitool -I lanplus -H bmc-amp -U ADMIN -P admin sol activate

-----------------------------------------------------------------------------
                             TESTING INFORMATION
-----------------------------------------------------------------------------

SYSTEM INFORMATION
---------------------------------

   HOST NAME or NETWORK ADDRESS: amp.isst.aus.stglabs.ibm.com [ 10.33.31.106 ]

   BMC NAME and BMC ip: bmc-amp.isst.aus.stglabs.ibm.com [ 10.33.31.108 ]

   Firmware Revision: 2.02.82263
   Firmware Build Time: Aug 24 2015 10:44:30 CDT

RECENT SYSTEM CHANGES : none
-----------------------------------------
  none

Revision history for this message
bugproxy (bugproxy) wrote : dmesg from amp

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-129789 severity-high targetmilestone-inin---
Revision history for this message
bugproxy (bugproxy) wrote : /var/log/messages from amp

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : sysctl -a output from amp

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : console logs from amp

Default Comment by Bridge

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1495863/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Revision history for this message
Breno Leitão (breno-leitao) wrote :

This bug is mirrored to Canonical just for awareness at the moment.

Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-09-28 19:19 EDT-------
Hi, does the system eventually recover once the stress tests finish?

What sort of storage is being used? If you are able to reproduce this reliably, can you narrow it down to a particular set of tests, such as the IO tests?

If this issue is brought on by the I/O stress tests and happens reliably, can you give either the noop or deadline I/O scheduler a try instead of the default CFQ scheduler? If the storage being tested is SAN storage, then noop is preferred.
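
For reference, the scheduler for a block device is exposed through sysfs at `/sys/block/<device>/queue/scheduler`, where the active scheduler is shown in square brackets. The following is a minimal sketch of checking and switching it, not a tested procedure for this system. The helper names are hypothetical, and the device name `sdc` is an assumption based on the `flush-8:32` writeback thread in the trace (major 8, minor 32 under the usual sd numbering); confirm the device on the host before changing anything.

```python
def parse_scheduler(text):
    """Split sysfs scheduler text such as 'noop deadline [cfq]'.

    Returns (available, active), where active is the bracketed entry
    or None if no entry is bracketed.
    """
    available = []
    active = None
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return available, active


def scheduler_path(device):
    """Build the sysfs path that controls the scheduler for a device."""
    return f"/sys/block/{device}/queue/scheduler"


def set_scheduler(device, name):
    """Switch the I/O scheduler for a device. Requires root."""
    path = scheduler_path(device)
    with open(path) as handle:
        available, _ = parse_scheduler(handle.read())
    if name not in available:
        raise ValueError(f"{name!r} is not offered for {device}: {available}")
    with open(path, "w") as handle:
        handle.write(name)


# Example using sample text only; nothing on the system is touched here.
available, active = parse_scheduler("noop deadline [cfq]\n")
print(available, active)
```

On the host itself, calling `set_scheduler("sdc", "deadline")` as root would apply the change until the next reboot; the setting is not persistent.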

Steve Langasek (vorlon)
affects: ubuntu → linux (Ubuntu)
Revision history for this message
bugproxy (bugproxy) wrote : sysctl -a output from amp

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : dmesg from amp

------- Comment (attachment only) From <email address hidden> 2015-10-16 05:52 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : /var/log/messages from amp

------- Comment (attachment only) From <email address hidden> 2015-10-16 05:53 EDT-------

Revision history for this message
Tim Gardner (timg-tpi) wrote :

I wonder if this bug is a duplicate of http://bugs.launchpad.net/bugs/1527062

Luciano Chavez (lnx1138)
Changed in linux (Ubuntu):
status: New → Invalid