HP ProLiant m400 Server sda timeout causes file system hang

Bug #1469218 reported by Colin Ian King
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: dann frazier
Milestone: (none)

Bug Description

While running stress-ng on a ProLiant m400 Server I managed to make the disk controller time out, which caused a file system hang. I wonder whether the timeout happened because I completely overloaded the system, making it too slow to handle ATA commands in time.

I ran:

stress-ng --all 64 -t 600 -v

and the machine slowly clogged up with stress processes and finally hit the hang. I had to reboot the system.

Attached is a kernel log.

Revision history for this message
Colin Ian King (colin-king) wrote :
Changed in linux (Ubuntu):
assignee: nobody → dann frazier (dannf)
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1469218

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Triaged
Revision history for this message
Ming Lei (tom-leiming) wrote :

I can't reproduce it after running for half a day on ms10-36, and the OOM killer is triggered often.

Revision history for this message
Ming Lei (tom-leiming) wrote :

BTW, it can't be reproduced on mustang running vivid either.

Revision history for this message
Ming Lei (tom-leiming) wrote :

But another kernel oops was just found on one mustang running vivid:

Call trace:
Unable to handle kernel NULL pointer dereference at virtual address 00000018
pgd = ffffffc105652000
[00000018] *pgd=000000432d9dc003Unable to handle kernel NULL pointer dereference at virtual address 00000030
pgd = ffffffc1b8d49000
[00000030] *pgd=00000041b8d4a003, *pud=00000041b8d4a003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 15416 Comm: stress-ng-sysin Tainted: G W 3.19.8-ckt1+ #40
Hardware name: APM X-Gene Mustang board (DT)
task: ffffffc233cf0ac0 ti: ffffffc132094000 task.ti: ffffffc132094000
PC is at thread_group_cputime+0x90/0x154
LR is at thread_group_cputime+0x30/0x154
pc : [<ffffffc0000d0cf4>] lr : [<ffffffc0000d0c94>] pstate: 00000145
sp : ffffffc132097d90
x29: ffffffc132097d90 x28: ffffffc132094000
x27: ffffffc233cf0ac0 x26: ffffffc02cc24154
x25: 0000000000000000 x24: 0000000000000000
x23: 0000000000000000 x22: ffffffc02cc23fc0
x21: 0000000000000000 x20: fffffffffffffc48
x19: ffffffc132097e38 x18: 0000000000000000
x17: 0000007fa836cff0 x16: ffffffc0000bb030
x15: 0024167618000000 x14: 0000007fa82d3624
x13: 6f6d2c6b32393237 x12: 0000000000000000
x11: 0000000000000006 x10: 0101010101010101
x9 : fefeff7ea7306a2f x8 : 0000000000000099
x7 : 746e756f6d203631 x6 : 0000007fc0aa71c3
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 0000000000000000 x2 : 0000000000000000
x1 : 0000000000000000 x0 : ffffffc02cc23fd0

Process stress-ng-sysin (pid: 15416, stack limit = 0xffffffc132094058)
Stack: (0xffffffc132097d90 to 0xffffffc132098000)
7d80: 32097e00 ffffffc1 000d0ed0 ffffffc0
7da0: 32097e80 ffffffc1 33cf0ac0 ffffffc2 ffffffff ffffffff a836cffc 0000007f
7dc0: 60000000 00000000 00000015 00000000 0000011a 00000000 00000099 00000000
7de0: 007f2000 ffffffc0 001e7742 00000000 00000000 00000000 00000000 00000000
7e00: 32097e50 ffffffc1 000bafd8 ffffffc0 32097eb0 ffffffc1 0044cf58 00000000
7e20: 00000000 00000000 32097e88 ffffffc1 c0aaaa80 0000007f 00000000 00000000
7e40: 00000000 00000000 00000000 00000000 32097e90 ffffffc1 000bb04c ffffffc0
7e60: c0aaab20 0000007f 0044cf58 00000000 ffffffff ffffffff 000617db 00000000
7e80: 001e7746 00000000 001e7742 00000000 c0aaaa80 0000007f 00086470 ffffffc0
7ea0: 00000000 00000000 00001026 00000000 00000000 00000000 00000000 00000000
7ec0: 00000000 00000000 00000000 00000000 c0aaab20 0000007f c0aaabb0 0000007f
7ee0: c0aaab20 0000007f 00000000 00000000 00000000 00000000 00000000 00000000
7f00: c0aa71c3 0000007f 6d203631 746e756f 00000099 00000000 a7306a2f fefeff7e
7f20: 01010101 01010101 00000006 00000000 00000000 00000000 32393237 6f6d2c6b
7f40: a82d3624 0000007f 18000000 00241676 0044c5e8 00000000 a836cff0 0000007f
7f60: 00000000 00000000 c0aaaca0 0000007f 0044cf58 00000000 c0aaaca0 0000007f
7f80: 00000000 00000000 0044c000 00000000 a82b3370 0000007f 00000010 00000000
7fa0: 00000000 00000000 c0aab168 0000007f 0044c000 00000000 c0aaaa80 0000007f
7fc0: 0042068c 00000000 c0aaaa80 0000007f a836cffc 0000007f 60000000 00000000
7fe0: c0aaab20 0000007f 00000099 00000000 49494949 49494949 49494949 49494949
Call trace:
[<fff...

Revision history for this message
Ming Lei (tom-leiming) wrote :

For the log in #5, the fault happened in the following code snippet from kernel/sched/cputime.c (source interleaved with disassembly):

                for_each_thread(tsk, t) {
ffffffc0000d3dc0: f9445760 ldr x0, [x27,#2216]
ffffffc0000d3dc4: f8410c03 ldr x3, [x0,#16]!
ffffffc0000d3dc8: f90037a3 str x3, [x29,#104]
ffffffc0000d3dcc: f94037b4 ldr x20, [x29,#104]
ffffffc0000d3dd0: eb00029f cmp x20, x0
ffffffc0000d3dd4: d10ee294 sub x20, x20, #0x3b8
ffffffc0000d3dd8: 54000081 b.ne ffffffc0000d3de8 <thread_group_cputime+0x90>
ffffffc0000d3ddc: 14000015 b ffffffc0000d3e30 <thread_group_cputime+0xd8>
ffffffc0000d3de0: f9400261 ldr x1, [x19]
ffffffc0000d3de4: f9400662 ldr x2, [x19,#8]
                                cputime_t *utime, cputime_t *stime)
{
        if (utime)
                *utime = t->utime;
        if (stime)
                *stime = t->stime;
ffffffc0000d3de8: f941f684 ldr x4, [x20,#1000] #fault instruction
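
In the register dump in #5, x20 is fffffffffffffc48, i.e. 0 - 0x3b8, and the faulting load [x20,#1000] then resolves to virtual address 0x30, matching the oops header; that looks consistent with for_each_thread() picking up a NULL thread-list pointer which container_of() turns into a near-NULL task pointer. As a reading aid, here is a minimal user-space sketch of what this loop does (hypothetical types and names, not the kernel source):

/*
 * Standalone model (hypothetical names, not kernel source) of the loop
 * that faults: thread_group_cputime() walks every thread in the group
 * with for_each_thread() and accumulates each thread's utime/stime via
 * task_cputime(). The marked ldr above is one of those per-thread reads.
 */
#include <stdio.h>

struct thread {
        unsigned long utime;            /* models task_struct->utime */
        unsigned long stime;            /* models task_struct->stime */
        struct thread *next;            /* models the thread-list linkage */
};

/* models thread_group_cputime(): sum CPU time over all threads in the group */
static void group_cputime(const struct thread *first,
                          unsigned long *utime, unsigned long *stime)
{
        *utime = 0;
        *stime = 0;
        for (const struct thread *t = first; t; t = t->next) {
                *utime += t->utime;     /* kernel: task_cputime(t, &ut, &st) */
                *stime += t->stime;
        }
}

int main(void)
{
        struct thread b = { .utime = 5, .stime = 1, .next = NULL };
        struct thread a = { .utime = 3, .stime = 2, .next = &b };
        unsigned long ut, st;

        group_cputime(&a, &ut, &st);
        printf("utime=%lu stime=%lu\n", ut, st);
        return 0;
}

The real code walks a circular RCU list through container_of() rather than a plain next pointer, which is why a bad node shows up as a wild dereference instead of simply ending the loop.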

Revision history for this message
Ming Lei (tom-leiming) wrote :

BTW, regarding the SATA read timeout issue, I have run fio to verify SATA disk read/randread/write/randwrite, and it looks like it works fine.

Revision history for this message
Ming Lei (tom-leiming) wrote :

Wrt. the disk read failure in Colin's report, it looks like the sectors involved in the timeout are themselves good, because the following test now runs fine on ms10-34:

    sudo dd if=/dev/sda skip=9741420 iflag=direct of=/dev/null bs=512 count=1K

1024+0 records in
1024+0 records out
524288 bytes (524 kB) copied, 0.0767411 s, 6.8 MB/s

Revision history for this message
Ming Lei (tom-leiming) wrote :

Some static code analysis:

1) The SCSI request submit path looks like it can't be the cause:
- preemption is disabled for arm64 vivid
- the timer is always added just before the request is submitted to hardware
- if the request can't be submitted to hardware, the timer is disabled

2) Another possibility is the ATA completion path:
- the block request is always completed in softirq context (scsi_done() -> blk_complete_request())
- the softirq may run on another CPU, triggered by IPI
- but an atomic complete flag (blk_mark_rq_complete()) tracks the request's completion status, so it looks like the timeout can't be caused by a delayed softirq schedule (see the sketch below).
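
To make point 2 concrete, here is a minimal user-space sketch (hypothetical names, not kernel source) of how that atomic complete flag arbitrates between the two paths: the completion side and the timeout side both try to claim the request with a single test-and-set, and whichever loses backs off. In the vivid-era code the flag is set when the completion is first reported (blk_complete_request() calls blk_mark_rq_complete() before raising the softirq), so even a late-running softirq should not leave the request looking incomplete to the timeout handler.

/*
 * Minimal sketch (hypothetical names, not kernel source) of the atomic
 * complete flag from point 2: completion and timeout race for a single
 * test-and-set, so only one of them ever acts on the request.
 */
#include <stdatomic.h>
#include <stdio.h>

struct request {
        atomic_flag complete;           /* models REQ_ATOM_COMPLETE */
};

/* models blk_mark_rq_complete(): nonzero means someone else got here first */
static int mark_rq_complete(struct request *rq)
{
        return atomic_flag_test_and_set(&rq->complete);
}

/* completion side, as in scsi_done() -> blk_complete_request() */
static void complete_request(struct request *rq)
{
        if (!mark_rq_complete(rq))
                printf("completion path owns the request\n");
}

/* timeout side: fires error handling only if completion hasn't claimed it */
static void timeout_request(struct request *rq)
{
        if (!mark_rq_complete(rq))
                printf("timeout path owns the request\n");
}

int main(void)
{
        struct request rq = { ATOMIC_FLAG_INIT };       /* not yet complete */

        complete_request(&rq);          /* claims the flag */
        timeout_request(&rq);           /* loses the race, does nothing */
        return 0;
}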

Revision history for this message
Ming Lei (tom-leiming) wrote :

The night before yesterday I ran stress-ng overnight, and it looks like it didn't crash on mcdivitt.

Yesterday, I found that my system had been upgraded directly from trusty to vivid and the 'systemd' package wasn't installed, so 'systemd-timesyncd' couldn't be found. I installed the package and made sure systemd-timesyncd was installed, because that process was reported twice in Colin's reports.

Then last night, I enabled NTP on the server with 'sudo timedatectl set-ntp true' before running the test, but it still didn't crash; see the attached logs.
