ubuntu_vfat_stress failed on Bionic Power8 when running with the whole test suite

Bug #1801907 reported by Po-Hsu Lin
This bug affects 1 person
Affects: ubuntu-kernel-tests
Status: Fix Released
Importance: High
Assigned to: Colin Ian King
Milestone: none

Bug Description

The kernel logs hung-task traces (a thread blocked for more than 120 seconds) when this test is run together with the rest of the test suite (sru-1):

11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098728] INFO: task stress-ng:65079 blocked for more than 120 seconds.
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098780] Not tainted 4.15.0-39-generic #42-Ubuntu
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098822] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098884] stress-ng D 0 65079 65035 0x00040002
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098886] Call Trace:
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098888] [c00000076e2736e0] [c00000076e273780] 0xc00000076e273780 (unreliable)
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098891] [c00000076e2738b0] [c00000000001c1d0] __switch_to+0x2a0/0x4d0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098894] [c00000076e273910] [c000000000cfd2e4] __schedule+0x2a4/0xaf0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098897] [c00000076e2739e0] [c000000000cfdb70] schedule+0x40/0xc0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098900] [c00000076e273a00] [c000000000152e7c] io_schedule+0x2c/0x50
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098902] [c00000076e273a30] [c0000000006d1e04] wbt_wait+0x484/0x4e0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098905] [c00000076e273ad0] [c000000000698e74] blk_mq_make_request+0x104/0x6e0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098908] [c00000076e273b70] [c000000000686a34] generic_make_request+0x124/0x380
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098911] [c00000076e273be0] [c000000000686d4c] submit_bio+0xbc/0x1d0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098914] [c00000076e273c90] [c000000000676f74] submit_bio_wait+0x64/0xa0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098916] [c00000076e273ce0] [c00000000068c374] blkdev_issue_flush+0xb4/0x110
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098919] [c00000076e273d20] [c000000000416dac] generic_file_fsync+0x4c/0x70
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098922] [c00000076e273d50] [c0000000005490f0] fat_file_fsync+0x30/0x80
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098925] [c00000076e273d80] [c0000000004263a8] vfs_fsync_range+0x78/0x170
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098927] [c00000076e273dd0] [c000000000426528] do_fsync+0x58/0xd0
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098930] [c00000076e273e10] [c000000000426918] SyS_fsync+0x28/0x40
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098933] [c00000076e273e30] [c00000000000b284]

The test eventually fails.

You will find the complete test result for sru-1 test suite here: https://pastebin.ubuntu.com/p/gB77bJFCnZ/

This failure cannot be reproduced when the test is run manually on the SUT.
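
When triaging a full suite log, it can help to pull out just the hung-task reports. The following is a minimal sketch, not part of the test suite: the function name `find_hung_tasks` and the regular expression are assumptions based only on the message format shown in the traces in this report.

```python
import re

# Matches the kernel's hung-task report line as it appears in this report, e.g.
# "INFO: task stress-ng:65079 blocked for more than 120 seconds."
# The task name may itself contain colons (kworker/u321:0), so the greedy
# match on "comm" leaves only the final ":<digits>" for the PID.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text):
    """Return a (comm, pid, seconds) tuple for every hung-task line in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            reports.append(
                (match["comm"], int(match["pid"]), int(match["secs"]))
            )
    return reports


# Two lines copied from the logs in this report.
sample = """\
11/05 13:43:34 DEBUG| utils:0153| [stdout] [ 5559.098728] INFO: task stress-ng:65079 blocked for more than 120 seconds.
 kernel: [25133.553427] INFO: task kworker/u321:0:72730 blocked for more than 120 seconds.
"""

for comm, pid, secs in find_hung_tasks(sample):
    print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

The same function can be pointed at saved `dmesg` output or at the autotest debug log; it only relies on the "blocked for more than" line, not on the call trace that follows it.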

Tags: bionic ppc64el
Po-Hsu Lin (cypressyew)
Changed in stress-ng:
assignee: nobody → Colin Ian King (colin-king)
Changed in stress-ng:
importance: Undecided → Medium
status: New → In Progress
Po-Hsu Lin (cypressyew) wrote :

Manually started the sru-1 test suite on a Power8 box:

 stress-ng: invoked with '/home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-n' by user 0
' Linux 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:41:45 UTC 2018 ppc64le
 stress-ng: memory (MB): total 130597.19, free 128099.44, shared 29.00, buffer 6.38, swap 8191.94, free swap 8191.94
 stress-ng: info: [157795] dispatching hogs: 2 hdd, 2 lockf, 2 seek, 2 aio, 2 dentry, 2 dir, 2 fallocate, 2 fstat, 2 lease, 2 open, 2 rename, 2 chdir, 2 rename
 stress-ng: info: [157795] cache allocate: using built-in defaults as unable to determine cache details
 stress-ng: fail: [157832] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
 stress-ng: fail: [157819] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
 stress-ng: fail: [157806] stress-ng-hdd: read failed, errno=28 (No space left on device)
 stress-ng: fail: [157821] stress-ng-hdd: read failed, errno=28 (No space left on device)
 ELOG[5039]: LID[50d02eb8]::SRC[B1763321]::Other Subsystems::Informational Event::No service action required
 kernel: [25133.553427] INFO: task kworker/u321:0:72730 blocked for more than 120 seconds.
 kernel: [25133.553441] Not tainted 4.15.0-39-generic #42-Ubuntu
 kernel: [25133.553446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 kernel: [25133.553451] kworker/u321:0 D 0 72730 2 0x00000800
 kernel: [25133.553462] Workqueue: writeback wb_workfn (flush-7:0)
 kernel: [25133.553465] Call Trace:
 kernel: [25133.553469] [c0000000039cea90] [c0000000039ceb00] 0xc0000000039ceb00 (unreliable)
 kernel: [25133.553474] [c0000000039cec60] [c00000000001c1d0] __switch_to+0x2a0/0x4d0
 kernel: [25133.553479] [c0000000039cecc0] [c000000000cfd2e4] __schedule+0x2a4/0xaf0
 kernel: [25133.553483] [c0000000039ced90] [c000000000cfdb70] schedule+0x40/0xc0
 kernel: [25133.553487] [c0000000039cedb0] [c000000000152e7c] io_schedule+0x2c/0x50
 kernel: [25133.553491] [c0000000039cede0] [c0000000006d1e04] wbt_wait+0x484/0x4e0
 kernel: [25133.553494] [c0000000039cee80] [c000000000698e74] blk_mq_make_request+0x104/0x6e0
 kernel: [25133.553498] [c0000000039cef20] [c000000000686a34] generic_make_request+0x124/0x380
 kernel: [25133.553501] [c0000000039cef90] [c000000000686d4c] submit_bio+0xbc/0x1d0
 kernel: [25133.553504] [c0000000039cf040] [c00000000042de2c] submit_bh_wbc+0x1dc/0x240
 kernel: [25133.553507] [c0000000039cf090] [c00000000042e108] __block_write_full_page+0x278/0x570
 kernel: [25133.553510] [c0000000039cf130] [c00000000054ac6c] fat_writepage+0x2c/0x40
 kernel: [25133.553513] [c0000000039cf150] [c00000000043bdf0] __mpage_writepage+0x170/0x7b0
 kernel: [25133.553517] [c0000000039cf680] [c0000000002f2d9c] write_cache_pages+0x25c/0x590
 kernel: [25133.553520] [c0000000039cf7c0] [c00000000043bac8] mpage_writepages+0x78/0x150
 kernel: [25133.553522] [c0000000039cf850] [c00000000054abe8] fat_writepages+0x28/0x40
 kernel: [25133.553525] [c0000000039cf870] [c0000000002f5f1c] do_writepages+0x4c/0x130
 kernel: [25133.553527] [c0000000039cf8e0] [c00000000041eff0] __writeback_single_inode+0x70/0x570
 kernel: [25133.553530] [c0000000039cf940] [c000000000...


tags: added: bionic ppc64el
Colin Ian King (colin-king) wrote :

Note that the vfat test is running on a loop-back mounted file system:

/dev/loop0 1022M 492M 531M 49% /mnt/vfat-test-80992

How much memory does this system have? Actually, can I get access to it? I've not been able to reproduce this inside an x86-hosted VM.

Colin
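
The `errno=28` failures in the earlier log are ENOSPC on Linux, which fits a loop-back file system of roughly 1 GB filling up under the stressors. A minimal sketch for checking this from a script follows; the helper names `describe_errno` and `free_fraction` are hypothetical, and the path passed in is an assumption (the example uses the current directory so that it runs anywhere, where the real test would use the vfat mount point).

```python
import errno
import os
import shutil


def describe_errno(code):
    """Return the symbolic name and message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def free_fraction(path):
    """Return the fraction of the file system holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


# On Linux, errno.ENOSPC is 28, the value reported by stress-ng above.
print(describe_errno(errno.ENOSPC))

# Hypothetical usage: replace "." with the vfat mount point under test.
print(f"free: {free_fraction('.'):.0%}")
```

Logging the free fraction before and after the stress run would show whether the ENOSPC errors line up with the loop device being full.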

Po-Hsu Lin (cypressyew) wrote :

For the memory:

modoc: 128GB
entei: 32GB

Colin Ian King (colin-king) wrote :
Changed in stress-ng:
status: In Progress → Invalid
Changed in ubuntu-kernel-tests:
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Colin Ian King (colin-king)
Changed in stress-ng:
assignee: Colin Ian King (colin-king) → nobody
no longer affects: stress-ng
Colin Ian King (colin-king) wrote :

@Sam, I believe this should address the issue. Do you mind re-running the tests to see if it locks up now?

Changed in ubuntu-kernel-tests:
status: Fix Committed → Fix Released
Po-Hsu Lin (cypressyew) wrote :

Hi Colin,

I found that P8 entei and P9 blater are failing like this (bug 1805778).
Since modoc is good, I decided to open another bug report.
