ubuntu_stress_smoke_test interrupted with dev test on Trusty

Bug #1880090 reported by Po-Hsu Lin
This bug affects 1 person
Affects              Status     Importance  Assigned to     Milestone
Stress-ng            Won't Fix  Undecided   Unassigned
ubuntu-kernel-tests  Won't Fix  High        Colin Ian King

Bug Description

Issue found on node "kili" with Trusty kernel.

The ubuntu_stress_smoke_test times out during the dev test (Test suite HEAD SHA1: c352fe6):

Running '/home/ubuntu/autotest/client/tests/ubuntu_stress_smoke_test/ubuntu_stress_smoke_test.sh'
Free memory: 62745 MB
Memory used: 56471 MB
kili: x86_64 64322 MB memory, 860 GB disk
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.650831 s, 1.6 GB/s
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=7e7e36c0-24e3-44ea-a359-9fab0a865274

Machine Configuration
Physical Pages: 16466650
Pages available: 15794777
Page Size: 4096
Zswap enabled: N

Free memory:
             total       used       free     shared    buffers     cached
Mem:      65866600    2687244   63179356       1252      33992    1676344
-/+ buffers/cache:     976908   64889692
Swap:      9437176          0    9437176

Number of CPUs: 56
Number of CPUs Online: 56

access STARTING
access RETURNED 0
access PASSED
af-alg STARTING
af-alg RETURNED 0
af-alg PASSED
affinity STARTING
affinity RETURNED 0
affinity PASSED
aio STARTING
aio RETURNED 0
aio PASSED
aiol STARTING
aiol RETURNED 0
aiol PASSED
bad-altstack STARTING
bad-altstack RETURNED 0
bad-altstack PASSED
bigheap STARTING
bigheap RETURNED 0
bigheap PASSED
binderfs STARTING
binderfs RETURNED 0
binderfs PASSED
branch STARTING
branch RETURNED 0
branch PASSED
brk STARTING
brk RETURNED 0
brk PASSED
cache STARTING
cache RETURNED 0
cache PASSED
cap STARTING
cap RETURNED 0
cap PASSED
chattr STARTING
chattr RETURNED 0
chattr PASSED
chdir STARTING
chdir RETURNED 0
chdir PASSED
chmod STARTING
chmod RETURNED 0
chmod PASSED
chown STARTING
chown RETURNED 0
chown PASSED
chroot STARTING
chroot RETURNED 0
chroot PASSED
clock STARTING
clock RETURNED 0
clock PASSED
close STARTING
close RETURNED 0
close PASSED
context STARTING
context RETURNED 0
context PASSED
cpu STARTING
cpu RETURNED 0
cpu PASSED
crypt STARTING
crypt RETURNED 0
crypt PASSED
cyclic STARTING
cyclic RETURNED 0
cyclic PASSED
daemon STARTING
daemon RETURNED 0
daemon PASSED
dccp STARTING
dccp RETURNED 0
dccp PASSED
dentry STARTING
dentry RETURNED 0
dentry PASSED
dev STARTING
Timer expired (2100 sec.), nuking pid 15053
ERROR ubuntu_stress_smoke_test.stress-smoke-test ubuntu_stress_smoke_test.stress-smoke-test timestamp=1590087528 localtime=May 21 18:58:48 Test timeout expired, rc=15
END ERROR ubuntu_stress_smoke_test.stress-smoke-test ubuntu_stress_smoke_test.stress-smoke-test timestamp=1590087528 localtime=May 21 18:58:48

Po-Hsu Lin (cypressyew)
description: updated
Po-Hsu Lin (cypressyew)
tags: added: amd64 sru-20200518 trusty ubuntu-stress-smoke-test
Colin Ian King (colin-king) wrote :

This is a kernel bug for sure. If I can get some info on that machine I'll try to debug the kernel when I get back to work on Tuesday.

Colin Ian King (colin-king) wrote :

It may be worth testing with a previous kernel to see if this is a regression or not. I suspect it's not a regression but something existing that stress-ng found.

Po-Hsu Lin (cypressyew) wrote :

This can also be reproduced on the Trusty kernel in -updates (3.13.0-170-generic).

I have sent you the access information.

Po-Hsu Lin (cypressyew) wrote :

From syslog:
 kili kernel: [ 842.781641] INFO: task stress-ng-dev:120415 blocked for more than 120 seconds.
 kili kernel: [ 842.781680] Tainted: G I 3.13.0-170-generic #220-Ubuntu
 kili kernel: [ 842.781703] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 kili kernel: [ 842.781733] stress-ng-dev D ffff88085f513b80 0 120415 120411 0x00000004
 kili kernel: [ 842.781742] ffff880035e01cf0 0000000000000082 ffff8807bf75e000 0000000000013b80
 kili kernel: [ 842.781752] ffff880035e01fd8 0000000000013b80 ffff8807bf75e000 ffff88104e0c4c28
 kili kernel: [ 842.781761] fffffff200000002 ffff88104e0c4c30 ffff8807bf75e000 7fffffffffffffff
 kili kernel: [ 842.781769] Call Trace:
 kili kernel: [ 842.781786] [<ffffffff81740739>] schedule+0x29/0x70
 kili kernel: [ 842.781792] [<ffffffff8173f9f9>] schedule_timeout+0x279/0x310
 kili kernel: [ 842.781807] [<ffffffff8132383d>] ? apparmor_capable+0x1d/0x130
 kili kernel: [ 842.781817] [<ffffffff81743b68>] ldsem_down_read+0x108/0x280
 kili kernel: [ 842.781832] [<ffffffff814650a0>] tty_ldisc_ref_wait+0x20/0x50
 kili kernel: [ 842.781839] [<ffffffff8145ee08>] tty_ioctl+0x6e8/0xbf0
 kili kernel: [ 842.781853] [<ffffffff81020257>] ? __restore_xstate_sig+0x87/0x500
 kili kernel: [ 842.781862] [<ffffffff810957e5>] ? enqueue_hrtimer+0x25/0xa0
 kili kernel: [ 842.781874] [<ffffffff811dc7f3>] do_vfs_ioctl+0x2e3/0x4d0
 kili kernel: [ 842.781884] [<ffffffff81082949>] ? restore_altstack+0x19/0x30
 kili kernel: [ 842.781891] [<ffffffff811dca61>] SyS_ioctl+0x81/0xa0
 kili kernel: [ 842.781902] [<ffffffff8174d5c9>] system_call_fastpath+0x26/0x2b
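
For triaging traces like the one above, a small parser can pull out the blocked task and the chain of functions it is sleeping in. This is only a sketch: the function name `parse_hung_tasks` and the regular expressions are illustrative assumptions, not part of the test suite, and they only target the line format shown in this syslog excerpt. Frames prefixed with "?" are skipped because the kernel marks those as unreliable.

```python
import re

# Header line emitted by the kernel's hung-task detector.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# One stack frame, e.g. "[<ffffffff81740739>] schedule+0x29/0x70".
# A "? " before the symbol marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(lines):
    """Return one dict per hung-task report with its reliable stack frames."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame["unreliable"]:
            current["frames"].append(frame["sym"])
    return reports


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python3 hung_tasks.py < /var/log/syslog
    for report in parse_hung_tasks(sys.stdin):
        print(report["comm"], report["pid"], " <- ".join(report["frames"]))
```

Applied to the excerpt above, the reliable frames read from the sleep point outward: schedule, schedule_timeout, ldsem_down_read, tty_ldisc_ref_wait, tty_ioctl, which points at an ioctl on a tty device waiting for the line discipline lock.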

Po-Hsu Lin (cypressyew)
tags: added: kqa
tags: added: kqa-blocker
removed: kqa
Changed in ubuntu-kernel-tests:
importance: Undecided → High
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
Colin Ian King (colin-king) wrote :

If I can get access to the machine I will try to debug this. I can't reproduce this in a VM, so I think it is device specific. The /dev stressor hanging normally means it's a broken device driver.
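
One rough way to narrow a hang down to a single device node is to touch each candidate from a throwaway child process and give up on it after a timeout. The sketch below is an assumption about how that could be done, not what stress-ng itself does: `probe` is a hypothetical helper, and it only exercises open(), whereas the trace in this bug shows the hang inside a tty ioctl, so a clean result here would not rule a driver out. Opening some nodes has side effects (watchdog devices, for example), so the list of paths should be chosen by hand.

```python
import subprocess
import sys

# Code run in the child: open the node read-only and non-blocking, then close it.
_CHILD = "import os, sys; os.close(os.open(sys.argv[1], os.O_RDONLY | os.O_NONBLOCK))"


def probe(path, timeout=5.0):
    """Return 'ok', 'error' or 'hung' for a single device node."""
    child = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # kill() has no effect while the child is in uninterruptible (D) sleep,
        # so do not wait for it again; just report the node and move on.
        child.kill()
        return "hung"
    return "ok" if rc == 0 else "error"


if __name__ == "__main__":
    # Usage (hypothetical): python3 probe_dev.py /dev/ttyS0 /dev/ttyS1
    for node in sys.argv[1:]:
        print(node, probe(node))
```

Running each probe in its own process keeps the parent alive even if a driver leaves the child stuck in the kernel.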

Colin Ian King (colin-king) wrote :

18+ hours of continuous soak testing could not trigger this issue. Re-testing with regression tests didn't re-trigger this.

Marking this as Won't Fix for now. If this occurs again I'll try and debug it further.

Changed in ubuntu-kernel-tests:
status: In Progress → Won't Fix
Changed in stress-ng:
status: New → Won't Fix