mm:cpuset01 from ubuntu_ltp flaky on scobee-kernel with J-realtime (warning found in dmesg)
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
ubuntu-kernel-tests | New | Undecided | Unassigned | |
Bug Description
It seems we have not run this test on scobee-kernel with J-realtime before, so it is difficult to tell whether this is caused by the recent LTP fork update [1].
Test failed with timeout:
INFO: Test start time: Thu Dec 21 09:12:45 UTC 2023
COMMAND: /opt/ltp/
LOG File: /dev/null
FAILED COMMAND File: /dev/null
TCONF COMMAND File: /dev/null
Running tests.......
tst_test.c:1690: TINFO: LTP version: 20230929-
tst_test.c:1574: TINFO: Timeout per run is 0h 00m 30s
Test timeouted, sending SIGKILL!
tst_test.c:1622: TINFO: Killed the leftover descendant processes
tst_test.c:1628: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1630: TBROK: Test killed! (timeout?)
Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
INFO: ltp-pan reported some tests FAIL
LTP Version: 20230929-
INFO: Test end time: Thu Dec 21 09:13:15 UTC 2023
The test also appears to trigger a kernel warning in dmesg on this system, even when the test itself passes:
[ 165.551988] ------------[ cut here ]------------
[ 165.552018] WARNING: CPU: 0 PID: 15 at kernel/
[ 165.552083] Modules linked in: binfmt_misc nls_iso8859_1 ipmi_ssif arm_spe_pmu acpi_ipmi hisi_zip ipmi_si hns_roce_hw_v2 hisi_sec2 hisi_hpre ecdh_generic libcurve25519_
[ 165.552555] aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 165.552595] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 5.15.0-
[ 165.552614] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B160.01 01/15/2020
[ 165.552624] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 165.552641] pc : set_task_
[ 165.552660] lr : detach_
[ 165.552684] sp : ffff80000860ba20
[ 165.552692] x29: ffff80000860ba20 x28: ffff2020075ac300 x27: ffffa9913540a928
[ 165.552718] x26: ffffa99134c794c0 x25: ffffa99134c794c0 x24: ffff003f7fbd9fb0
[ 165.552739] x23: 0000000000000001 x22: ffff003f7fbd94c0 x21: ffffa99135407a18
[ 165.552761] x20: 000000000000000d x19: ffff2020075ac300 x18: 0000000000000000
[ 165.552784] x17: ffff56ae4adda000 x16: ffffa99133a3e780 x15: 00003d094ed85380
[ 165.552805] x14: ffffa9913543a5a8 x13: ffffa9913543a078 x12: 000000000000000d
[ 165.552828] x11: 0000000000000004 x10: ffffa99135407b50 x9 : ffffa99132e30dd8
[ 165.552846] x8 : 000000000000000d x7 : ffffffffffffe000 x6 : 0000000000000314
[ 165.552866] x5 : 0000000000532ae2 x4 : 0000000000000001 x3 : 000000000000b67e
[ 165.552886] x2 : 0000000000000000 x1 : ffffa991344399b8 x0 : 0000000000000001
[ 165.552908] Call trace:
[ 165.552915] set_task_
[ 165.552933] detach_
[ 165.552948] load_balance+
[ 165.552967] rebalance_
[ 165.552984] _nohz_idle_
[ 165.553004] run_rebalance_
[ 165.553022] __do_softirq+
[ 165.553035] run_ksoftirqd+
[ 165.553052] smpboot_
[ 165.553072] kthread+0x158/0x16c
[ 165.553092] ret_from_
[ 165.553117] ---[ end trace 0000000000000002 ]---
I tried running this test manually on scobee-kernel and found its runtime to be flaky. In some attempts the test finishes within 10 seconds, but in others it takes up to 90 seconds, well beyond the 30-second per-run timeout shown in the log above.
Bumping the timeout multiplier (LTP_TIMEOUT_MUL) may be a possible solution.
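As a rough sketch of what bumping the multiplier could look like: assuming the 30-second per-run timeout from the log and the roughly 90-second worst case seen in the manual runs, the multiplier needs to be at least 3, and one extra step of headroom gives 4. The variable names `base_timeout`, `worst_case`, and `mul` are only illustrative, and the headroom of one is an arbitrary choice, not a tested value.

```shell
# Assumed inputs: 30 s per-run timeout (from the LTP log) and
# ~90 s worst-case runtime (from the manual runs on scobee-kernel).
base_timeout=30
worst_case=90

# Round up worst_case / base_timeout, then add 1 as headroom (arbitrary margin).
mul=$(( (worst_case + base_timeout - 1) / base_timeout + 1 ))

# LTP reads this variable, as hinted by the tst_test.c TINFO message.
export LTP_TIMEOUT_MUL="$mul"
echo "LTP_TIMEOUT_MUL=$LTP_TIMEOUT_MUL"
```

With the variable exported, re-running the test in the same shell should report a correspondingly longer "Timeout per run" in the TINFO line.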
[1] https:/