Activity log for bug #2047694

Date Who What changed Old value New value Message
2023-12-29 12:04:20 Po-Hsu Lin bug added bug
2023-12-29 12:11:03 Po-Hsu Lin description

Old value:

It seems we didn't run this test on scobee-kernel with J-realtime before, so it's a bit difficult to determine if this is caused by the recent LTP fork update [1].

Test failed with timeout:
INFO: Test start time: Thu Dec 21 09:12:45 UTC 2023
COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 244473 -n 244473 -f /tmp/ltp-SeaoDkJ1R1/alltests -l /dev/null -C /dev/null -T /dev/null
LOG File: /dev/null
FAILED COMMAND File: /dev/null
TCONF COMMAND File: /dev/null
Running tests.......
tst_test.c:1690: TINFO: LTP version: 20230929-185-g19ef6521d
tst_test.c:1574: TINFO: Timeout per run is 0h 00m 30s
Test timeouted, sending SIGKILL!
tst_test.c:1622: TINFO: Killed the leftover descendant processes
tst_test.c:1628: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1630: TBROK: Test killed! (timeout?)

Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
INFO: ltp-pan reported some tests FAIL
LTP Version: 20230929-185-g19ef6521d
INFO: Test end time: Thu Dec 21 09:13:15 UTC 2023

I tried to test this manually on scobee-kernel, but I found this is a bit flaky. In some attempts this test can finish with 10 seconds, but sometimes it will take up to 90 seconds. Maybe bumping the timeout multiplier can be a possible solution.

[1] https://lists.ubuntu.com/archives/kernel-team/2023-December/147590.html

New value:

It seems we didn't run this test on scobee-kernel with J-realtime before, so it's a bit difficult to determine if this is caused by the recent LTP fork update [1].

Test failed with timeout:
INFO: Test start time: Thu Dec 21 09:12:45 UTC 2023
COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 244473 -n 244473 -f /tmp/ltp-SeaoDkJ1R1/alltests -l /dev/null -C /dev/null -T /dev/null
LOG File: /dev/null
FAILED COMMAND File: /dev/null
TCONF COMMAND File: /dev/null
Running tests.......
tst_test.c:1690: TINFO: LTP version: 20230929-185-g19ef6521d
tst_test.c:1574: TINFO: Timeout per run is 0h 00m 30s
Test timeouted, sending SIGKILL!
tst_test.c:1622: TINFO: Killed the leftover descendant processes
tst_test.c:1628: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1630: TBROK: Test killed! (timeout?)

Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
INFO: ltp-pan reported some tests FAIL
LTP Version: 20230929-185-g19ef6521d
INFO: Test end time: Thu Dec 21 09:13:15 UTC 2023

And it looks like this test will trigger a warning on this system even if the test has passed:
[ 165.551988] ------------[ cut here ]------------
[ 165.552018] WARNING: CPU: 0 PID: 15 at kernel/sched/core.c:3109 set_task_cpu+0x168/0x244
[ 165.552083] Modules linked in: binfmt_misc nls_iso8859_1 ipmi_ssif arm_spe_pmu acpi_ipmi hisi_zip ipmi_si hns_roce_hw_v2 hisi_sec2 hisi_hpre ecdh_generic libcurve25519_generic ipmi_devintf ecc hisi_qm ipmi_msghandler authenc uacce hisi_uncore_l3c_pmu hisi_uncore_hha_pmu hisi_uncore_ddrc_pmu hisi_trng_v2 hisi_uncore_pmu cppc_cpufreq sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure mlx5_ib ib_uverbs ib_core hibmc_drm drm_vram_helper drm_ttm_helper ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect mlx5_core sysimgblt fb_sys_fops cec rc_core mlxfw realtek crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce hisi_sas_v3_hw hns3 psample hisi_sas_main hclge tls xhci_pci libsas drm hnae3 xhci_pci_renesas ahci scsi_transport_sas spi_dw_mmio spi_dw gpio_dwapb
[ 165.552555] aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 165.552595] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 5.15.0-1052-realtime #58-Ubuntu
[ 165.552614] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B160.01 01/15/2020
[ 165.552624] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 165.552641] pc : set_task_cpu+0x168/0x244
[ 165.552660] lr : detach_tasks+0x138/0x4b0
[ 165.552684] sp : ffff80000860ba20
[ 165.552692] x29: ffff80000860ba20 x28: ffff2020075ac300 x27: ffffa9913540a928
[ 165.552718] x26: ffffa99134c794c0 x25: ffffa99134c794c0 x24: ffff003f7fbd9fb0
[ 165.552739] x23: 0000000000000001 x22: ffff003f7fbd94c0 x21: ffffa99135407a18
[ 165.552761] x20: 000000000000000d x19: ffff2020075ac300 x18: 0000000000000000
[ 165.552784] x17: ffff56ae4adda000 x16: ffffa99133a3e780 x15: 00003d094ed85380
[ 165.552805] x14: ffffa9913543a5a8 x13: ffffa9913543a078 x12: 000000000000000d
[ 165.552828] x11: 0000000000000004 x10: ffffa99135407b50 x9 : ffffa99132e30dd8
[ 165.552846] x8 : 000000000000000d x7 : ffffffffffffe000 x6 : 0000000000000314
[ 165.552866] x5 : 0000000000532ae2 x4 : 0000000000000001 x3 : 000000000000b67e
[ 165.552886] x2 : 0000000000000000 x1 : ffffa991344399b8 x0 : 0000000000000001
[ 165.552908] Call trace:
[ 165.552915] set_task_cpu+0x168/0x244
[ 165.552933] detach_tasks+0x138/0x4b0
[ 165.552948] load_balance+0x260/0x834
[ 165.552967] rebalance_domains+0x280/0x3f4
[ 165.552984] _nohz_idle_balance.constprop.0.isra.0+0x1ec/0x34c
[ 165.553004] run_rebalance_domains+0x84/0xb0
[ 165.553022] __do_softirq+0x170/0x468
[ 165.553035] run_ksoftirqd+0x80/0x150
[ 165.553052] smpboot_thread_fn+0x260/0x2e4
[ 165.553072] kthread+0x158/0x16c
[ 165.553092] ret_from_fork+0x10/0x20
[ 165.553117] ---[ end trace 0000000000000002 ]---

I tried to test this manually on scobee-kernel, but I found this is a bit flaky. In some attempts this test can finish with 10 seconds, but sometimes it will take up to 90 seconds. Maybe bumping the timeout multiplier can be a possible solution.

[1] https://lists.ubuntu.com/archives/kernel-team/2023-December/147590.html
2023-12-29 12:11:46 Po-Hsu Lin summary

Old value:

mm:cpuset01 from ubuntu_ltp flaky on scobee-kernel with J-realtime

New value:

mm:cpuset01 from ubuntu_ltp flaky on scobee-kernel with J-realtime (warning found in dmesg)
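
The description suggests bumping the timeout multiplier. As a rough sketch of how that value might be sized from the numbers reported above (a 30 second timeout per run and a worst observed runtime of about 90 seconds), the helper below picks the smallest whole multiplier that covers the worst case. The function name and the headroom factor are illustrative assumptions, not part of LTP, and the sketch assumes the effective timeout scales linearly with LTP_TIMEOUT_MUL.

```python
import math


def min_timeout_multiplier(base_timeout_s, worst_runtime_s, headroom=1.0):
    """Return the smallest whole multiplier such that
    base_timeout_s * multiplier >= worst_runtime_s * headroom.

    Hypothetical helper for illustration only; assumes the effective
    timeout is base_timeout_s multiplied by LTP_TIMEOUT_MUL.
    """
    if base_timeout_s <= 0:
        raise ValueError("base_timeout_s must be positive")
    needed = worst_runtime_s * headroom / base_timeout_s
    # Never go below 1, which corresponds to the default timeout.
    return max(1, math.ceil(needed))


if __name__ == "__main__":
    base = 30    # "Timeout per run is 0h 00m 30s" from the log
    worst = 90   # slowest manual run reported in the description
    # 1.5 is an arbitrary safety margin chosen for this example.
    mul = min_timeout_multiplier(base, worst, headroom=1.5)
    print(f"export LTP_TIMEOUT_MUL={mul}")
```

With no headroom, the reported numbers sit exactly at a multiplier of 3, so a run slightly slower than 90 seconds would still time out; that is why the example adds a margin.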