memcg_regression_test in ubuntu_ltp_controllers cause soft lockup on Google g1-small with J-gkeop

Bug #2030709 reported by Po-Hsu Lin
Affects              Status  Importance  Assigned to  Milestone
ubuntu-kernel-tests  New     Undecided   Unassigned

Bug Description

The issue was found with 5.15.0-1025.30 and can also be reproduced with 5.15.0-1024-gkeop, but only on the Google g1-small instance type. It does not occur with J-gcp or J-gke.

Both the test and the system hang at the 4th test case of this test.

Test output:
COMMAND: /opt/ltp/bin/ltp-pan -e -S -a 1345 -n 1345 -p -f /tmp/ltp-8BAdmWLWz8/alltests -l /opt/ltp/results/LTP_RUN_ON-2023_08_08-05h_35m_08s.log -C /opt/ltp/output/LTP_RUN_ON-2023_08_08-05h_35m_08s.failed -T /opt/ltp/output/LTP_RUN_ON-2023_08_08-05h_35m_08s.tconf
LOG File: /opt/ltp/results/LTP_RUN_ON-2023_08_08-05h_35m_08s.log
FAILED COMMAND File: /opt/ltp/output/LTP_RUN_ON-2023_08_08-05h_35m_08s.failed
TCONF COMMAND File: /opt/ltp/output/LTP_RUN_ON-2023_08_08-05h_35m_08s.tconf
Running tests.......
<<<test_start>>>
tag=memcg_regression stime=1691472909
cmdline="memcg_regression_test.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
memcg_regression_test 1 TINFO: timeout per run is 0h 5m 0s
memcg_regression_test 1 TINFO: test starts with cgroup version 2
memcg_regression_test 1 TPASS: no kernel bug was found
memcg_regression_test 2 TCONF: Cgroup v2 found, skipping test
memcg_regression_test 3 TPASS: no kernel bug was found

dmesg output from the console (the ssh session cannot get this far):
[ 296.923589] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [memcg_test_4.sh:1983]
[ 324.923599] watchdog: BUG: soft lockup - CPU#0 stuck for 52s! [memcg_test_4.sh:1983]
[ 352.923640] watchdog: BUG: soft lockup - CPU#0 stuck for 78s! [memcg_test_4.sh:1983]
[ 380.923622] watchdog: BUG: soft lockup - CPU#0 stuck for 104s! [memcg_test_4.sh:1983]
[ 408.923634] watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [memcg_test_4.sh:1983]
[ 436.923645] watchdog: BUG: soft lockup - CPU#0 stuck for 156s! [memcg_test_4.sh:1983]
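
To check that these messages describe one continuous lockup rather than several separate ones, the console lines can be parsed and compared. The following is a minimal sketch, not part of the original test run: the helper names parse_soft_lockups and summarize are hypothetical, and the regular expression assumes the exact message format shown above.

```python
import re

# The six soft lockup lines copied verbatim from the console output above.
DMESG = """\
[ 296.923589] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [memcg_test_4.sh:1983]
[ 324.923599] watchdog: BUG: soft lockup - CPU#0 stuck for 52s! [memcg_test_4.sh:1983]
[ 352.923640] watchdog: BUG: soft lockup - CPU#0 stuck for 78s! [memcg_test_4.sh:1983]
[ 380.923622] watchdog: BUG: soft lockup - CPU#0 stuck for 104s! [memcg_test_4.sh:1983]
[ 408.923634] watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [memcg_test_4.sh:1983]
[ 436.923645] watchdog: BUG: soft lockup - CPU#0 stuck for 156s! [memcg_test_4.sh:1983]
"""

# Assumed format: "[ <timestamp>] watchdog: BUG: soft lockup - CPU#<n> stuck for <s>s! [<comm>:<pid>]"
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<stuck>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(text):
    """Return one dict per soft lockup line found in text (hypothetical helper)."""
    events = []
    for line in text.splitlines():
        m = PATTERN.search(line)
        if m:
            events.append({
                "ts": float(m["ts"]),        # kernel timestamp in seconds
                "cpu": int(m["cpu"]),        # CPU reported as stuck
                "stuck": int(m["stuck"]),    # seconds the CPU has been stuck
                "comm": m["comm"],           # task name
                "pid": int(m["pid"]),        # task PID
            })
    return events


def summarize(events):
    """Collect the distinct (cpu, comm, pid) tuples and the gaps between reports."""
    tasks = {(e["cpu"], e["comm"], e["pid"]) for e in events}
    gaps = [round(b["ts"] - a["ts"]) for a, b in zip(events, events[1:])]
    max_stuck = max((e["stuck"] for e in events), default=0)
    return {"tasks": tasks, "gaps": gaps, "max_stuck": max_stuck}


if __name__ == "__main__":
    print(summarize(parse_soft_lockups(DMESG)))
```

On the six lines above, this reports a single task (memcg_test_4.sh, PID 1983, CPU#0), a 28-second gap between consecutive reports, and a maximum stuck time of 156s, which is consistent with one task holding the CPU for the whole period.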

Po-Hsu Lin (cypressyew) wrote:

According to the test history, this issue dates all the way back to 5.15.0-1005.7; 5.15.0-1004.6 is good.

However, I cannot find either -1005 or -1004 in our apt archive.

tags: added: sru-20230710
Po-Hsu Lin (cypressyew)
summary: - memcg_regression_test in ubuntu_ltp_controllers cause softlockup on
+ memcg_regression_test in ubuntu_ltp_controllers cause soft lockup on
Google g1-small with J-gkeop
Po-Hsu Lin (cypressyew)
description: updated