I stepped back from the bisect/debugging and looked at the higher-level stats. The stress-ng test is started with one process for each core, and there are 96 of them. I looked at top[3] during a hang, and many of the stress-ng processes are in the running state ('R'). However, sysrq-w[2] also shows many stress-ng processes in 'D' (uninterruptible sleep). What also sticks out to me is that all the stress-ng processes are running as root with a priority of 20. Looking back at one of the call traces[1], I see jbd2 stuck in an uninterruptible state:
...
[ 4461.908213] task:journal-offline state:D stack: 0 pid:17541 ppid: 1 flags:0x00000226
...
The jbd2 kernel thread is also running with a priority of 20[4]. When the hang happens, jbd2 is also stuck in an uninterruptible state (as is systemd-journal):
...
1521 root 20  0     0     0     0 D 0.0 0.0 4:10.48 jbd2/sda2-8
1593 root 19 -1 64692 15832 14512 D 0.0 0.1 0:01.54 systemd-journal
...
I am pinning all of the stress-ng threads to cores 1-95 and the kernel threads to a housekeeping cpu, 0.
However, even with this pinning, stress-ng ends up running on cpu 0, per the ps output[4]. This appears to be causing a deadlock between jbd2 and the stress-ng processes, since they share the same priority/niceness.
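As a rough aid for checking which stress-ng threads land on the housekeeping CPU, here is a minimal Python sketch that filters ps rows by the PSR column. It assumes the seven-column layout of the ps listing in [4] (PSR TID PID COMMAND %CPU PRI NI) and command names without spaces. The names PsRow, parse_ps_rows, on_cpu and SAMPLE are hypothetical helpers, not part of any existing tool.

```python
from typing import List, NamedTuple


class PsRow(NamedTuple):
    psr: int        # CPU the thread last ran on
    tid: int
    pid: int
    command: str
    cpu_pct: float
    pri: int
    ni: int


def parse_ps_rows(text: str) -> List[PsRow]:
    """Parse rows laid out as: PSR TID PID COMMAND %CPU PRI NI."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, "..." separators, and anything that is not 7 columns.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        psr, tid, pid, command, cpu_pct, pri, ni = fields
        rows.append(PsRow(int(psr), int(tid), int(pid), command,
                          float(cpu_pct), int(pri), int(ni)))
    return rows


def on_cpu(rows: List[PsRow], cpu: int, command_prefix: str) -> List[PsRow]:
    """Return the rows for matching commands that ran on the given CPU."""
    return [r for r in rows
            if r.psr == cpu and r.command.startswith(command_prefix)]


# Sample rows in the same layout as the ps listing in this comment.
SAMPLE = """\
PSR TID PID COMMAND %CPU PRI NI
0 1517 1517 jbd2/sda2-8 5.0 20 0
0 125875 125875 stress-ng 15.5 30 10
0 125882 125882 stress-ng 4.4 30 10
0 125925 125925 stress-ng 4.4 30 10
"""

if __name__ == "__main__":
    for row in on_cpu(parse_ps_rows(SAMPLE), cpu=0, command_prefix="stress-ng"):
        print(row.tid, row.command, row.pri, row.ni)
```

Any stress-ng row reported for cpu 0 means that thread ran on the housekeeping core despite the intended pinning.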
To confirm this idea, I started test-storage / stress-ng so they had a lower priority than jbd2. I used the following:
sudo nice -10 test-storage
This causes jbd2 to continue to run with a priority of 20, but all the stress-ng threads are run with a priority of 30:
PSR    TID    PID COMMAND      %CPU PRI  NI
  0   1517   1517 jbd2/sda2-8   5.0  20   0
  0 125875 125875 stress-ng    15.5  30  10
  0 125882 125882 stress-ng     4.4  30  10
  0 125925 125925 stress-ng     4.4  30  10
...
With 'nice -10' (which raises the niceness to 10, as the NI column above shows), the test completes without hanging. It appears the system was hanging while waiting for I/O to complete, which would never happen because the jbd2 threads cannot preempt stress-ng, resulting in a deadlock.
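The relation between the NI and PRI columns in the listings here can be written down as a tiny Python sketch. For normal (non-real-time) tasks, the displayed priority is 20 plus the nice value, which matches every row quoted in this comment. The function name displayed_priority is hypothetical, and the sketch deliberately does not cover real-time scheduling classes.

```python
def displayed_priority(nice: int) -> int:
    """Priority shown for a normal (non-real-time) task with this nice value.

    Assumes the 20 + NI convention seen in the listings in this comment;
    real-time tasks are displayed differently and are not handled here.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice values range from -20 to 19")
    return 20 + nice


if __name__ == "__main__":
    # jbd2 (NI 0), systemd-journal (NI -1), stress-ng under 'nice -10' (NI 10)
    for name, ni in [("jbd2", 0), ("systemd-journal", -1), ("stress-ng", 10)]:
        print(name, ni, displayed_priority(ni))
```

A larger displayed value means a lower scheduling priority, so stress-ng at 30 yields to jbd2 at 20.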
Michael, could you also try running with the following command to confirm the results:
sudo nice -10 test-storage
If this resolves the bug, there are several options:
1. Run the cert suite with a nice value for real-time tests.
2. Change the tests so they do not run as root.
3. Tune the real-time system so stress-ng threads are pinned to isolated cores and kernel threads are on a housekeeping-only core.
I'm going to investigate option 3. I am assigning cores 1-95 as the isolated cores, so stress-ng should not run on core 0, but it is. I'm going to figure out why this is happening.
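To double-check which cores the kernel was asked to isolate, the isolcpus= value from the kernel command line can be expanded into an explicit CPU set. The Python sketch below is a simplified parser: it handles single CPUs and N-M ranges only, not the stride syntax the kernel also accepts. The names parse_cpu_list and parse_isolcpus are hypothetical helpers.

```python
from typing import Set, Tuple


def parse_cpu_list(spec: str) -> Set[int]:
    """Expand a CPU list such as '1-95' or '0,2-3' into a set of CPU numbers."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_isolcpus(cmdline: str) -> Tuple[Set[str], Set[int]]:
    """Return (flags, isolated CPUs) from a kernel command line string."""
    prefix = "isolcpus="
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        flags: Set[str] = set()
        cpu_parts = []
        for part in token[len(prefix):].split(","):
            if part and part[0].isdigit():
                cpu_parts.append(part)   # e.g. '1-95'
            elif part:
                flags.add(part)          # e.g. 'managed_irq', 'domain'
        return flags, parse_cpu_list(",".join(cpu_parts))
    return set(), set()


if __name__ == "__main__":
    # On a live system the input would come from /proc/cmdline.
    example = "skew_tick=1 isolcpus=managed_irq,domain,1-95 nosoftlockup"
    flags, isolated = parse_isolcpus(example)
    print(sorted(flags), min(isolated), max(isolated), 0 in isolated)
```

With the setting used on this machine, CPU 0 is the only core outside the isolated set, so it is the housekeeping core.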
Output from cmdline:
"BOOT_IMAGE=/boot/vmlinuz-5.15.0-1033-realtime root=UUID=3583d8c4-d539-439f-9d50-4341675268cc ro console=tty0 console=ttyS0,115200 skew_tick=1 isolcpus=managed_irq,domain,1-95 intel_pstate=disable nosoftlockup tsc=nowatchdog crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M"
[0] https://launchpadlibrarian.net/653810449/locking_issue.txt
[1] https://launchpadlibrarian.net/653810490/call_trace.txt
[2] https://launchpadlibrarian.net/655372944/sysrq-w.txt
[3] https://launchpadlibrarian.net/655374168/top-during-hang.txt
[4] https://launchpadlibrarian.net/655380123/ps-test-running.txt