Reviewed: https://review.opendev.org/c/starlingx/monitoring/+/819498
Committed: https://opendev.org/starlingx/monitoring/commit/34c2ef786555d56e26d3c09b63d9ed464cdcd0ea
Submitter: "Zuul (22348)"
Branch: master
commit 34c2ef786555d56e26d3c09b63d9ed464cdcd0ea
Author: Jim Gauld <email address hidden>
Date: Fri Nov 26 15:44:51 2021 -0500
Enhance schedtop with blocked_max, disk waiters, and watch commands
The 'schedtop' monitoring tool is used to do engineering
analysis of process scheduling, disk IO, and latency.
This enhances the schedtop monitoring tool with:
- additional fields "bmax" latency and "D" disk-sleep tasks
- command-line options to watch specific tasks and mechanism to
trigger sysrq
The following new fields are reported:
- "bmax" milliseconds, corresponds to linux scheduler stats
"blocked_max". This represents involuntary wait of scheduling
and IO wait.
- "D:<n>", the current number of disk-sleep "D" tasks.
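As an illustration of what the "D:<n>" field counts, here is a minimal Python sketch that tallies tasks in uninterruptible disk sleep from lines in the format of /proc/<pid>/task/<tid>/stat. The helper names and the sample lines are hypothetical and are not taken from schedtop itself.

```python
def task_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def count_disk_sleep(stat_lines) -> int:
    """Count tasks in uninterruptible disk sleep (state 'D')."""
    return sum(1 for line in stat_lines if task_state(line) == "D")


# Hypothetical, shortened stat lines for illustration only.
sample = [
    "812 (jbd2/sda1-8) D 2 0 0 0 -1 2129984 0 0 0 0",
    "1490 (etcd) S 1 1490 1490 0 -1 4194560 0 0 0 0",
    "2001 (my (odd) name) D 1 2001 2001 0 -1 4194304 0 0 0 0",
]
print(f"D:{count_disk_sleep(sample)}")
```

On a live system the same function could be fed the contents of each stat file under /proc; the sketch keeps the parsing separate so it can be checked without one.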
The following command line options are added to be able to watch
specific processes, and optionally trigger a sysrq (i.e., force
a crashdump) when trigger delay threshold milliseconds is reached.
[--watch-cmd=tid1,cmd1,cmd2,...] [--watch-only] [--watch-quiet]
[--trig-delay=time]
The --watch-cmd option matches each pattern against the 'comm'
(process name) field.
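One way to picture the tid1,cmd1,cmd2 list is the Python sketch below. It rests on assumptions the commit does not state: numeric entries are treated as tids, other entries are treated as substrings of comm, and patterns are cut to 15 characters because the kernel stores at most 15 visible characters in comm. The function names are hypothetical and schedtop's real matching rule may differ.

```python
COMM_MAX = 15  # kernel task names hold at most 15 visible characters


def parse_watch_cmd(spec: str):
    """Split 'tid1,cmd1,cmd2' into a set of tids and a list of name patterns."""
    tids, names = set(), []
    for item in filter(None, (s.strip() for s in spec.split(","))):
        if item.isdigit():
            tids.add(int(item))  # assumption: all-digit entries are tids
        else:
            names.append(item[:COMM_MAX])  # assumption: truncate like comm
    return tids, names


def is_watched(tid: int, comm: str, tids, names) -> bool:
    """Assumed rule: tid listed explicitly, or a pattern is a substring of comm."""
    return tid in tids or any(name in comm for name in names)


tids, names = parse_watch_cmd("1234,jbd2,kube-apiserver,forward-journald")
# Hypothetical (tid, comm) pairs for illustration only.
tasks = [(1234, "bash"), (812, "jbd2/sda1-8"), (900, "forward-journal"), (77, "sshd")]
watched = [comm for tid, comm in tasks if is_watched(tid, comm, tids, names)]
print(watched)
```

The truncation matters for a name such as forward-journald, which is 16 characters long and so can only appear in comm as forward-journal.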
The --watch-only option watches and displays only the subset of
tasks discovered at tool startup. This dramatically reduces the
tool cpu overhead.
The --watch-quiet option displays no sample output after tool
startup; the only output occurs when the --trig-delay is exceeded.
The --trig-delay=time option will trigger a sysrq to force a crash
dump when any watched process "bmax" delay exceeds the trigger delay
time in milliseconds.
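The threshold check can be sketched in Python as follows. This is an illustration only: the function names and sample values are hypothetical, and the use of /proc/sysrq-trigger with the 'c' command is an assumption about how a crash might be requested, since the commit message does not say which mechanism schedtop uses. The sketch defaults to a dry run so that it never crashes the host.

```python
def exceeded(bmax_ms: dict, trig_delay_ms: float) -> list:
    """Return watched task names whose bmax is above the threshold, worst first."""
    over = [(ms, name) for name, ms in bmax_ms.items() if ms > trig_delay_ms]
    return [name for ms, name in sorted(over, reverse=True)]


def trigger_sysrq_crash(path: str = "/proc/sysrq-trigger", dry_run: bool = True) -> str:
    """Request a kernel crash via sysrq 'c'. Needs root; dry_run only reports."""
    if dry_run:
        return f"would write 'c' to {path}"
    with open(path, "w") as f:  # assumption: sysrq is enabled on the host
        f.write("c")
    return "triggered"


# Hypothetical per-task bmax values in milliseconds.
sample = {"jbd2/sda1-8": 10412.7, "etcd": 35.2, "containerd": 9800.0}
offenders = exceeded(sample, trig_delay_ms=10000)
if offenders:
    print(offenders, trigger_sysrq_crash(dry_run=True))
```

Keeping the comparison separate from the side effect makes the threshold logic checkable without root privileges.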
Example: collect 1 minute of data, monitor all tasks,
reset scheduler hiwatermark statistics
schedtop \
--period=60 --reset-hwm
Example: collect 1 minute of data, watch specific tasks
schedtop \
--period=60 --reset-hwm \
--watch-cmd=jbd2,kube-apiserver,etcd,forward-journald,containerd \
--watch-only
Example: watch specific tasks and trigger sysrq when any of the
watched commands exceed 10000ms delay (10 seconds)
schedtop \
--period=36000 --reset-hwm \
--watch-cmd=jbd2,kube-apiserver,etcd,forward-journald,containerd \
--watch-only \
--trig-delay=10000
Testcases:
PASS: Collect standard tool output, verified new bmax and D fields
PASS: Verify --watch-cmd will detect the specified commands or tids
PASS: Verify --watch-only will only display watched commands
PASS: Verify --trig-delay will generate a sysrq
PASS: Verify comm field is limited to 15 characters wide
Closes-Bug: 1927772
Signed-off-by: Jim Gauld <email address hidden>
Change-Id: I5368aac66b24608f5eab366cd929be4c0d4a1f76