Update kernel bootargs for isolcpus managed_irq to reduce jitter from NVMe
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Low | sachin |
Bug Description
Brief description:
Update kernel bootargs for isolcpus=<cpulist> to "isolcpus=
Severity:
Minor
Steps to Reproduce:
Lab with NVMe hardware.
To find such labs, look in the lab/yow directory: grep -r "nvme"
Check which CPUs the NVMe IRQs are landing on:
grep -i nvme /proc/interrupts
Notice IRQs incremented on all cpus including cores configured at isolcpus (application-
Inspect process affinity of kernel nvme threads:
ps-sched.sh | grep nvme
These should all match the platform cores affinity mask.
The failing scenario is hard to reproduce: it needs an error that leads to new NVMe kernel threads being created. In that case kern.log would show a file-system related error (mount, unmount, remount, clean mount, volume not properly unmounted, etc.). There is no known recipe to cause this condition; TBD: experiment to find something that triggers it.
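A minimal sketch of the IRQ check above, assuming bash and awk are available and that the input follows the usual /proc/interrupts layout (a header row of CPUn labels, then one row per IRQ). The function names expand_cpulist and nvme_irqs_on_cpus are hypothetical helpers, not part of StarlingX.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not StarlingX tooling.

# Expand a kernel cpulist such as "2-4,7" into "2 3 4 7".
expand_cpulist() {
    local list=$1 part lo hi c out=()
    local IFS=','
    for part in $list; do
        lo=${part%-*}          # text before "-" (whole item if no "-")
        hi=${part#*-}          # text after "-"  (whole item if no "-")
        for ((c = lo; c <= hi; c++)); do out+=("$c"); done
    done
    IFS=' '
    echo "${out[*]}"
}

# Print "<irq> <cpu> <count>" for every NVMe IRQ that has a non-zero count
# on one of the CPUs listed in $1 (space-separated).
# $2 is a file in /proc/interrupts format (defaults to /proc/interrupts).
nvme_irqs_on_cpus() {
    local cpus=$1 file=${2:-/proc/interrupts}
    awk -v cpus="$cpus" '
        BEGIN {
            n = split(cpus, a, " ")
            for (i = 1; i <= n; i++) want[a[i]] = 1
        }
        NR == 1 {
            # Header row: map data column i+1 to the CPU number in label i.
            for (i = 1; i <= NF; i++) {
                id = $i
                sub(/^CPU/, "", id)
                col[i + 1] = id
            }
            ncol = NF + 1
            next
        }
        tolower($0) ~ /nvme/ {
            irq = $1
            sub(/:$/, "", irq)
            for (i = 2; i <= ncol && i <= NF; i++)
                if ((col[i] in want) && $i + 0 > 0)
                    print irq, col[i], $i
        }
    ' "$file"
}
```

On a live system the isolated set could be taken from /sys/devices/system/cpu/isolated, for example: nvme_irqs_on_cpus "$(expand_cpulist "$(cat /sys/devices/system/cpu/isolated)")". Any output line is an NVMe IRQ that has fired on an isolated core.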
Expected Behavior:
NVMe IRQ counts do not increment on isolcpus cores.
New NVMe threads that get created as a result of an error do not land on isolcpus cores.
Actual Behavior:
New NVMe tasks have a floating affinity mask across all CPUs.
Reproducibility:
Intermittent.
Seen a few times when specifically validating process affinity.
Hard to reproduce the error case that generates new NVMe tasks.
System Configuration:
Systems with NVMe devices.
Workaround:
No workaround that persists.
A running system may be corrected by re-affining the NVMe kernel threads with taskset.
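A rough sketch of that taskset workaround, assuming bash, procps ps and util-linux taskset. The helper names nvme_pids and reaffine_nvme, the DRY_RUN variable and the platform cpulist 0-1 are all hypothetical; substitute the platform cores of the actual host. Some kernel threads (per-CPU workers, for instance) refuse affinity changes, so taskset may fail for individual PIDs.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are illustrative assumptions.

# Read "PID COMM" lines (the format of `ps -eo pid=,comm=`) on stdin and
# print the PIDs whose command name contains "nvme".
nvme_pids() {
    awk 'tolower($2) ~ /nvme/ { print $1 }'
}

# Read PIDs on stdin and pin each one to the cpulist given in $1.
# With DRY_RUN=1 the taskset commands are printed instead of executed.
reaffine_nvme() {
    local cpulist=$1 pid
    while read -r pid; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "taskset -pc $cpulist $pid"
        else
            taskset -pc "$cpulist" "$pid"
        fi
    done
}
```

Possible use, as root: ps -eo pid=,comm= | nvme_pids | reaffine_nvme 0-1. Running it first with DRY_RUN=1 shows what would be changed. The result does not survive a reboot or the creation of further NVMe threads.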
Changed in starlingx: | |
status: | New → In Progress |
tags: | added: stx.distro.other stx.storage |
description: | updated |
Changed in starlingx: | |
assignee: | nobody → sachin (skrishn5) |
Changed in starlingx: | |
importance: | Undecided → Low |
tags: | added: stx.8.0 |
Reviewed: https://review.opendev.org/c/starlingx/config/+/869929
Committed: https://opendev.org/starlingx/config/commit/dd490273c8a3c30df4d44a23a22b5a5631c1aef0
Submitter: "Zuul (22348)"
Branch: master
commit dd490273c8a3c30df4d44a23a22b5a5631c1aef0
Author: Sachin Gopala Krishna <email address hidden>
Date: Thu Jan 12 05:37:09 2023 -0500
Update kernel isolcpus boot args to isolate managed IRQs
Update kernel bootargs from isolcpus=<cpulist> to
"isolcpus=nohz,domain,managed_irq,<cpulist>" to reduce jitter to
low-latency applications. This configures kernel managed IRQ kernel
threads (such as NVMe) so they do not land on isolcpus cores.
This setting is recommended by kernel SME.
Test Plan:
PASS: Verify isolcpus kernel boot args have new settings.
Closes-Bug: 2002638
Signed-off-by: Sachin Gopala Krishna <email address hidden>
Change-Id: I419e3aa2823de4615597c40660a5547530635535
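The test plan step in the commit above ("Verify isolcpus kernel boot args have new settings") could be checked along these lines. This is a sketch, assuming bash; isolcpus_has_flags is a hypothetical helper, not the test actually run for the change. It only checks that the isolcpus= argument carries the nohz, domain and managed_irq flags, not that the cpulist is correct.

```shell
#!/usr/bin/env bash
# Sketch only: isolcpus_has_flags is an illustrative helper name.

# Succeed if the kernel command line passed in $1 contains an isolcpus=
# argument whose comma-separated value includes nohz, domain and managed_irq.
isolcpus_has_flags() {
    local cmdline=$1 arg flag value=""
    local -a args
    # Split on whitespace without triggering pathname expansion.
    read -r -a args <<< "$cmdline"
    for arg in "${args[@]}"; do
        case $arg in
            isolcpus=*) value=${arg#isolcpus=} ;;
        esac
    done
    [ -n "$value" ] || return 1
    for flag in nohz domain managed_irq; do
        case ",$value," in
            *",$flag,"*) ;;        # flag present, keep checking
            *) return 1 ;;         # flag missing
        esac
    done
    return 0
}
```

On a booted host: isolcpus_has_flags "$(cat /proc/cmdline)" && echo PASS.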