AIO: Support running high priority RT cpu hog
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Won't Fix | Medium | Jim Gauld |
Bug Description
Brief Description
-----------------
When application pods run threads under the round-robin real-time policy (SCHED_RR) at high priority, critical Linux tasks can be starved of CPU time, leading to softdog timeouts and a host reboot.
The proposed update adds support for running application pods that contain high-priority RT CPU hogs at up to RR priority 50.
Severity
--------
Critical: low-latency systems will see unsatisfactory jitter and potential reboots.
Steps to Reproduce
------------------
Configure AIO with label kube-cpu-
system host-label-assign <hostname or id> kube-cpu-
Verify which CPUs are application cores.
system host-cpu-list controller-0
Run stress-ng in an application pod on the application cores with scheduler policy RR at priority 50, specifying a subset of the application CPUs.
kubectl run stressng --image=
--overrides=
Note that this is only an example; there are many variations of applications and stress test options.
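To make the RR 50 step concrete, the sketch below prints (without executing) one way a CPU hog could be pinned and given SCHED_RR priority inside a container, using `taskset` and `chrt` from util-linux. It is not the reporter's exact command, which is truncated above. The helper name `rr_hog_cmd`, the CPU list `2,3`, the worker count, and the 60s timeout are all hypothetical; the 1..50 check simply mirrors the priority ceiling stated in this report.

```shell
#!/bin/sh
# Sketch only: build a command line for an SCHED_RR CPU hog.
# Assumptions (not from the bug report): CPU list, worker count, timeout,
# and the helper name rr_hog_cmd are illustrative.

rr_hog_cmd() {
    cpus="$1"      # comma-separated application CPUs, e.g. "2,3"
    prio="$2"      # SCHED_RR priority to request
    workers="$3"   # number of stress-ng CPU workers

    # Reject anything that is not a plain non-negative integer.
    case "$prio" in
        ''|*[!0-9]*) echo "priority must be an integer" >&2; return 1 ;;
    esac

    # The report targets RR priorities up to 50, so refuse anything above.
    if [ "$prio" -lt 1 ] || [ "$prio" -gt 50 ]; then
        echo "priority ${prio} is outside 1..50" >&2
        return 1
    fi

    # taskset pins to the CPU list; chrt --rr sets the round-robin RT policy.
    echo "taskset -c ${cpus} chrt --rr ${prio} stress-ng --cpu ${workers} --timeout 60s"
}

# Print the command for two workers on (hypothetical) application CPUs 2 and 3.
rr_hog_cmd "2,3" 50 2
```

Actually running the printed command typically needs root or CAP_SYS_NICE in the container, since unprivileged processes cannot normally select a real-time policy.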
Expected Behavior
------------------
When running cyclictest in a pod, maximum jitter should stay below 20 usec.
The host should not lock up and reboot.
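The 20 usec expectation can be checked mechanically. The sketch below assumes cyclictest's usual per-thread summary lines, which end in a `Max:` field, and compares the largest value against a limit. The function names `max_jitter_usec` and `check_jitter` are hypothetical helpers, and the two sample lines are made-up illustrative input, not measurements from this bug.

```shell
#!/bin/sh
# Sketch only: extract the worst-case latency from cyclictest-style output.
# Assumption: each summary line contains "Max:" followed by a value in usec.

# Read cyclictest output on stdin, print the largest Max value seen.
max_jitter_usec() {
    awk '
        BEGIN { max = 0; seen = 0 }
        match($0, /Max:[ ]*[0-9]+/) {
            # Skip the 4 characters of "Max:" and convert the rest to a number.
            v = substr($0, RSTART + 4, RLENGTH - 4) + 0
            if (v > max) max = v
            seen = 1
        }
        END { if (seen) print max; else exit 1 }
    '
}

# Succeed only if the largest Max value is strictly below the limit (usec).
# Returns 2 if no Max field was found in the input.
check_jitter() {
    limit="$1"
    max="$(max_jitter_usec)" || return 2
    [ "$max" -lt "$limit" ]
}

# Illustrative input only; real values come from running cyclictest in a pod.
if check_jitter 20 <<'EOF'
T: 0 ( 1201) P:80 I:1000 C:  60000 Min:      2 Act:    4 Avg:    3 Max:      12
T: 1 ( 1202) P:80 I:1000 C:  60000 Min:      2 Act:    5 Avg:    3 Max:      17
EOF
then
    echo "jitter within limit"
else
    echo "jitter limit exceeded"
fi
```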
Actual Behavior
----------------
The softdog timeout is hit and the host reboots.
Reproducibility
---------------
100% reproducible with the settings above; in general, reproducibility depends on application settings.
System Configuration
-------
AIO low-latency.
Branch/Pull Time/Commit
-------
-
Last Pass
---------
-
Timestamp/Logs
--------------
-
Test Activity
-------------
Evaluation.
Workaround
----------
None.
Changed in starlingx: | |
assignee: | nobody → Jim Gauld (jgauld) |
Changed in starlingx: | |
importance: | Undecided → Medium |
tags: | added: stx.5.0 stx.config |
tags: | added: stx.metal |
Changed in starlingx: | |
status: | In Progress → New |
status: | New → In Progress |
Fix proposed to branch: master
Review: https://review.opendev.org/758689