AIO-DX task affining triggered by swact does not work
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Medium | Jim Gauld | | |
Bug Description
Brief Description
-----------------
On AIO-DX, an SM swact invokes a task affinity script that moves tasks to idle cores and then back to platform cores. This is intended to dramatically speed up the swact. In some cases the end-of-swact condition does not trigger (e.g., if a minor service fails), which leaves platform tasks running on both platform and application cores.
A swact issued before the openstack application is installed will also dramatically slow down that installation, since there is an interaction with the other affine-tasks.sh init script, and it will move tasks to platform cores prematurely.
Severity
--------
Major: System is usable but degraded.
* Latency impact to Applications, no longer isolated from Platform
* Openstack install can be slowed down
Steps to Reproduce
------------------
AIO-DX, swact back and forth.
Cause a service to fail or be disabled (eg, ceph-osd, nfv-vim).
Expected Behavior
------------------
The swact should complete, and platform processes should be re-affined back to the platform cores.
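One way to check the expected outcome is to compare each task's allowed CPU list against the platform cores. The following is a minimal sketch, not part of the StarlingX scripts: the function names are hypothetical, and the platform core list passed in (for example "0-1") is an assumption that must match the host's actual configuration. It reads the `Cpus_allowed_list` field from `/proc/<pid>/status` on Linux.

```shell
# Expand a kernel-style CPU list such as "0-1,4" into "0 1 4".
# Runs in a subshell so the IFS change does not leak to the caller.
expand_cpulist() (
    out=""
    IFS=','
    for part in $1; do
        lo=${part%-*}          # "0-1" -> 0, "4" -> 4
        hi=${part#*-}          # "0-1" -> 1, "4" -> 4
        cpu=$lo
        while [ "$cpu" -le "$hi" ]; do
            out="${out:+$out }$cpu"
            cpu=$((cpu + 1))
        done
    done
    echo "$out"
)

# Succeed if every CPU in list $1 is also in list $2.
cpulist_within() (
    allowed=" $(expand_cpulist "$2") "
    for cpu in $(expand_cpulist "$1"); do
        case "$allowed" in
            *" $cpu "*) ;;
            *) exit 1 ;;
        esac
    done
    exit 0
)

# Print "pid name cpus" for each task allowed to run outside list $1.
# $1 is the platform core list (assumed; supply the real one).
report_stray_tasks() {
    platform_cpus=$1
    for status in /proc/[0-9]*/status; do
        cpus=$(awk '/^Cpus_allowed_list:/ {print $2}' "$status" 2>/dev/null)
        [ -n "$cpus" ] || continue      # task exited while scanning
        if ! cpulist_within "$cpus" "$platform_cpus"; then
            name=$(awk '/^Name:/ {print $2}' "$status" 2>/dev/null)
            pid=${status#/proc/}
            pid=${pid%/status}
            echo "$pid $name $cpus"
        fi
    done
}
```

Running `report_stray_tasks "0-1"` after a swact lists every task that may still run on non-platform cores. Per-CPU kernel threads and application tasks will also appear in the output, so the list needs filtering before drawing conclusions.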
Actual Behavior
----------------
The swact does not actually complete because of a minor failed service, and tasks remain floating on application cores. Subsequent swacts and subsequent host reboots do not fix the affinity: a lingering flag file is created in a persistent disk location, so tasks are stuck forever.
There is also an interaction with the installation of openstack, which would be perceived as a slowdown. If there a openstack-
Reproducibility
---------------
Intermittent if swacting manually since we don't generally have failing services.
More frequent in specific Sanity TCs that swact as setup step.
System Configuration
-------
AIO-DX only.
Branch/Pull Time/Commit
-------
NA
Last Pass
---------
NA
Timestamp/Logs
--------------
See /var/log/sm.log for swact actions. If we see the following, we never get to the end of the swact:
2021-05-
2021-05-
On both controllers, see SM invoking sm_task_
controller-
grep -rs "affining to " .
./sm.log:
./sm.log:
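The grep above can be wrapped so that it reports how many affining messages a log contains and which one came last. This is only a sketch: the function name `affining_summary` is hypothetical, and the only string taken from this report is the "affining to " pattern. The rest of the SM message format is not assumed.

```shell
# Summarize "affining to " messages in one log file ($1).
# Prints the match count and the most recent matching line.
affining_summary() {
    log=$1
    # grep -c exits non-zero when nothing matches; tolerate that.
    count=$(grep -c "affining to " "$log" 2>/dev/null || true)
    last=$(grep "affining to " "$log" 2>/dev/null | tail -n 1)
    echo "count=${count:-0}"
    echo "last=$last"
}
```

For example, `affining_summary /var/log/sm.log` on each controller shows whether the most recent affining action is the one expected at the end of a swact.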
SM is essentially doing the following:
// start of swact
source /etc/init.
// end of swact
source /etc/init.
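The start/end pairing and the lingering flag file described under Actual Behavior can be illustrated with a small model. This is a sketch of the failure mode only, not the StarlingX implementation: `FLAG_DIR`, the flag file name, and all three function names are hypothetical.

```shell
# Hypothetical flag-file handshake. FLAG_DIR and the function names are
# illustrative and are not taken from the StarlingX scripts.
FLAG_DIR=${FLAG_DIR:-/tmp/affine-demo}

# Start of swact: record that tasks have been spread to idle cores.
swact_start() {
    mkdir -p "$FLAG_DIR"
    : > "$FLAG_DIR/swact-in-progress"
}

# End of swact: tasks are back on platform cores, so clear the flag.
swact_end() {
    rm -f "$FLAG_DIR/swact-in-progress"
}

# Succeed while a swact is still considered in progress.
swact_pending() {
    [ -e "$FLAG_DIR/swact-in-progress" ]
}
```

If `swact_end` is never called, `swact_pending` keeps succeeding. When `FLAG_DIR` sits on persistent storage, that state also survives a reboot, which matches the stuck affinity described in this report. A flag kept on a volatile filesystem would at least be cleared by a reboot.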
Test Activity
-------------
Sanity
Workaround
----------
None
Changed in starlingx: | |
assignee: | nobody → Jim Gauld (jgauld) |
status: | New → Confirmed |
status: | Confirmed → In Progress |
Changed in starlingx: | |
importance: | Undecided → Medium |
tags: | added: stx.6.0 stx.ha |
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/utilities/+/792028