Frequent mtce heartbeat misses in virtual environment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Triaged | Low | Eric MacDonald |
Bug Description
The hbsAgent log reports frequent heartbeat misses, at a rate that could (rarely) escalate to an alarm or a host degrade.
Heartbeating a cluster of hosts at a 100 ms cadence in a virtual environment is known to exhibit this behavior. To account for this, the hbsAgent supports a -V (virtual) startup option that commands it to run in 'virtual' mode. In virtual mode the hbsAgent overrides the configured heartbeat cadence with a static 500 ms cadence. Heartbeating at a 500 ms cadence in a virtual environment is fine.
The hbsAgent startup script calls 'virt-what' to detect whether the active controller is running in a virtual environment and, if so, enables virtual mode. However, the output of 'virt-what' on a system installed with the new virtual installer tool 'vdm' differs from the output the script was tested against in the past. The script's parsing of that output no longer detects the virtual environment, so the hbsAgent heartbeats at the configured value, which causes this issue.
The fix is to enhance the hbsAgent startup script to parse the output of 'virt-what' more robustly, so that it continues to handle the old output and also accommodates the way the output is presented on vdm installs.
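As an illustration only, the kind of tolerant parsing described above might look like the sketch below. It is not the actual hbsAgent startup script or the committed fix. The function names (is_virtual_output, hbs_virtual_opt) are hypothetical, and the sketch assumes that any non-blank 'virt-what' output means the host is virtual, regardless of which fact names are printed or how many lines there are. The exact output seen on vdm installs is not given in this report, so the check deliberately avoids matching a specific string.

```shell
#!/bin/sh
# Hypothetical sketch, not the real hbsAgent init script.

# Return success (0) if the captured virt-what output contains at least
# one non-whitespace character. virt-what prints one fact per line for
# virtual guests and prints nothing on bare metal, so this avoids
# depending on any particular fact name or line count.
is_virtual_output() {
    facts=$(printf '%s' "$1" | tr -d '[:space:]')
    [ -n "$facts" ]
}

# Print the hbsAgent startup option for the given virt-what output:
# "-V" when a virtual environment is detected, otherwise nothing.
hbs_virtual_opt() {
    if is_virtual_output "$1"; then
        echo "-V"
    else
        echo ""
    fi
}

# Intended use in a startup script (assumption, left commented out
# because virt-what needs root and may not be installed):
#   opts=$(hbs_virtual_opt "$(virt-what 2>/dev/null)")
```

Taking the output as an argument rather than calling 'virt-what' inside the function keeps the parsing testable against both the old output and whatever is seen on vdm installs.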
Severity
--------
Minor: Affects systems running in a virtual environment in a minor way.
Steps to Reproduce
------------------
Install a system with the 'vdm' tool
Expected Behavior
------------------
Heartbeat at 500 msec cadence
Actual Behavior
----------------
Heartbeat at 100 msec cadence
Reproducibility
---------------
100% reproducible
System Configuration
-------
Any duplex system
Branch/Pull Time/Commit
-------
2020-06-26_04-10-00
Last Pass
---------
N/A
Timestamp/Logs
--------------
2020-06-
2020-06-
2020-06-
2020-06-
2020-06-
2020-06-
No collect required. Issue and fix are understood.
Test Activity
-------------
Feature Testing
Workaround
----------
system service-
system service-
Changed in starlingx: | |
assignee: | nobody → Eric MacDonald (rocksolidmtce) |
low priority - minor issue on virtual env