hbsAgent restart does not clear its alarm in AIO SX
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Low | Eric MacDonald | | |
Bug Description
The hbsAgent does not clear its alarms on process startup on AIO SX systems, as it does on all other system types.
Severity
--------
Minor: double-fault case; a pmond failure that recovers across an hbsAgent process failure.
Results in a stuck 'pmond' process alarm. This is the only alarm that the hbsAgent can assert in AIO SX.
The alarm becomes stuck if the alarm condition clears during the hbsAgent restart.
It then remains stuck until the alarm condition recurs and clears again without an hbsAgent restart.
Steps to Reproduce
------------------
Step 1. Kill pmond repeatedly until the alarm is raised.
Step 2. Allow pmond to recover and restart hbsAgent at the same time.
Expected Behavior
------------------
The pmond alarm is cleared on hbsAgent startup and is re-raised only if the pmond process is still failed.
Actual Behavior
----------------
No attempt is made to clear the pmond alarm.
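
The difference between the expected and actual behavior can be illustrated with a small model. This is only a sketch: AlarmStore, agent_startup, simulate and the clear_on_start flag are hypothetical names invented for illustration, not part of the hbsAgent or fault manager code. It assumes alarms persist outside the agent process, so an alarm raised before a restart stays asserted unless something explicitly clears it.

```python
class AlarmStore:
    """Hypothetical stand-in for an alarm table that outlives the agent process."""

    def __init__(self):
        self.active = set()

    def raise_alarm(self, name):
        self.active.add(name)

    def clear_alarm(self, name):
        # Clearing an alarm that is not asserted is a harmless no-op.
        self.active.discard(name)


def agent_startup(store, pmond_failed, clear_on_start):
    """Model one agent start: optionally clear stale alarms, then re-evaluate."""
    if clear_on_start:
        store.clear_alarm("pmond")
    if pmond_failed:
        # The condition still holds, so the alarm is (re-)raised.
        store.raise_alarm("pmond")


def simulate(clear_on_start, pmond_failed_after_restart=False):
    """Return True if the pmond alarm is still asserted after the agent restart."""
    store = AlarmStore()
    store.raise_alarm("pmond")  # Step 1: pmond killed until the alarm is raised
    # Step 2: pmond recovers (or not) while the agent restarts
    agent_startup(store, pmond_failed_after_restart, clear_on_start)
    return "pmond" in store.active


if __name__ == "__main__":
    print("no clear on start, pmond recovered:", simulate(False))
    print("clear on start, pmond recovered:   ", simulate(True))
    print("clear on start, pmond still failed:", simulate(True, True))
```

In this model, skipping the clear on startup leaves the alarm asserted even though pmond has recovered, which corresponds to the stuck alarm described in this report. Clearing first and then re-evaluating gives the expected result in both the recovered and the still-failed case.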
Reproducibility
---------------
100% with the aforementioned steps
System Configuration
-------
AIO SX
Branch/Pull Time/Commit
-------
Current starlingx/master
Last Pass
---------
Test escape.
Timestamp/Logs
--------------
Issue does not produce any error logs.
Issue is understood by maintenance core prime/developer.
Test Activity
-------------
Stress testing
Workaround
----------
Manually clear the alarm using the fm CLI.
tags: | added: stx.metal |
tags: | added: stx.5.0 |
Changed in starlingx: | |
status: | Triaged → Fix Released |
hostname_inventory.size() == 0