Heartbeat always fails on nodes that reboot with reconfigured heartbeat action handling
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | Medium | Eric MacDonald | |
Bug Description
Brief Description
-----------------
If the system administrator reconfigures the maintenance heartbeat fault handling action from the default 'fail' to any other setting [degrade,
complete ; which never happens.
The /var/run/
Severity
--------
Major for customers that reconfigure maintenance heartbeat fault action handling.
Steps to Reproduce
------------------
system service-
system service-
Log into the standby controller and reboot it
Expected Behavior
------------------
Node recovers in-service with heartbeat working
Actual Behavior
----------------
Node recovers but heartbeat is not working
Reproducibility
---------------
100% reproducible with heartbeat reconfigured to alarm, degrade or none
System Configuration
-------
AIO
Branch/Pull Time/Commit
-------
All loads built prior to this issue being fixed
Loads prior to June 3, 2024
Last Pass
---------
Test escape
Timestamp/Logs
--------------
for the hbsClient
2024-05-
or for the hbsAgent
2024-06-
Test Activity
-------------
Normal use in lossy networking environment
Workaround
----------
Lock and unlock affected nodes
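
The workaround above can be sketched as a small helper that prints the lock and unlock commands for a given node. This is only an illustrative sketch: the function name `workaround_commands` and the host name `controller-1` are assumptions, not taken from this report. It assumes the standard `system host-lock` and `system host-unlock` commands are run from the active controller with platform credentials already loaded, and that the operator waits for the node to report as locked before unlocking it.

```shell
#!/bin/sh
# Hypothetical helper: print the lock/unlock sequence for one node.
# It only prints the commands; it does not run them, because the
# unlock should be issued only after the node reports as locked.
workaround_commands() {
    host="$1"
    if [ -z "$host" ]; then
        echo "usage: workaround_commands <hostname>" >&2
        return 1
    fi
    echo "system host-lock $host"
    echo "system host-unlock $host"
}
```

For example, `workaround_commands controller-1` prints the two commands to run, in order, for a node named `controller-1`.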
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.10.0 stx.metal
Changed in starlingx:
assignee: nobody → Eric MacDonald (rocksolidmtce)
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/metal/+/921332