Standby controller reboots if active controller gracefully reboots
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Medium | Eric MacDonald | | |
Bug Description
SM fails the standby controller while the active controller is on its way down from a spontaneous graceful reboot.
Although gracefully rebooting the active controller is not something that is supported, the fact that the standby controller is also taken down by that event is very undesirable.
Issue does not happen on a forced reboot (with --force option) of the active controller.
This is because the timing of the graceful process shutdown causes SM to experience a heartbeat failure with its peer without the maintenance heartbeat cluster information providing the data SM needs to know that it must be the survivor in this case.
Suggest implementing a change in maintenance to make its heartbeat cluster state change notifications more timely.
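The timing problem described above can be illustrated with a small sketch. This is only a simplified model, not SM or maintenance code: the `ClusterReport` type, the `latest_report` and `should_survive` helpers, the survivor rule, and all timestamps are hypothetical and chosen purely to show why a late cluster state notification leads the standby to fail itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterReport:
    """Hypothetical maintenance heartbeat cluster report (not the real mtce/SM structure)."""
    peer_responding: bool  # did maintenance still see heartbeats from the peer?
    arrived_at: float      # time the report reached SM, in seconds (illustrative)


def latest_report(reports: List[ClusterReport], now: float) -> Optional[ClusterReport]:
    """Return the most recent report that had arrived by time 'now', or None."""
    arrived = [r for r in reports if r.arrived_at <= now]
    return max(arrived, key=lambda r: r.arrived_at) if arrived else None


def should_survive(report: Optional[ClusterReport]) -> bool:
    """Assumed rule: survive only if maintenance has confirmed the peer is gone."""
    return report is not None and not report.peer_responding


# SM detects the loss of its own heartbeat with the peer at t=10.0 (illustrative value).
sm_heartbeat_loss = 10.0

# Late notification: the "peer gone" report arrives after SM has to decide.
late = [ClusterReport(True, 9.0), ClusterReport(False, 10.5)]
# Timely notification: the "peer gone" report arrives before SM has to decide.
timely = [ClusterReport(True, 9.0), ClusterReport(False, 9.8)]

print(should_survive(latest_report(late, sm_heartbeat_loss)))    # False: standby fails itself
print(should_survive(latest_report(timely, sm_heartbeat_loss)))  # True: standby takes over
```

In this model the only difference between the two cases is when the state change notification arrives, which is why more timely notifications from maintenance are suggested rather than a change to the survivor decision itself.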
Severity
--------
Minor: System recovers after unsupported spontaneous graceful reboot of the active controller.
Steps to Reproduce
------------------
On a duplex system, run 'sudo reboot' on the active controller
Expected Behavior
------------------
SM on the standby controller takes over activity
Actual Behavior
----------------
SM on the standby controller fails itself and gets rebooted by maintenance
Reproducibility
---------------
Highly reproducible
System Configuration
-------
Duplex system
Branch/Pull Time/Commit
-------
starlingx/master at time this issue was created.
This is actually long-standing behavior.
Last Pass
---------
Unknown
Timestamp/Logs
--------------
from /var/log/
2020-08-
from /var/log/sm.log
2020-08-
2020-08-
Test Activity
-------------
[Feature Testing, Regression Testing]
Workaround
----------
Don't gracefully reboot the active controller
Changed in starlingx: | |
assignee: | nobody → Eric MacDonald (rocksolidmtce) |
tags: | added: stx.metal |
Changed in starlingx: | |
importance: | Undecided → Critical |
importance: | Critical → Low |
status: | New → Triaged |
tags: | added: stx.5.0 |
Changed in starlingx: | |
importance: | Low → Medium |
summary: |
- Standby controller reboots if active controller spontaneously gracefully reboots
+ Standby controller reboots if active controller gracefully reboots |
Fixed by:
review: https://review.opendev.org/c/starlingx/metal/+/769936
commit: https://opendev.org/starlingx/metal/commit/7a3adb2cdce217e1eaaf5e0d9669dc1190f62763