Periodic short-term subcloud out-of-sync alarm (platform shared resources)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Low | Gustavo Herzmann | | |
Bug Description
Brief Description
-----------------
The Distributed Cloud (DC) deployment reports periodic subcloud out-of-sync alarms that recur almost exactly every 7 days, at the same time of day (within a few seconds), and involve all subclouds.
The out-of-sync status clears in a few seconds.
All subclouds are affected.
It appears this is due to the fernet key rotation sync requests.
Severity
--------
Minor: System/Feature is usable with minor issue
Steps to Reproduce
------------------
No specific steps are needed: let the system run for more than 7 days, then check fm-event.log for 280.002 alarms.
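The check described above can be sketched in Python. This is only a rough illustration: it assumes that each fm-event.log line mentioning the alarm carries a timestamp of the form YYYY-MM-DD followed by "T" or a space and HH:MM:SS, which may not match the real log layout. The function names and the burst/tolerance values are hypothetical.

```python
import re
from datetime import datetime, timedelta

ALARM_ID = "280.002"
# Assumed timestamp layout; adjust to the actual fm-event.log format.
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})")


def alarm_times(lines, alarm_id=ALARM_ID):
    """Return sorted timestamps of the lines that mention alarm_id."""
    times = []
    for line in lines:
        if alarm_id not in line:
            continue
        match = TS_RE.search(line)
        if match:
            stamp = f"{match.group(1)} {match.group(2)}"
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    return sorted(times)


def burst_starts(times, burst=timedelta(minutes=5)):
    """Collapse events that are close together (set/clear, many subclouds)
    into one burst and return the first timestamp of each burst."""
    starts = []
    previous = None
    for t in times:
        if previous is None or t - previous > burst:
            starts.append(t)
        previous = t
    return starts


def is_weekly(starts, tolerance=timedelta(minutes=1)):
    """True if there are at least two bursts and every gap is about 7 days."""
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return bool(gaps) and all(
        abs(gap - timedelta(days=7)) <= tolerance for gap in gaps
    )
```

Feeding the lines of fm-event.log to alarm_times, then passing the result through burst_starts and is_weekly, would indicate whether the 280.002 events follow the 7-day pattern reported here.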
Expected Behavior
------------------
If the fernet key rotation is expected, a major alarm should not be generated in this case since the condition is expected and no corrective action is required.
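One way to picture this expected behavior is a grace period: an out-of-sync condition only raises the alarm if it persists longer than a threshold, so a transient that clears within seconds never surfaces. The sketch below is purely illustrative and is not taken from the proposed fix or from the distcloud code; the class name, method name, and 60-second default are all hypothetical.

```python
class OutOfSyncDebouncer:
    """Hypothetical helper: suppress out-of-sync alarms that clear quickly."""

    def __init__(self, grace_seconds=60.0):
        # Assumed threshold; a real value would need to exceed the time
        # a routine key-rotation sync takes.
        self.grace_seconds = grace_seconds
        self._since = {}  # subcloud name -> time it first went out of sync

    def update(self, subcloud, in_sync, now):
        """Record the latest sync state and return True if the alarm
        should be active for this subcloud at time `now` (in seconds)."""
        if in_sync:
            # Back in sync: forget the pending out-of-sync start time.
            self._since.pop(subcloud, None)
            return False
        started = self._since.setdefault(subcloud, now)
        return now - started >= self.grace_seconds
```

With this shape, a subcloud that goes out of sync and recovers a few seconds later never reaches the threshold, while one that stays out of sync past the grace period still raises the alarm.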
Actual Behavior
----------------
Each time the fernet keys rotate, a major 280.002 alarm is raised.
Reproducibility
---------------
100% reproducible
System Configuration
-------
Distributed Cloud
Branch/Pull Time/Commit
-------
Happened with CentOS build 2021-05; later confirmed to also happen with the latest Debian build (2023-01-04 master).
Last Pass
---------
NA.
Timestamp/Logs
--------------
Example message from fm-event.log:
2023-01-
Test Activity
-------------
Normal use
Workaround
----------
NA.
Changed in starlingx:
status: New → In Progress
assignee: nobody → Gustavo Herzmann (gherzman)
tags: added: stx.distcloud stx.fault
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.8.0
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/869501