After DOR, compute nodes came back in a degraded state
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX |
Fix Released
|
Medium
|
Andre Kantek |
Bug Description
Brief Description
-----------------
After the DOR test all the computes have Alarm “compute is degraded due to the failure of its 'hbsClient' process.
Severity
--------
<Major: System/Feature is usable but degraded>
Steps to Reproduce
------------------
1) Power off all the nodes
2) Wait for 60 seconds
3) Power on
Expected Behavior
------------------
After DOR test all the hosts were able to recover by itself
Actual Behavior
----------------
Computes are not recovered hbsClient process failure
Reproducibility
---------------
Intermittent
State if the issue is 100% reproducible, intermittent or seen once. If it is intermittent, state the frequency of occurrence
System Configuration
-------
Multi-node system (AIO-DX+worker)
Branch/Pull Time/Commit
-------
"2021-12-
Last Pass
---------
"2021-12-
Test Activity
-------------
Regression Testing
Workaround
----------
Reboot worker node
screening: stx.7.0 / medium - specific issue after power reset; workaround exists. Sufficient to fix in the active branch.