system health-query no response on Ceph query
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | John Kung |
Bug Description
Brief Description
-----------------
When the Ceph API is unresponsive, system health-query fails with "Unable to perform health query." instead of reporting the failed Ceph condition alongside the other health checks.
Severity
--------
Major: System/Feature is usable but degraded. Unable to see details of health-query.
Steps to Reproduce
------------------
With a failed Ceph cluster, run system health-query. The command does not
provide details of the failed Ceph condition.
Expected Behavior
------------------
# system health-query
System Health:
All hosts are provisioned: [OK]
All hosts are unlocked/enabled: [OK]
All hosts have current configurations: [OK]
All hosts are patch current: [OK]
Ceph Storage Healthy: [Fail]
No alarms: [Fail]
[5] alarms found, [3] of which are management affecting
All kubernetes nodes are ready: [OK]
All kubernetes control plane pods are ready: [OK]
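
The expected output above implies that a failure in one check should be reported as [Fail] for that check rather than aborting the whole query. A minimal sketch of that idea follows. It is not the sysinv implementation: `run_health_checks` and `ceph_unresponsive` are hypothetical names used only for illustration, and the check labels are copied from the expected output.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a label plus a callable that returns True when healthy.
Check = Tuple[str, Callable[[], bool]]


def run_health_checks(checks: List[Check]) -> Tuple[bool, str]:
    """Run every check; a check that raises is reported as [Fail]."""
    lines = ["System Health:"]
    all_ok = True
    for label, check in checks:
        try:
            ok = bool(check())
        except Exception:
            # An unresponsive backend fails only its own check,
            # so the remaining checks are still reported.
            ok = False
        all_ok = all_ok and ok
        lines.append(f"{label}: [{'OK' if ok else 'Fail'}]")
    return all_ok, "\n".join(lines)


def ceph_unresponsive() -> bool:
    """Stand-in for a Ceph query that never gets an answer."""
    raise TimeoutError("Ceph API did not respond")


overall_ok, report = run_health_checks([
    ("All hosts are provisioned", lambda: True),
    ("Ceph Storage Healthy", ceph_unresponsive),
    ("All kubernetes nodes are ready", lambda: True),
])
print(report)
```

Catching the exception per check keeps the report complete, at the cost of hiding the underlying error unless it is also logged.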
Actual Behavior
----------------
[root@controller-0 common(
Unable to perform health query.
Reproducibility
---------------
<Intermittent>
Occurs when Ceph-api is unresponsive.
System Configuration
-------
Ceph configured
Branch/Pull Time/Commit
-------
Branch and the time when code was pulled or git commit or cengn load info
Last Pass
---------
N/A
Timestamp/Logs
--------------
[root@controller-0 common(
Tue Jun 14 13:49:05 UTC 2022
Unable to perform health query.
Tue Jun 14 13:50:06 UTC 2022
sysinv 2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
2022-06-14 13:50:06.589 106220 ERROR sysinv.
sysinv 2022-06-14 13:50:06.590 106220 WARNING wsme.api [-] Client-side error: Unable to perform health query.: ClientSideError: Unable to perform health query.
Test Activity
-------------
Integration Testing: orchestrated subcloud upgrades
Workaround
----------
The system health-query command depends on Ceph being in a good state in order to return a response.
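
Because the query depends on Ceph responding, it can help to probe Ceph with a bounded wait before running it. The sketch below is an assumption-laden illustration, not part of StarlingX: `responds_within` is a hypothetical helper, and using the `ceph health` CLI command as the probe is an assumption that only roughly approximates the Ceph API mentioned in this report.

```python
import subprocess
from typing import Sequence


def responds_within(cmd: Sequence[str], timeout_s: float) -> bool:
    """Return True if cmd exits with status 0 within timeout_s seconds."""
    try:
        subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
            check=True,         # a non-zero exit raises CalledProcessError
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # Timed out, exited with an error, or the command was not found.
        return False
    return True


# Assumed usage on a controller (the 10 second limit is arbitrary):
#   if responds_within(["ceph", "health"], 10):
#       ... run system health-query ...
```

This only tells whether the probe command answered in time; it says nothing about why Ceph is unhealthy.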
Changed in starlingx:
assignee: nobody → John Kung (john-kung)
importance: Undecided → Medium
tags: added: stx.7.0 stx.config
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/845829