Ceph storage condition alarm was not cleared after controller lock/unlock and swact
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Won't Fix | Medium | Bob Church | |
Bug Description
Brief Description
-------
During an automation run that configures PTP over a dedicated interface, the alarms below are displayed. They are raised while controller-0 is locked and unlocked and then swacted. Afterwards, the Ceph alarm “Storage Alarm Condition: HEALTH_WARN” (800.001) was not cleared.
fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:27:18,788] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 700.016 | Multi-Node Recovery Mode | subsystem=vim | major | 2020-08-
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=
| 750.006 | A configuration change requires a reapply of the cert-manager application. | k8s_application
+------
[2020-08-24 05:35:56,206] 314 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:35:58,214] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=
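The two alarm listings above can be compared mechanically to find alarms that survive recovery. The following is a minimal sketch, assuming the table text has already been captured from the session log; the helper name `parse_alarm_ids` is hypothetical and not part of any StarlingX tooling. It only reads the first column, so it tolerates the truncated rows shown in this report.

```python
def parse_alarm_ids(table_text):
    """Return the alarm IDs from the first column of a pipe-delimited alarm table."""
    ids = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders and log prefix lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells and cells[0] and cells[0] != "Alarm ID":
            ids.append(cells[0])
    return ids


def stale_alarms(before_text, after_text):
    """Return alarm IDs present in both listings, in the order of the later one."""
    before = set(parse_alarm_ids(before_text))
    return [alarm_id for alarm_id in parse_alarm_ids(after_text) if alarm_id in before]
```

Applied to the 05:27:18 and 05:35:58 listings in this report, `stale_alarms` would return only `800.001`, which matches the alarm reported as not cleared.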
Steps to Reproduce
------------------
1. Initial health condition: no alarms.
2. Configure PTP on system
3. Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:08:47,810] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Property | Value |
+------
| uuid | 42dd70b4-
| mode | hardware |
| transport | l2 |
| mechanism | p2p |
| isystem_uuid | c88683bf-
| created_at | 2020-08-
| updated_at | 2020-08-
+------
'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:09:04,802] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Property | Value |
+------
| uuid | 25696c3b-
| service | ptp |
| section | global |
| name | delay_mechanism |
| value | p2p |
| personality | None |
| resource | None |
+------
system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:09:12,100] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Property | Value |
+------
| uuid | bc44edaf-
| service | ptp |
| section | global |
| name | domainNumber |
| value | 24 |
| personality | None |
| resource | None |
+------
4. Lock and unlock controller-0
'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:10:49,749] 436 DEBUG MainThread ssh.expect :: Output:
+------
| uuid | name | backend | state | task | services | capabilities |
+------
| f30c1cc8-
| | | | | | | |
+------
5. system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
6. 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
7. Lock and unlock compute-1
'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
8. system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:25:59,871] 436 DEBUG MainThread ssh.expect :: Output:
System Configuration
-------
regular system WCP7-10
Expected Behavior
------------------
The alarm should be cleared after the lock and unlock.
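In an automated regression, this expectation could be expressed as a bounded wait rather than a single check. The sketch below is an assumption about how such a check might look, not the test framework's actual code: `fetch_alarm_ids` is a hypothetical callable that returns the currently active alarm IDs, and the timeout and polling interval are illustrative values only.

```python
import time


def wait_for_alarm_clear(fetch_alarm_ids, alarm_id, timeout=600.0, interval=10.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until alarm_id is no longer active; return False if the timeout expires."""
    deadline = clock() + timeout
    while True:
        if alarm_id not in fetch_alarm_ids():
            return True  # alarm cleared
        if clock() >= deadline:
            return False  # alarm still active after the allowed time
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the wait can be exercised in a unit test without real delays.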
Actual Behavior
----------------
As described above, the Ceph storage alarm (800.001) is not cleared.
Reproducibility
---------------
100% reproducible in WCP_7_10.
Load
----
2020-08-22_20-00-00
Last Pass
---------
2020-07-
Timestamp/Logs
--------------
fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-08-24 05:27:18,788] 436 DEBUG MainThread ssh.expect :: Output:
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 700.016 | Multi-Node Recovery Mode | subsystem=vim | major | 2020-08-
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=
| 750.006 | A configuration change requires a reapply of the cert-manager application. | k8s_application
+------
Test Activity
-------------
Automated regression
Changed in starlingx:
assignee: Elena Taivan (etaivan) → Bob Church (rchurch)
tags: removed: stx.retestneeded
stx.5.0 / medium priority - as per Yang, issue seems reproducible on a system configured w/ PTP