Ceph OSDs are down after swact operation, generating a coredump
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Won't Fix | Medium | Andrei Grosu |
Bug Description
Brief Description
-----------------
Ceph OSDs are down after a swact operation, raising the alarm "Loss of replication in replication group group-0: OSDs are down" and generating a coredump.
Severity
--------
Minor
Steps to Reproduce
------------------
This issue was noticed in cert-manager test automation after the swact operation.
1) After a swact operation, the following alarms are generated on the system:
+----------+---------------------------------------------------------------------------------+-----------+----------+------------+
| Alarm ID | Reason Text                                                                     | Entity ID | Severity | Time Stamp |
+----------+---------------------------------------------------------------------------------+-----------+----------+------------+
| 800.011  | Loss of replication in replication group group-0: OSDs are down                | cluster=
| 800.001  | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=
+----------+---------------------------------------------------------------------------------+-----------+----------+------------+
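The 800.001 alarm text points at 'ceph -s' for more detail. For reference, the cluster and OSD state can be inspected with the standard Ceph CLI (these commands are not taken from the original logs):

    ceph -s              # overall cluster status; expected to show HEALTH_WARN here
    ceph health detail   # expands HEALTH_WARN into its specific causes
    ceph osd tree        # shows which OSDs are down and on which host they live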
fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204:
+--------------+----------+-------------+-----------+----------+------------+
| UUID         | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------+----------+-------------+-----------+----------+------------+
| fb516ca6-
| 2759ec2b-
+--------------+----------+-------------+-----------+----------+------------+
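To drill into a single alarm, the StarlingX fm client can show one record by UUID; a minimal sketch, with <alarm-uuid> as an illustrative placeholder since the UUIDs above are truncated:

    fm alarm-list                 # list all active alarms
    fm alarm-show <alarm-uuid>    # full detail for one alarm, e.g. the 800.011 entry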
Expected Behavior
------------------
Swact should complete successfully without any errors.
Actual Behavior
----------------
Ceph OSDs are down after swact.
Reproducibility
---------------
50%
System Configuration
--------------------
WP_13_14 ipv6
Branch/Pull Time/Commit
-----------------------
2020-07-31
Last Pass
---------
2020-07-25_00-00-00
Timestamp/Logs
--------------
2020-08-
Test Activity
-------------
Automation
Workaround
----------
Haven't found any
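For first-pass triage in the absence of a workaround, one generic approach (standard Linux/Ceph tooling; the dump location is an assumption and may not match this lab's coredump configuration) is to locate the reported coredump and confirm the daemon state:

    coredumpctl list ceph-osd          # if systemd-coredump is configured on the node
    ls -l /var/lib/systemd/coredump/   # default systemd-coredump location (assumption)
    ps aux | grep [c]eph-osd           # check whether the OSD daemons came back up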
tags: added: stx.storage
description: updated
Changed in starlingx:
assignee: Ovidiu Poncea (ovidiu.poncea) → Andrei Grosu (agrosu1)
tags: removed: stx.retestneeded
stx.5.0 / medium priority - Ceph appears to be down after a swact. The current assumption is that this is fairly intermittent, since we don't see it reported in other sanity/regression runs.