B&R: Ceph HEALTH_ERR on controller-1

Bug #1910776 reported by Dan Voiculeasa
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Dan Voiculeasa
Milestone: (none)

Bug Description

Brief Description
-----------------

Severity
--------
Critical: System/Feature is not usable due to the defect

Steps to Reproduce
------------------
Install the stx-monitor application.
Perform a backup and restore (B&R).

Expected Behavior
------------------
During the restore procedure, after controller-1 is unlocked, Ceph health is HEALTH_OK and the pods start.

Actual Behavior
----------------
Ceph health becomes HEALTH_ERR a few minutes after controller-1 is unlocked.
Pods (for example, kibana) do not start.
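
The transition described above can be watched for with a small script. This is a minimal sketch, assuming the `ceph` CLI is available on the active controller and that `ceph health` prints a line starting with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR. The function names, the 10-minute window, and the 30-second interval are illustrative choices, not part of this report.

```python
import subprocess
import time


def read_ceph_health(run=subprocess.run):
    """Return the first token printed by `ceph health`, e.g. 'HEALTH_OK'."""
    result = run(["ceph", "health"], capture_output=True, text=True, check=True)
    tokens = result.stdout.split()
    return tokens[0] if tokens else "UNKNOWN"


def watch_ceph_health(duration_s=600, interval_s=30,
                      read=read_ceph_health, sleep=time.sleep):
    """Poll Ceph health for about duration_s seconds.

    Returns the list of statuses seen and stops early on HEALTH_ERR.
    `read` and `sleep` are injectable so the loop can be exercised
    without a live cluster.
    """
    seen = []
    elapsed = 0
    while elapsed <= duration_s:
        status = read()
        seen.append(status)
        if status == "HEALTH_ERR":
            break
        sleep(interval_s)
        elapsed += interval_s
    return seen
```

Calling `watch_ceph_health()` right after unlocking controller-1 would return a list ending in HEALTH_ERR if the problem reproduces within the window, and a list without it otherwise.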

Reproducibility
---------------
Intermittent.

System Configuration
--------------------
At least 2 nodes.
AIO-DX, STANDARD 2+2, IPv6

Branch/Pull Time/Commit
-----------------------
Take any December 2020 load.

Last Pass
---------
Unknown.

Timestamp/Logs
--------------
df -h -H -T --local -t ext2 -t ext3 -t ext4 -t xfs --total
/dev/sdb1 xfs 479G -14M 479G 0% /var/lib/ceph/osd/ceph-1

controller-1:/var/lib/ceph/osd$ ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE          VAR  PGS TYPE NAME
-1       0.87097        - 892 GiB 1.6 GiB 890 GiB 1926370688.00 1.00   - root storage-tier
-2       0.87097        - 892 GiB 1.6 GiB 890 GiB 1926370688.00 1.00   -     chassis group-0
-4       0.43549        - 446 GiB 1.6 GiB 444 GiB          0.36    0   -         host controller-0
 0   ssd 0.43549  1.00000 446 GiB 1.6 GiB 444 GiB          0.36    0  64             osd.0
-3       0.43549        - 446 GiB  16 EiB 446 GiB 3852741376.00 2.00   -         host controller-1
 1   ssd 0.43549  1.00000 446 GiB  16 EiB 446 GiB 3852741376.00 2.00  64             osd.1
                    TOTAL 892 GiB 1.6 GiB 890 GiB 1926370688.00
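
The `16 EiB` usage shown for osd.1 and the `-14M` used space shown by `df` are consistent with a small negative byte count being displayed as an unsigned 64-bit integer. This is only one plausible reading of the output and is not a root cause confirmed in this report. The sketch below just works through the arithmetic: the -14 MB figure is taken from the `df` line (approximate, since `df -H` rounds), the 446 GiB size from the `ceph osd df tree` output, and the helper name is made up for illustration.

```python
def as_unsigned_64(value):
    """Reinterpret a signed integer as an unsigned 64-bit value."""
    return value & (2**64 - 1)


EIB = 2**60
GIB = 2**30

# df -H reports in powers of 1000, so -14M is roughly -14,000,000 bytes.
used_bytes = -14 * 10**6
wrapped = as_unsigned_64(used_bytes)

# The wrapped value is just under 2**64 bytes, which rounds to 16 EiB.
wrapped_eib = wrapped / EIB

# Percentage used relative to the (rounded) 446 GiB OSD size.
pct_used = wrapped / (446 * GIB) * 100

print(f"{wrapped_eib:.6f} EiB, %USE ~ {pct_used:.3e}")
```

The computed percentage comes out on the order of 3.85e9, the same magnitude as the `%USE` value reported for osd.1. The small difference is expected because the 446 GiB size in the output is rounded.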

Test Activity
-------------
B&R Testing.

Changed in starlingx:
assignee: nobody → Dan Voiculeasa (dvoicule)
Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium - specific issue related to backup & restore procedure

tags: added: stx.5.0 stx.update
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
Ghada Khalil (gkhalil)
tags: added: stx.storage
Dan Voiculeasa (dvoicule) wrote :
Changed in starlingx:
status: Triaged → Fix Released