Ceph health warn after backup and restore with replication=3
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Invalid | Medium | Daniel Badea | | |
Bug Description
Brief Description
-----------------
The following is taken from Maria Yousaf's testing:
During backup and restore (B&R), I noticed that Ceph was in the HEALTH_WARN state shown below and appeared to be stuck there:
[wrsroot@
cluster 2d62cbb0-
health HEALTH_WARN
555 pgs degraded
555 pgs stuck degraded
1536 pgs stuck unclean
555 pgs stuck undersized
555 pgs undersized
monmap e1: 3 mons at {controller-
osdmap e82: 12 osds: 12 up, 12 in; 981 remapped pgs
flags sortbitwise,
pgmap v449: 1920 pgs, 10 pools, 1588 bytes data, 1116 objects
460 MB used, 11383 GB / 11384 GB avail
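For triage, the PG counts in a status dump like the one above can be pulled out mechanically. The following is a minimal sketch, assuming the plain-text layout shown in this report; `SAMPLE_STATUS` and `parse_pg_summary` are hypothetical names introduced for illustration, and other Ceph releases may format this output differently.

```python
import re

# Health summary lines copied from the status output in this report.
SAMPLE_STATUS = """\
health HEALTH_WARN
555 pgs degraded
555 pgs stuck degraded
1536 pgs stuck unclean
555 pgs stuck undersized
555 pgs undersized
pgmap v449: 1920 pgs, 10 pools, 1588 bytes data, 1116 objects
"""


def parse_pg_summary(status_text):
    """Return (health, total_pgs, {state: count}) from plain-text status output.

    Hypothetical helper: it only understands the line shapes seen in this report.
    """
    health = None
    total_pgs = None
    states = {}
    for raw in status_text.splitlines():
        line = raw.strip()
        m = re.match(r"health\s+(HEALTH_\w+)", line)
        if m:
            health = m.group(1)
            continue
        m = re.match(r"pgmap\s+v\d+:\s+(\d+)\s+pgs", line)
        if m:
            total_pgs = int(m.group(1))
            continue
        m = re.match(r"(\d+)\s+pgs\s+(.+)$", line)
        if m:
            # e.g. "1536 pgs stuck unclean" -> states["stuck unclean"] = 1536
            states[m.group(2)] = int(m.group(1))
    return health, total_pgs, states


if __name__ == "__main__":
    health, total, states = parse_pg_summary(SAMPLE_STATUS)
    print(health, total)
    for state, count in sorted(states.items()):
        print(f"{count:5d} of {total} pgs {state}")
```

With the figures above, 1536 of the 1920 PGs are reported as stuck unclean and 555 as undersized.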
ceph osd tree reports the following:
[wrsroot@
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-7 8.21172 root default
-6 1.45279 host storage-2
4 0.72639 osd.4 up 1.00000 1.00000
5 0.72639 osd.5 up 1.00000 1.00000
-8 2.25298 host storage-3
9 1.81749 osd.9 up 1.00000 1.00000
8 0.43549 osd.8 up 1.00000 1.00000
-9 2.25298 host storage-5
11 1.81749 osd.11 up 1.00000 1.00000
10 0.43549 osd.10 up 1.00000 1.00000
-10 2.25298 host storage-4
7 1.81749 osd.7 up 1.00000 1.00000
6 0.43549 osd.6 up 1.00000 1.00000
-2 0 root cache-tier
-1 2.90558 root storage-tier
-3 2.90558 chassis group-0
-4 1.45279 host storage-0
0 0.72639 osd.0 up 1.00000 1.00000
1 0.72639 osd.1 up 1.00000 1.00000
-5 1.45279 host storage-1
2 0.72639 osd.2 up 1.00000 1.00000
3 0.72639 osd.3 up 1.00000 1.00000
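Because the indentation of the tree was lost when it was pasted, it can be easier to read once hosts are grouped under their CRUSH root. The sketch below does that for the listing above, relying only on line order (each host belongs to the most recent root line, each OSD to the most recent host line). `SAMPLE_TREE` and `hosts_by_root` are hypothetical names, and the parser assumes this exact column layout.

```python
# Tree rows copied from the "ceph osd tree" output in this report (header omitted).
SAMPLE_TREE = """\
-7 8.21172 root default
-6 1.45279 host storage-2
4 0.72639 osd.4 up 1.00000 1.00000
5 0.72639 osd.5 up 1.00000 1.00000
-8 2.25298 host storage-3
9 1.81749 osd.9 up 1.00000 1.00000
8 0.43549 osd.8 up 1.00000 1.00000
-9 2.25298 host storage-5
11 1.81749 osd.11 up 1.00000 1.00000
10 0.43549 osd.10 up 1.00000 1.00000
-10 2.25298 host storage-4
7 1.81749 osd.7 up 1.00000 1.00000
6 0.43549 osd.6 up 1.00000 1.00000
-2 0 root cache-tier
-1 2.90558 root storage-tier
-3 2.90558 chassis group-0
-4 1.45279 host storage-0
0 0.72639 osd.0 up 1.00000 1.00000
1 0.72639 osd.1 up 1.00000 1.00000
-5 1.45279 host storage-1
2 0.72639 osd.2 up 1.00000 1.00000
3 0.72639 osd.3 up 1.00000 1.00000
"""


def hosts_by_root(tree_text):
    """Map root name -> {host name -> [(osd name, weight), ...]}.

    Hypothetical helper: hierarchy is inferred from line order only,
    and bucket types other than root and host (e.g. chassis) are skipped.
    """
    roots = {}
    current_root = None
    current_host = None
    for line in tree_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        if fields[2] == "root":
            current_root = fields[3]
            roots[current_root] = {}
            current_host = None
        elif fields[2] == "host" and current_root is not None:
            current_host = fields[3]
            roots[current_root][current_host] = []
        elif fields[2].startswith("osd.") and current_host is not None:
            roots[current_root][current_host].append((fields[2], float(fields[1])))
    return roots


if __name__ == "__main__":
    for root, hosts in hosts_by_root(SAMPLE_TREE).items():
        weight = sum(w for osds in hosts.values() for _, w in osds)
        print(f"{root}: hosts={sorted(hosts)} osd_weight={weight:.5f}")
```

In this listing the storage-tier root holds only storage-0 and storage-1, while storage-2 through storage-5 sit under root default. Whether that placement is related to the undersized PGs is not established in this report.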
Severity
--------
Major: backup and restore (B&R) fails when using more than 2 storage nodes
Steps to Reproduce
------------------
On a system with more than 2 storage nodes, run a backup and restore (B&R)
Expected Behavior
------------------
No Ceph health warning should occur
Actual Behavior
----------------
See above: Ceph remains in HEALTH_WARN with PGs stuck degraded, undersized, and unclean
Reproducibility
---------------
100% reproducible with >2 storage nodes
System Configuration
-------
Dedicated storage config with >2 storage nodes
Branch/Pull Time/Commit
-------
Any StarlingX
Timestamp/Logs
--------------
n/a
Changed in starlingx: | |
assignee: | nobody → Daniel Badea (daniel.badea) |
description: | updated |
tags: | added: stx.2018.10 stx.config |
Changed in starlingx: | |
status: | New → Triaged |
importance: | Undecided → Medium |
tags: | added: stx.1.0 removed: stx.2018.10 |
After further investigation, it was concluded that this is not an issue in StarlingX master. The issue was opened in error. Marking as Invalid based on review with Frank Miller.