Activity log for bug #1789908

Date                 Who           What changed           Old value               New value                    Message
2018-08-30 14:04:20  Frank Miller  bug                    -                       -                            added bug
2018-08-30 14:04:59  Frank Miller  starlingx: assignee    -                       Daniel Badea (daniel.badea)
2018-08-30 14:46:46  Frank Miller  description            updated                 updated

The updated description, reproduced in full below, differs from the previous text only in the added
attribution line "The below is taken from Maria Yousaf's testing:". A diagnostic note based on this
output follows the log.

Brief Description
-----------------
The below is taken from Maria Yousaf's testing:

During backup and restore, I noticed ceph was in health warn state as follows and appears to be stuck:

[wrsroot@controller-0 scratch(keystone_admin)]$ ceph -s
    cluster 2d62cbb0-2f6c-4382-a4ea-a024c0dc166e
     health HEALTH_WARN
            555 pgs degraded
            555 pgs stuck degraded
            1536 pgs stuck unclean
            555 pgs stuck undersized
            555 pgs undersized
     monmap e1: 3 mons at {controller-0=192.168.215.103:6789/0,controller-1=192.168.215.104:6789/0,storage-0=192.168.215.105:6789/0}
            election epoch 6, quorum 0,1,2 controller-0,controller-1,storage-0
     osdmap e82: 12 osds: 12 up, 12 in; 981 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v449: 1920 pgs, 10 pools, 1588 bytes data, 1116 objects
            460 MB used, 11383 GB / 11384 GB avail
                 561 active+remapped
                 555 active+undersized+degraded
                 420 active
                 384 active+clean

ceph osd tree reports the following:

[wrsroot@controller-0 scratch(keystone_admin)]$ ceph osd tree
 ID WEIGHT  TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -7 8.21172 root default
 -6 1.45279     host storage-2
  4 0.72639         osd.4               up  1.00000          1.00000
  5 0.72639         osd.5               up  1.00000          1.00000
 -8 2.25298     host storage-3
  9 1.81749         osd.9               up  1.00000          1.00000
  8 0.43549         osd.8               up  1.00000          1.00000
 -9 2.25298     host storage-5
 11 1.81749         osd.11              up  1.00000          1.00000
 10 0.43549         osd.10              up  1.00000          1.00000
-10 2.25298     host storage-4
  7 1.81749         osd.7               up  1.00000          1.00000
  6 0.43549         osd.6               up  1.00000          1.00000
 -2 0       root cache-tier
 -1 2.90558 root storage-tier
 -3 2.90558     chassis group-0
 -4 1.45279         host storage-0
  0 0.72639             osd.0           up  1.00000          1.00000
  1 0.72639             osd.1           up  1.00000          1.00000
 -5 1.45279         host storage-1
  2 0.72639             osd.2           up  1.00000          1.00000
  3 0.72639             osd.3           up  1.00000          1.00000

Severity
--------
Major: B&R fails when using more than 2 storage nodes

Steps to Reproduce
------------------
With more than 2 storage nodes, execute a B&R

Expected Behavior
-----------------
No CEPH health warning should occur

Actual Behavior
---------------
see above

Reproducibility
---------------
100% reproducible with >2 storage nodes

System Configuration
--------------------
Dedicated storage config with >2 storage nodes

Branch/Pull Time/Commit
-----------------------
Any StarlingX

Timestamp/Logs
--------------
n/a

2018-08-31 19:12:25  Ghada Khalil  tags                   -                       stx.2018.10 stx.config
2018-08-31 19:12:37  Ghada Khalil  starlingx: status      New                     Triaged
2018-08-31 19:13:16  Ghada Khalil  starlingx: importance  Undecided               Medium
2018-09-06 21:18:11  Ghada Khalil  starlingx: status      Triaged                 Invalid
2019-04-06 16:14:38  Ken Young     tags                   stx.2018.10 stx.config  stx.1.0 stx.config
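
Diagnostic note: one possible reading of the ceph osd tree output in the description above (an
inference from that output, not a conclusion stated in this log) is that the storage-2 through
storage-5 host buckets ended up under root default rather than under root storage-tier after the
restore, so pools whose CRUSH rule draws from storage-tier can only place replicas on storage-0
and storage-1, leaving PGs undersized and degraded. The commands below are a minimal sketch of how
one might verify this and move a misplaced host bucket back; the storage-tier and group-0 bucket
names are taken from the output above, and this is not presented as the confirmed root cause or
fix for this bug.

# Check which CRUSH root each pool's placement rule draws from
ceph osd crush rule dump

# Inspect the stuck placement groups in more detail
ceph health detail
ceph pg dump_stuck unclean

# Hypothetical remediation sketch: move a misplaced host bucket back under the
# storage-tier hierarchy (repeat per misplaced host), then re-check cluster health
ceph osd crush move storage-2 root=storage-tier chassis=group-0
ceph -s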