Replied via email, see inline [Ovi] tag:

@Ovidiu, checking your comments from LP1827529, a couple of questions:

- 2. "On a storage setup (i.e. one that has storage nodes) users can go from 2 to 3 if there are less than 2 storage nodes deployed."

Going by your comment about "having less than 2 storage nodes", I was thinking of the following scenario:

Scenario 1) Two storage nodes in a dedicated storage setup with at least 3 osd.[x]s, meaning we can store data on osd.0 and have it replicated on osd.1 and osd.2:

[wrsroot@controller-0 ~(keystone_admin)]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME              STATUS REWEIGHT PRI-AFF
-1       2.17743 root storage-tier
-3       1.30646     chassis group-0
-4       0.43549         host storage-0
 0   ssd 0.43549             osd.0          up  1.00000 1.00000
 1   ssd 0.43549             osd.1          up  1.00000 1.00000
-5       0.87097         host storage-1
 2   ssd 0.43549             osd.2          up  1.00000 1.00000

***Could you please share if there is any other scenario?

[Ovi] Note that data on osd.0 does not replicate on osd.1. Replication is done per node, not per OSD, so data from osd.0 and osd.1 will get replicated on osd.2. Data is divided into small chunks called placement groups (PGs); PGs get replicated, not OSDs, so there is no corresponding replicated OSD. You can't say that osd.x is replicating on osd.y, but you can say that PG1 on osd.0 gets replicated to osd.2, PG2 on osd.0 gets replicated on osd.2, PG3 on osd.1 gets replicated on osd.2, and so on. In this case you have replication 2. If you had replication 3 you would get something like the output below, and data from osd.0 & osd.1 would be replicated on osd.2 and on osd.3:

[wrsroot@controller-0 ~(keystone_admin)]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME              STATUS REWEIGHT PRI-AFF
-1       2.17743 root storage-tier
-3       1.30646     chassis group-0
-4       0.43549         host storage-0
 0   ssd 0.43549             osd.0          up  1.00000 1.00000
 1   ssd 0.43549             osd.1          up  1.00000 1.00000
-5       0.87097         host storage-1
 2   ssd 0.43549             osd.2          up  1.00000 1.00000
-6       0.87097         host storage-2
 3   ssd 0.43549             osd.3          up  1.00000 1.00000

- 2. "No more than two storage nodes provisioned; once the 3rd storage node is provisioned users should no longer be allowed to increase the replication number, nor to go back (the reason is that with replication 2 the 3rd node is part of a different replication group and it is impossible to go back w/o losing data or performing complex operations)."

Based on your comments, should we be thinking of the following prerequisites? Could you please confirm?

Prerequisites to enable replication factor 3:
o Have 2 storage nodes with at least 3 OSDs up.
  [Ovi] At least one OSD per storage node. There is no need for more than one.
o Remove the SB_TASK_RECONFIG_CONTROLLER state since it is no longer required. Does this mean there should be a fix/commit for this change?
  [Ovi] Yes, it's not related to testing :)
o All storage nodes should be in OK status.
  [Ovi] Yes.
o And after that run the "$ system storage-backend-modify ceph-store replication=3 min_replication=2" command?
  [Ovi] Yes.

@Ovidiu, could you please confirm whether adding the test cases below makes sense to you?

- Add a negative test case where replication factor 3 is not allowed on DX and standard (a.k.a. 2+2 w/o storage nodes).
  [Ovi] I confirm.

Regarding storage modes, we should be adding the following negative test cases:

- For the AIO-SX model, confirm replication is made on OSDs.
  [Ovi] I confirm. Btw, if AIO-SX has 2 OSDs and the user goes from replication 2 to 3, the cluster will be set to HEALTH_WARN until the 3rd OSD is installed.
- For the Controller model, confirm two Ceph monitors are on the controllers and the 3rd one is on a worker, and make sure that after that the user won't be able to install more storage nodes.
  [Ovi] I confirm.
- For the storage model, once storage-0 is added the user can no longer add monitors to a compute nor OSDs to controllers. Meaning, can we do it if we first create OSDs on the controllers and then add the storage-0 node?
  [Ovi] No, once OSDs are added to the controllers, storage node install should not be allowed (a test to check for this is worth adding).
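
A rough sketch of how that storage-model negative test could be driven from the sysinv CLI (command names assume the standard StarlingX "system" client; the disk UUID and host id are placeholders, and the exact point at which the storage node provisioning gets rejected is an assumption to be confirmed during testing):

[wrsroot@controller-0 ~(keystone_admin)]$ system ceph-mon-list                               # confirm where the Ceph monitors currently run
[wrsroot@controller-0 ~(keystone_admin)]$ system host-disk-list controller-0                 # pick a free disk on the controller
[wrsroot@controller-0 ~(keystone_admin)]$ system host-stor-add controller-0 <disk-uuid>      # add an OSD to a controller first
[wrsroot@controller-0 ~(keystone_admin)]$ system host-update <host-id> personality=storage   # provisioning a storage node afterwards is expected to be rejected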
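
Similarly, for reference, a minimal sketch of how the replication change from the prerequisites above could be verified after running storage-backend-modify (the pool name cinder-volumes is only an example and will differ per deployment; on AIO-SX with only 2 OSDs the last command is where the HEALTH_WARN mentioned above should show up):

[wrsroot@controller-0 ~(keystone_admin)]$ system storage-backend-show ceph-store      # replication / min_replication should now read 3 / 2
[wrsroot@controller-0 ~(keystone_admin)]$ ceph osd pool ls                            # pools that should pick up the new size
[wrsroot@controller-0 ~(keystone_admin)]$ ceph osd pool get cinder-volumes size       # per-pool replica count
[wrsroot@controller-0 ~(keystone_admin)]$ ceph -s                                     # expect HEALTH_WARN until the 3rd OSD / storage node is available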
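
And for the PG-to-OSD mapping Ovidiu describes above, it can be inspected directly instead of inferred from the tree output (the pool name cinder-volumes and the object name test-object are only examples):

[wrsroot@controller-0 ~(keystone_admin)]$ ceph pg dump pgs_brief | head               # the UP/ACTING columns list the OSD set each PG is replicated across
[wrsroot@controller-0 ~(keystone_admin)]$ ceph osd map cinder-volumes test-object     # which PG a given object hashes to and which OSDs hold it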