Comment 9 for bug 1430845

Mykola Golub (mgolub) wrote :

Failure is expected in your case. You had a replication factor of 2 (3 is recommended for production), and you ended up with a cluster of 10 OSDs of which only 4 were up and in, so there was a high probability that both replicas of some data were on the lost nodes.
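To put a rough number on that probability, here is a minimal sketch. It assumes replicas are placed uniformly at random on distinct OSDs and that the 6 missing OSDs were all lost before any recovery completed; real CRUSH placement (host-level failure domains, weights) will differ, and the function name is only illustrative. Under these assumptions, a given placement group loses every copy with probability 1/3 at replication factor 2 and 1/6 at replication factor 3, and with many placement groups it is very likely that at least one is affected.

```python
from fractions import Fraction
from math import comb


def prob_all_replicas_lost(total_osds, lost_osds, replicas):
    """Probability that every replica of one placement group sits on a lost OSD.

    Assumption (not from the bug report): replicas are placed uniformly at
    random on distinct OSDs, ignoring CRUSH rules and OSD weights.
    """
    if not 0 <= lost_osds <= total_osds:
        raise ValueError("lost_osds must be between 0 and total_osds")
    if not 1 <= replicas <= total_osds:
        raise ValueError("replicas must be between 1 and total_osds")
    # Count the replica sets that fall entirely within the lost OSDs,
    # divided by all possible replica sets.
    return Fraction(comb(lost_osds, replicas), comb(total_osds, replicas))


# 10 OSDs in total, 4 still up and in, so 6 lost.
p_size2 = prob_all_replicas_lost(10, 6, 2)
p_size3 = prob_all_replicas_lost(10, 6, 3)
print("replication factor 2:", p_size2)
print("replication factor 3:", p_size3)
```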

After adding a node, no other changes should be made to the cluster until rebalancing is complete. It is also recommended to remove OSDs one at a time.

The proposed ceph.conf options decrease the load during rebalancing and recovery. Still, you should not expect much from clusters like yours (several OSDs running on virtual machines).