Comment 6 for bug 1827119

Ovidiu Poncea (ovidiuponcea) wrote:

Re #1. We got 3 answers on the Ceph mailing list:

1. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/034672.html
2. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/034674.html
3. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/034709.html

Summary:
1. Under normal conditions the mon data size is small (~1.5 GB).
2. Ceph-mon data grows if OSDs misbehave (e.g. OSDs down, nodes down) for a long time, because the cluster needs to keep previous map epochs for replay once the misbehaving OSDs rejoin the cluster.
3. Once replay is done, the old data is cleaned up and the space is released (so no leakage, which is good!).
4. There has to be enough space for replays; the recommendation is ~64 GB, but it depends on cluster size (note that our clusters are quite small: 4-8 storage nodes, each with 4 OSDs, is a small cluster from Ceph's perspective). As you said, Tingjie, it is better to raise a warning and let the user take action (increase the ceph-mon partition size and fix the error condition); see the sketch below.
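For illustration only, a minimal sketch of the kind of size check such a warning could be based on. It assumes the default mon data path /var/lib/ceph/mon and a 15 GB threshold (which I believe matches Ceph's own mon_data_size_warn default); the names and threshold here are placeholders, not the final implementation:

    #!/usr/bin/env python
    # Sketch: warn when the ceph-mon store grows past a threshold.
    # Assumptions (illustrative): mon data lives under /var/lib/ceph/mon,
    # WARN_GB would be configurable in a real implementation.
    import os

    MON_DATA_DIR = "/var/lib/ceph/mon"   # default Ceph mon data location
    WARN_GB = 15                         # placeholder threshold

    def dir_size_bytes(path):
        """Walk the directory tree and sum file sizes."""
        total = 0
        for root, _dirs, files in os.walk(path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # file may disappear while walking (e.g. during compaction)
        return total

    if __name__ == "__main__":
        size_gb = dir_size_bytes(MON_DATA_DIR) / float(1024 ** 3)
        if size_gb > WARN_GB:
            print("WARNING: mon store is %.1f GB (> %d GB); consider increasing "
                  "the ceph-mon partition and fixing down OSDs/nodes" % (size_gb, WARN_GB))
        else:
            print("mon store size OK: %.1f GB" % size_gb)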

The conclusion is that we still need the resize, so #1 is out of the way. The initial 20 GB is OK but has to be resizable.
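For context, a hedged sketch of what the resize itself boils down to, assuming the mon store sits on an LVM logical volume with an ext4 filesystem (the LV path below is illustrative, not necessarily what our implementation will use):

    # Sketch: grow the ceph-mon logical volume and its filesystem.
    import subprocess

    CEPH_MON_LV = "/dev/cgts-vg/ceph-mon-lv"  # hypothetical LV path

    def resize_mon_fs(new_size_gb):
        # Grow the LV to the requested absolute size, then grow the ext4 filesystem.
        subprocess.check_call(["lvextend", "-L", "%dG" % new_size_gb, CEPH_MON_LV])
        subprocess.check_call(["resize2fs", CEPH_MON_LV])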