On Tue, May 29, 2018 at 05:57:56AM -0000, Trent Lloyd wrote:
> The currently committed scheme to keep track of which osd-devices were
> already formatted is likely not a sufficient safe guard and needs
> further improvement. Reasons are
>
> (1) Device letters change (i.e. when adding a new disk, but it comes up
> earlier in the device list order). This is very common on production
> hardware, not a side case. Devices are also not guaranteed to come up in
> the same order. It would be easy for an existing OSD to get pushed off
> into a device name that is not listed or previously initialised and
> accidentally get re-initialised.
If a user is concerned about this, they should use /dev/disk/by-path or /dev/disk/by-id paths in the osd-devices option. I don't believe we restrict the use of those paths; we only check for the '/dev/' prefix to decide whether the argument is a directory or a disk device.
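To illustrate why the stable paths help, here is a minimal sketch, not the charm's actual code. Both function names are hypothetical. It shows a /dev/disk/by-id alias being resolved to whatever kernel name the disk has on the current boot, alongside a prefix check of the kind described above.

```python
import os


def resolve_device(path):
    """Return the device node that a stable alias currently points to.

    /dev/disk/by-id and /dev/disk/by-path entries are symlinks, so the
    alias stays the same even when the kernel name (sdb, sdc, ...) moves.
    """
    return os.path.realpath(path)


def is_device_path(path):
    """Hypothetical stand-in for the '/dev/' prefix check mentioned above."""
    return path.startswith('/dev/')
```

With this approach the operator lists the alias in osd-devices, and the kernel name is only looked up at the moment it is needed.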
>
> (2) New OSDs are added to the host but manually with ceph commands (in a
> way this is similar to the above). This has happened on production
> deployments when the add-disk action command was broken.
We could add an action to mark a device as osdized, which would allow an operator to merge the charm's view with what the system is actually running.
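A rough sketch of what such an action could do, assuming the charm keeps a plain list of already-initialised device paths. The function names and the storage format are assumptions for illustration, not existing charm code.

```python
import os


def mark_osdized(known_devices, device):
    """Return a new sorted list with `device` recorded as initialised.

    Hypothetical helper: the path is resolved first so that an alias and
    its kernel name are recorded as the same device. Calling it twice
    with the same device changes nothing.
    """
    resolved = os.path.realpath(device)
    return sorted(set(known_devices) | {resolved})


def needs_osdize(known_devices, device):
    """True only if `device` has not been recorded as initialised."""
    return os.path.realpath(device) not in set(known_devices)
```

An operator would run the action once per manually added OSD, after which the charm would skip that device.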
> However even if you use the add-disk action, the device may not be
> listed in the osd-devices config option if the action was taken
> before the charm stored the list of previously initialised osd-devices
> (the action uses osdize(), so any new deployments would populate it -
> but old deployments won't).
As I mentioned in comment #3, the upgrade-charm hook needs to populate the osd-devices list.
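Roughly, the upgrade step could seed the stored list from the config value. This is only a sketch: the function name is hypothetical, and it assumes osd-devices is a whitespace-separated string of paths.

```python
def seed_known_devices(osd_devices_option, already_known=()):
    """Merge the configured osd-devices into the stored list.

    Hypothetical helper for an upgrade-charm hook. It only knows about
    devices named in the config option and is safe to run repeatedly.
    """
    configured = osd_devices_option.split()
    return sorted(set(already_known) | set(configured))
```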
> Even if we added code to populate this value on existing clusters from
> the osd-devices config option, OSDs that are in use but not in the
> config option won't be added to the previously initialised list and
> thus are still vulnerable to being reformatted.
osd-reformat is not an option someone should have set in this case, and in conjunction with the action I suggested above, a user could get to a state where the charm is in sync with the running system.
>
> (3) This is slightly out of use case for the charms, but something I can
> still see people doing and it would be ideal to avoid data loss from is
> also moving an existing OSD from one machine to another to recover it,
> which is something explicitly supported and generally recommended as a
> perfectly reasonable action within the Ceph community.
In this case the user should have osd-reformat set to false (or "").
[...]
> Having said all of that, I think that this option should be removed and instead replaced with a juju action to one-shot reformat OSDs.
This has been discussed, but the lack of support in bundles for running actions at the end of a deployment makes it not an option.