[wishlist] Action to 'purge-osd' and 'set-osd-out' needed for fully-charmed disk lifecycle management

Bug #1813360 reported by Drew Freiberger
This bug affects 1 person
Affects: Ceph Monitor Charm
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none)

Bug Description

When replacing an OSD, best practice is to set the disk out and then either purge it (Luminous and later) or remove it from the CRUSH, OSD, and auth maps (pre-Luminous).

First, we need the ability to set a single OSD out/down. The ceph-osd charm's osd-out action can set all of a node's OSDs out, but either ceph-mon or ceph-osd needs an action that takes only a single failing disk out of the cluster, whether ahead of a planned replacement or in response to a failure.

Second, we need to be able to purge/remove the disk from the maps, so that the ceph-osd charm's add-disk action can be used once the disk has been replaced.

Here's a typical process today:

1. juju ssh ceph-mon/0 sudo ceph osd out $OSD_NAME (e.g. osd.26)
2. juju ssh -t ceph-mon/0 sudo watch ceph status
       * Wait for this to show HEALTH_OK and no "recovery/backfill" lines
3. juju ssh ceph-mon/0 sudo ceph osd purge $OSD_ID --yes-i-really-mean-it
       * Note: this command is Luminous-only. The pre-Luminous fallback is ceph osd crush remove, ceph auth del, then ceph osd rm, as noted in [1].
4. juju run-action --wait $OSD_UNIT zap-disk devices=<path-to-dead-disk>
       * Be VERY SURE, as this will completely destroy the data on the drive. If you've already added the disk back into the cluster with the same ID it had before (i.e. it was and still is osd.26), do not run this command. Instead, you'll need to use LVM commands to remove the VG and PV directly.
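
Step 2 currently depends on an operator watching the output. A minimal sketch of how that check might be automated, assuming the status text is piped in on stdin: the function name cluster_settled is hypothetical, and the "recovery"/"backfill" matching only mirrors the wording of step 2, not a documented output format.

```shell
# Hypothetical helper: reads `ceph status` text on stdin and succeeds only
# when it reports HEALTH_OK and contains no recovery/backfill lines.
cluster_settled() {
    status_text=$(cat)
    # The output must report HEALTH_OK somewhere.
    printf '%s\n' "$status_text" | grep -q 'HEALTH_OK' || return 1
    # The output must not mention recovery or backfill activity.
    if printf '%s\n' "$status_text" | grep -Eqi 'recovery|backfill'; then
        return 1
    fi
    return 0
}

# Possible usage (assumes the ceph-mon/0 unit from the steps above):
#   until juju ssh ceph-mon/0 sudo ceph status | cluster_settled; do sleep 30; done
```

A charm action would more likely query the cluster directly than parse text, so treat this only as an illustration of the condition to wait for before purging.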

[1] http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-the-osd
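
The version split in step 3 could be captured in a small helper that prints the removal commands rather than running them. This is a sketch only: the function name osd_removal_cmds and its luminous/pre-luminous argument are hypothetical, and the pre-Luminous commands follow the manual removal procedure in [1].

```shell
# Hypothetical helper: prints (does not run) the commands that remove an OSD
# from the cluster maps.
#   $1 = numeric OSD ID (e.g. 26)
#   $2 = "luminous" (default) or "pre-luminous"
osd_removal_cmds() {
    osd_id="$1"
    release="${2:-luminous}"
    case "$release" in
        luminous)
            # Single command that removes the OSD from all three maps.
            echo "ceph osd purge ${osd_id} --yes-i-really-mean-it"
            ;;
        pre-luminous)
            # Manual equivalent: CRUSH map, auth key, then the OSD entry.
            echo "ceph osd crush remove osd.${osd_id}"
            echo "ceph auth del osd.${osd_id}"
            echo "ceph osd rm ${osd_id}"
            ;;
        *)
            echo "unknown release: ${release}" >&2
            return 1
            ;;
    esac
}
```

For example, osd_removal_cmds 26 pre-luminous prints the three commands to run on a monitor node, which can be reviewed before being executed via juju ssh.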

James Page (james-page)
Changed in charm-ceph-mon:
status: New → Triaged
importance: Undecided → Wishlist