Atomic disk handling

Bug #1830680 reported by Wouter van Bommel
This bug affects 1 person
Affects: Ceph OSD Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

When the vault charm is used in combination with ceph-osd, the ceph-osd charm automatically encrypts disks and creates a PV / VG / LV on top of the encrypted device.
But if something goes wrong along the way (e.g. the bluestore LV has no space left), it does not undo the OSD parts it has already created.

Running the zap-disk action does not help, as the kernel keeps the keyring / LVM information in memory until a reboot. So to continue adding the disk without a reboot, manual intervention is needed.

If disk addition were made more atomic (multi-phase commit), it would be possible to recover from errors without manual intervention (which carries the risk of mixing up IDs).

Alex Kavanagh (ajkavanagh) wrote :

@Wouter, it's not clear (to me) exactly how you would see this working. Please could you describe the steps/commands that were taken, along with the resultant scenario that led to the bug report, and then what you had to do to resolve it manually?

Also, could you indicate your preferred set of steps/actions (or what you think the charm should do) that would have removed the need to manually jump onto the host?

Many thanks.

Changed in charm-ceph-osd:
status: New → Incomplete
Wouter van Bommel (woutervb) wrote :

Hi Alex,

The step I took was "juju add-disk ceph-osd devices='/dev/nvme6p1'". This step failed as there was no room in the bluestore to add another db & wal LVM volume.

This left the disk /dev/nvme6p1 unusable, and to get it back to a state where I could re-add it to the charm, I had to take the following actions:

    * lvremove -v /dev/ceph-b432de26-c075-4097-b699-1cf0f41f7402/osd-block-b432de26-c075-4097-b699-1cf0f41f7402
    * vgremove -v ceph-b432de26-c075-4097-b699-1cf0f41f7402
    * pvremove -v /dev/mapper/crypt-b432de26-c075-4097-b699-1cf0f41f7402
    * cryptsetup close /dev/mapper/crypt-b432de26-c075-4097-b699-1cf0f41f7402
    * partprobe /dev/nvme6n1

The last step was just to make sure that all was fine.
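
For illustration, the manual cleanup above could be scripted roughly as follows. This is only a sketch: the function names `cleanup_commands` and `cleanup` are hypothetical, and it assumes (as in the commands listed) that the same UUID appears in the LV, VG and crypt mapper names. It uses exactly the commands and flags from the list, so `lvremove` may still ask for confirmation when run interactively.

```python
import subprocess


def cleanup_commands(osd_uuid, disk):
    """Build the teardown commands from the list above, in the same order.

    Assumption: one UUID is shared by the LV, VG and crypt mapper names.
    """
    vg = f"ceph-{osd_uuid}"
    mapper = f"/dev/mapper/crypt-{osd_uuid}"
    return [
        ["lvremove", "-v", f"/dev/{vg}/osd-block-{osd_uuid}"],
        ["vgremove", "-v", vg],
        ["pvremove", "-v", mapper],
        ["cryptsetup", "close", mapper],
        ["partprobe", disk],
    ]


def cleanup(osd_uuid, disk, run=subprocess.check_call):
    """Run each command in turn; stops at the first one that fails."""
    for cmd in cleanup_commands(osd_uuid, disk):
        run(cmd)
```

Passing a different `run` callable (for example `print`) shows the commands without executing them, which is a safer way to check them before touching a real disk.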

So it would be great if, when the addition of a disk fails due to e.g. a problem with the bluestore, the whole creation of the LV, VG, PV & crypto device were reversed.

Alex Kavanagh (ajkavanagh) wrote :

@Wouter, thanks very much for the detailed steps; it's now clear (to me at least!) what you mean by the multi-phase commit in terms of the actions taken.

TRIAGE:

It seems clear that the charm should validate that each previous step completed properly before moving on, and also attempt to 'un-do' any actions if they fail. A series of nested "try: except:" statements would enable the except handling to undo the current action, and the nested ones to undo theirs. Only if no exception is thrown would everything take effect. (Obviously, there's the issue of handling exceptions in the 'undo' phase, but that 'should' be easier.)
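
A minimal sketch of the nested "try: except:" idea, not the charm's actual code. The `actions` object and its method names (`open_crypt`, `create_pv_vg`, `create_lv`, `prepare_osd` and their undo counterparts) are hypothetical, and `FakeActions` only records calls so the ordering can be seen without touching real devices. In this sketch a step that fails is not itself undone; only the steps that completed before it are rolled back, in reverse order, and the original exception is re-raised.

```python
def add_disk(actions):
    """Run the creation steps; on failure undo the completed ones in reverse."""
    actions.open_crypt()
    try:
        actions.create_pv_vg()
        try:
            actions.create_lv()
            try:
                actions.prepare_osd()
            except Exception:
                actions.remove_lv()      # undo create_lv
                raise
        except Exception:
            actions.remove_pv_vg()       # undo create_pv_vg
            raise
    except Exception:
        actions.close_crypt()            # undo open_crypt
        raise


class FakeActions:
    """Hypothetical stand-in: logs each call, optionally failing on one."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on
        self.log = []

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. the step methods.
        def call():
            if name == self.fail_on:
                raise RuntimeError(f"{name} failed")
            self.log.append(name)
        return call


if __name__ == "__main__":
    fake = FakeActions(fail_on="prepare_osd")
    try:
        add_disk(fake)
    except RuntimeError as exc:
        print("failed:", exc)
    print(fake.log)
```

This does not address exceptions raised inside the undo handlers themselves, which is the open issue noted above.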

Changed in charm-ceph-osd:
status: Incomplete → Triaged
importance: Undecided → Medium