Don't initialize cryptsetup if vault cannot be reached

Bug #1818165 reported by Wouter van Bommel
This bug affects 1 person
Affects          Status         Importance  Assigned to  Milestone
Ceph OSD Charm   Invalid        Undecided   Unassigned
vaultlocker      Fix Committed  Medium      Unassigned

Bug Description

Currently ceph will start encrypting the disks as soon as a relation with a vault exists.

If the vault then cannot be reached, we end up with encrypted disks whose key was never saved. Recovery is hard, as this basically means that all the disks have to be 'unformatted' and re-added to the cluster.

tags: added: canonical-bootstack
Wouter van Bommel (woutervb) wrote :

For reference I added the resulting lsblk of this host here: https://pastebin.canonical.com/p/Mwfzw4tzXY/

This host is suffering from a hook failed: "secrets-storage-relation-joined" status in juju, which was caused by network issues not detected before the ceph charm was added.

Paul Collins (pjdc) wrote :

This isn't a total disaster; you can get the master key from dmsetup table --showkeys while the mapping still exists, and cryptsetup luksAddKey has a --master-key-file parameter you can use to add a known passphrase for adding to vault. (I'm not sure if --master-key-file wants hex-encoded data, as printed by dmsetup, or binary.)
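
To illustrate the first half of that recovery path, here is a minimal Python sketch that pulls the hex key out of one line of dmsetup table --showkeys output and converts it to raw bytes. It assumes the usual dm-crypt table layout, where the cipher and then the key follow the "crypt" target name. To my knowledge cryptsetup expects the master key file to hold the raw binary key rather than hex, but verify that against your cryptsetup version. The function name and the sample device name are hypothetical.

```python
def master_key_from_dmsetup_line(line: str) -> bytes:
    """Extract the dm-crypt key from one `dmsetup table --showkeys` line.

    Assumed layout: [name:] start length crypt <cipher> <key> ...
    """
    tokens = line.split()
    if "crypt" not in tokens:
        raise ValueError("not a dm-crypt table line")
    idx = tokens.index("crypt")
    if len(tokens) < idx + 3:
        raise ValueError("table line is too short to contain a key")
    hex_key = tokens[idx + 2]
    if hex_key.startswith(":"):
        # Some setups keep the key in the kernel keyring and only show a
        # reference such as ":64:logon:..." instead of the key itself.
        raise ValueError("key is a keyring reference, not hex data")
    # bytes.fromhex raises ValueError if the token is not valid hex.
    return bytes.fromhex(hex_key)
```

The returned bytes could then be written to a file with restrictive permissions and passed to cryptsetup luksAddKey via --master-key-file; with 40 disks, looping over the mappings this way avoids copying keys by hand.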

Wouter van Bommel (woutervb) wrote :

Great, so that does not mean data loss at this point in time. But manually doing the passphrase setup on 40 disks is not something I am looking forward to.
As no data has been written to them yet, it would be great if there were a command that we could use to get this resolved.

James Page (james-page) wrote :

Does the zap-disk action not manage to purge any cryptsetup from the block devices?

If not we should probably fix that.

Also worth noting that the charm has already managed to contact Vault to retrieve the AppRole secret by the point in time that disk encryption started.

James Page (james-page) wrote :

Looking at the vaultlocker code, we could attempt to store and validate the key prior to actually using it to format the disk - this would leave the disk pristine in the event of a vault related error.
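
A rough sketch of that ordering, in Python: store the key, read it back to validate it, and only then touch the disk. This is not the vaultlocker implementation; the function name and the three callables (store_key, fetch_key, format_device) are hypothetical stand-ins for the Vault write, the Vault read, and the cryptsetup format step.

```python
import secrets
from typing import Callable


def store_validate_then_format(
    store_key: Callable[[str], None],
    fetch_key: Callable[[], str],
    format_device: Callable[[str], None],
) -> str:
    """Format the device only after the key is confirmed to be in the vault."""
    key = secrets.token_hex(32)  # hypothetical key format: 64 hex characters
    store_key(key)  # any exception here leaves the disk untouched
    if fetch_key() != key:
        raise RuntimeError("stored key does not match generated key")
    format_device(key)  # reached only once the key is safely stored
    return key
```

If the vault write or read-back fails, the function raises before format_device is ever called, so the disk stays pristine.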

Changed in charm-ceph-osd:
status: New → Invalid
Changed in vaultlocker:
status: New → Triaged
importance: Undecided → Medium
James Page (james-page) wrote :

The alternative is that vaultlocker could purge the block device if the storage of the key fails in vault for some reason - ensuring that the disk is left in a pristine state.
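
As a sketch of this alternative in Python: format first, and if saving the key fails, wipe the device again before re-raising the error. The function name and the callables (format_device, store_key, wipe_device) are hypothetical placeholders, not vaultlocker functions.

```python
from typing import Callable


def format_then_store_or_purge(
    key: str,
    format_device: Callable[[str], None],
    store_key: Callable[[str], None],
    wipe_device: Callable[[], None],
) -> None:
    """Format the device, then roll it back if the key cannot be stored."""
    format_device(key)
    try:
        store_key(key)
    except Exception:
        # The key could not be saved: return the disk to a pristine state,
        # then let the original error propagate to the caller.
        wipe_device()
        raise
```

Compared with storing the key first, this relies on the wipe step itself succeeding, so the cleanup path would need to be robust.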

Changed in vaultlocker:
assignee: nobody → Marco Filipe Moutinho da Silva (mfmsilva)
Aurelien Lourot (aurelien-lourot) wrote :
Changed in vaultlocker:
status: Triaged → In Progress
Changed in vaultlocker:
status: In Progress → Fix Committed
assignee: Marco Filipe Moutinho da Silva (mfmsilva) → nobody