curtin

Comment 14 for bug 1835091

Ryan Harper (raharper) wrote on 2019-07-08:

The shutdown plan requires that the config for a device being cleared not include preserve: true. The "raid" partition, vda3, is explicitly marked preserve: true:

  - device: disk-vda
    size: 20947402752
    flag: linux
    preserve: true
    type: partition
    id: partition-vda3

Curtin cannot clear the raid metadata from this partition unless wipe: is set and preserve is not present.

That said, I do think that curtin can do a few things to resolve this:

1) include partial raids in the discovered config, and can either
a) add a field, perhaps array_state, to indicate whether the array is healthy/partial/degraded
b) defer to subiquity to use curtin.block.mdadm.md_check to determine if it wants to include or mark members of the array with wipe so they can be used in other configs.

2) Update how we run clear-holders. Currently we only pass in a list of block devices of type disk which have 'wipe' set and do not have 'preserve' enabled. This fails in the case here, where we'd like to wipe vda3 but it has a holder.
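The filter described in item 2 can be sketched roughly as below. This is a minimal illustration, not curtin's actual code: the function name select_clear_targets, the types parameter, and the sample disk entry are all hypothetical.

```python
# Hypothetical helper illustrating the selection rule; not curtin's code.
def select_clear_targets(storage_config, types=("disk",)):
    """Return ids of entries of the given types that have 'wipe' set
    and do not have 'preserve' enabled."""
    return [
        item["id"]
        for item in storage_config
        if item.get("type") in types
        and item.get("wipe")
        and not item.get("preserve", False)
    ]


# Sample config; the disk entry's wipe setting is assumed for illustration.
config = [
    {"id": "disk-vda", "type": "disk", "wipe": "superblock"},
    {"id": "partition-vda3", "type": "partition", "device": "disk-vda",
     "wipe": "superblock"},
]

# Disks only: the partition is never handed to clear-holders.
disks_only = select_clear_targets(config)

# Widened to partitions: vda3 becomes a clear-holders target.
with_partitions = select_clear_targets(config, types=("disk", "partition"))
```

With only disks considered, partition-vda3 is skipped even when it carries wipe; widening the accepted types is what lets it through.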

Concretely, I'd expect curtin discover to return:

  - type: raid
    level: 0
    name: md127
    devices:
    - partition-vda3
    spare_devices: []
    array_state: failed
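As a sketch of how a consumer such as subiquity might act on an array_state field, the snippet below marks the members of any non-healthy array for wiping. The helper name mark_members_for_wipe and the set of "healthy" values are assumptions, since the field itself is only a proposal here.

```python
import copy


# Hypothetical consumer-side helper; array_state is a proposed field.
def mark_members_for_wipe(storage_config, healthy_states=("healthy",)):
    """Return a copy of the config in which members of non-healthy raids
    have 'preserve' removed and 'wipe: superblock' set."""
    config = copy.deepcopy(storage_config)
    by_id = {item["id"]: item for item in config if "id" in item}
    for item in config:
        if item.get("type") != "raid":
            continue
        state = item.get("array_state")
        # Leave arrays alone when the state is unknown or healthy.
        if state is None or state in healthy_states:
            continue
        members = item.get("devices", []) + item.get("spare_devices", [])
        for member_id in members:
            member = by_id.get(member_id)
            if member is None:
                continue
            member.pop("preserve", None)
            member["wipe"] = "superblock"
    return config


discovered = [
    {"id": "partition-vda3", "type": "partition", "device": "disk-vda",
     "size": 20947402752, "flag": "linux", "preserve": True},
    {"type": "raid", "level": 0, "name": "md127",
     "devices": ["partition-vda3"], "spare_devices": [],
     "array_state": "failed"},
]
updated = mark_members_for_wipe(discovered)
```

The input is left untouched and a modified copy is returned, so the caller can still compare against what was discovered.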

Now, to do that, probert will also need to include partial raids. I'm not sure how yet; it's odd that pyudev didn't have a /dev/md127 entry in the context.

The alternative for probert is to run some mdadm commands on the devices which have the ID_FSTYPE set to raid. I'll add a probert task for this bug as well.
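One way the mdadm alternative might look is sketched below, using mdadm --examine --export, which prints KEY=VALUE pairs for a member device. The function names are hypothetical, this is not probert's code, and the sample text is illustrative rather than captured output.

```python
import subprocess


def parse_export(text):
    """Parse KEY=VALUE lines, as printed by 'mdadm --examine --export'."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def examine_member(devpath):
    """Run mdadm against a raid member (needs root and mdadm installed)."""
    out = subprocess.check_output(
        ["mdadm", "--examine", "--export", devpath],
        universal_newlines=True,
    )
    return parse_export(out)


# Illustrative sample only; real output depends on the array.
sample = "MD_LEVEL=raid0\nMD_DEVICES=2\n"
parsed = parse_export(sample)
```

In practice examine_member would be called only for devices whose udev properties identify them as raid members, for example examine_member('/dev/vda3').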

For clear-holders, curtin will also accept type: partition. I expect the final config to set wipe: superblock on vda3, since it's a raid member, and clear-holders to be called with devices=['/dev/vda3']:

  - device: disk-vda
    size: 20947402752
    flag: linux
    wipe: superblock
    type: partition
    id: partition-vda3

From there, clear-holders would find /dev/md127 as a holder of type: raid, and the normal curtin shutdown plan would then show us stopping md127 and wiping each array member (/dev/vda3).
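The ordering described above (stop holders first, then wipe the member) can be sketched as follows. This is a simplified illustration with a hypothetical shutdown_plan function and a hand-written holders map; it is not curtin's clear-holders implementation.

```python
# Hypothetical sketch of the ordering only; not curtin's implementation.
def shutdown_plan(devices, holders):
    """Build an ordered list of (action, device) steps.

    'holders' maps a device path to the devices stacked on top of it.
    Every holder is stopped, topmost first, before its member is wiped.
    """
    plan = []
    stopped = set()

    def stop(dev):
        if dev in stopped:
            return
        stopped.add(dev)
        # Anything stacked on this holder has to go first.
        for holder in holders.get(dev, []):
            stop(holder)
        plan.append(("stop", dev))

    for dev in devices:
        for holder in holders.get(dev, []):
            stop(holder)
        plan.append(("wipe", dev))
    return plan


plan = shutdown_plan(["/dev/vda3"], {"/dev/vda3": ["/dev/md127"]})
```

For the case in this bug, the plan comes out as stopping /dev/md127 followed by wiping /dev/vda3.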
