subiquity crashes upon reusing failed to assemble raid member partition

Bug #1835091 reported by Dimitri John Ledkov on 2019-07-02
Affects:
  curtin (Importance: High, Assigned to: Unassigned)
  subiquity (Importance: Medium, Assigned to: Unassigned)
  probert (Ubuntu) (Importance: High, Assigned to: Unassigned)

Bug Description

subiquity crashes upon reusing failed to assemble raid member partition

Following up from the previous bug #1835087, I removed the second drive, such that I only had:
- a grub partition
- a /boot ext4 partition
- just half of a raid0 member as a partition

That raid0 member got added to the failed-to-start md127 raid0, but otherwise failed to assemble into a functioning raid.

Upon reusing that partition for an ext4 /, mke2fs failed, as vda3 is "in use" by mdadm.

Somehow, the partial raid needs to be represented. Or we should try harder: remove the device from the raid, wipe the raid signatures, then run mke2fs.
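The "try harder" sequence suggested here can be sketched as an ordered command plan: stop the partially-assembled array, zero the mdadm superblock on the member, wipe any remaining signatures, then format. The device names come from this report, but the helper itself is only illustrative (it builds the command list rather than running anything), not curtin's actual code:

```python
# Hypothetical sketch of the manual cleanup path suggested in the report.
# It returns the shell commands, in order, needed to free a stale raid
# member partition for reuse; it deliberately does not execute them.

def cleanup_commands(array, member, fstype="ext4"):
    """Ordered commands to release a stale raid member and reformat it."""
    return [
        ["mdadm", "--stop", array],              # release the member from the array
        ["mdadm", "--zero-superblock", member],  # drop the raid metadata
        ["wipefs", "--all", member],             # clear any other signatures
        ["mkfs." + fstype, member],              # now mke2fs can succeed
    ]

print(cleanup_commands("/dev/md127", "/dev/vda3"))
```

The ordering matters: zeroing the superblock fails with "device busy" while the array still claims the member, which is exactly the mke2fs failure described above.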

Attaching screenshots.


Dimitri John Ledkov (xnox) wrote :

The partition marked as unused is in fact in use by md127, which failed to start.

Ryan Harper (raharper) wrote :

Interesting. Let's look at the JSON and see if we can figure out a way to indicate we have a partial raid, and either exclude partial raid members or include the raid in the config with partial members so that members of the raid are forced to be wiped.

Dimitri John Ledkov (xnox) wrote :

Also the "wipe & do full disk install" path crashes too, as it tries an exclusive open of /dev/vda and fails because the disk is part of a raid.

I guess I can destroy the raid by hand and unbreak the setup, but ew.

If a random person picks up an unused disk off the shelf, they should be able to install onto it, irrespective of whether it used to be a partial raid member or not.

Similarly with lvm2 / btrfs / zfs.

Dimitri John Ledkov (xnox) wrote :

probe-data.json indicates that it detected that vda3 is a raid-member, but the merged storage config does not mention vda3 at all =(

Dimitri John Ledkov (xnox) wrote :

So I ran mdadm --manage --stop /dev/md127 to stop the raid and continued with reusing existing partitions....

.... however curtin was helpful enough to assemble md0 back again, even though we didn't ask for that to happen.

Ryan Harper (raharper) wrote :

> If a random person picks up an unused disk off the shelf, they should be able to install onto it,
> irrespective of whether it used to be a partial raid member or not.
>
> Similarly with lvm2 / btrfs / zfs.

We're not in disagreement. We're figuring out how best to communicate between curtin block-discover and subiquity.

The only answer to your request is to *wipe* the underlying partition or device.

> So i did mdadm --manage --stop /dev/md127 to stop the raid and continue with reuse existing partitions....
>
>.... however curtin was helpful enough to assemble md0 back again, even though we didn't ask for that to happen.

Curtin needs to "awaken" any possible block layer so that it can remove/wipe/clean the data, so that when you boot into the target you don't have a surprise md127 that starts recovering. In this case, subiquity doesn't yet have enough information from the curtin discover data to know that it cannot use preserve on a partition that's a raid member.

This is all a bit messy. I don't see a clean way, in the model we currently use, to indicate "this partition|disk is a raid member but we don't know where the rest of the raid is". That said, I'm afraid I'm inclined to think of this as a curtin bug too. From the journal.txt in the tarball xnox attached:

Jul 02 16:31:38 ubuntu-server curtin_log.1573[1727]: Current device storage tree:
Jul 02 16:31:38 ubuntu-server curtin_log.1573[1727]:
Jul 02 16:31:38 ubuntu-server curtin_log.1573[1727]: Shutdown Plan:
Jul 02 16:31:38 ubuntu-server curtin_log.1573[1727]:
Jul 02 16:31:38 ubuntu-server curtin_log.1573[1727]: finish: cmd-install/stage-partitioning/builtin/cmd-block-meta/clear-holders: SUCCESS:

This doesn't seem right? There should be some kind of shutdown plan for half-a-RAID?

Ryan Harper (raharper) wrote :

The shutdown plan requires that the config for a device being cleared not include preserve: true. The "raid" partition, vda3, is explicitly marked preserve: true:

  - device: disk-vda
    size: 20947402752
    flag: linux
    preserve: true
    type: partition
    id: partition-vda3

Curtin would not be able to clear the raid metadata from this partition unless wipe is set and preserve is absent.

That said, I do think that curtin can do a few things to resolve this:

1) include partial raids in the discovered config, and can either
   a) add a field to indicate whether the array is healthy/partial/degraded; array_state, maybe
   b) defer to subiquity to use curtin.block.mdadm.md_check to determine if it wants to include or mark members of the array with wipe so they can be used in other configs.

2) Update how we run clear-holders. Currently we only pass in a list of block devices of type disk which have 'wipe' set and do not have 'preserve' enabled. This fails in the case here, where we'd like to wipe vda3 but it has a holder.
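One way option 1a could be implemented is by reading /proc/mdstat, where the half-assembled md127 from this report shows up as "inactive". The following parser and the sample mdstat content are illustrative assumptions, not curtin's or probert's actual code:

```python
# Hypothetical sketch for deriving an array_state-like value: parse
# /proc/mdstat and report each md array's state word ("active"/"inactive").
# An inactive array is exactly the partial raid case described in this bug.

def mdstat_states(text):
    """Map array name -> state word from /proc/mdstat content."""
    states = {}
    for line in text.splitlines():
        parts = line.split()
        # array lines look like: "md127 : inactive vda3[0](S)"
        if len(parts) >= 3 and parts[1] == ":" and parts[0].startswith("md"):
            states[parts[0]] = parts[2]
    return states

# Sample content modeled on what a failed-to-assemble raid0 member produces.
sample = """\
Personalities : [raid0]
md127 : inactive vda3[0](S)
      10473472 blocks super 1.2
unused devices: <none>
"""
print(mdstat_states(sample))  # → {'md127': 'inactive'}
```

A caller could then mark any array whose state is not "active" with something like array_state: failed in the discovered config.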

Concretely, I'd expect curtin discover to return:

    - type: raid
      level: 0
      name: md127
      devices:
      - partition-vda3
      spare_devices: []
      array_state: failed

Now, to do that, probert will need to include partial raids as well. Not sure; it's odd that pyudev didn't have a /dev/md127 entry in the context.

The alternative for probert is to run some mdadm commands on the devices which have ID_FS_TYPE set to a raid member type. I'll add a probert task to this bug as well.
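The probert-side check described here boils down to filtering block devices by their udev filesystem-type property (linux_raid_member is the value udev reports for mdadm members) and handing the matches to mdadm for examination. The dicts below stand in for pyudev device properties and are illustrative only:

```python
# Hypothetical sketch of selecting candidate raid members from udev
# properties, as probert might before running mdadm --examine on each.
# The `devices` mapping is a stand-in for pyudev's per-device properties.

def raid_member_devices(devices):
    """Return device names whose ID_FS_TYPE marks them as mdadm members."""
    return [name for name, props in sorted(devices.items())
            if props.get("ID_FS_TYPE") == "linux_raid_member"]

devices = {
    "/dev/vda2": {"ID_FS_TYPE": "ext4"},
    "/dev/vda3": {"ID_FS_TYPE": "linux_raid_member"},
}
print(raid_member_devices(devices))  # → ['/dev/vda3']
```

This catches vda3 even when no assembled /dev/md127 node exists, which is the gap noted above where pyudev had no md127 entry in its context.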

For clear-holders, curtin will also accept type: partition. I expect the final config to set wipe: superblock on vda3, since it's a raid member, and clear-holders to be called with devices=['/dev/vda3']:

  - device: disk-vda
    size: 20947402752
    flag: linux
    wipe: superblock
    type: partition
    id: partition-vda3

From there, clear-holders would find that /dev/vda3 has a holder of type raid (/dev/md127), and the normal curtin shutdown plan would then show us stopping md127 and wiping each array member (/dev/vda3).
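The ordering the shutdown plan has to produce can be sketched as a small pure function: stop each holder before wiping its members, so the wipe doesn't hit a busy device. The holders mapping and step names are assumptions standing in for curtin's clear-holders logic, not its real API:

```python
# Illustrative sketch of the shutdown ordering described above.  holders
# maps a holder device (e.g. an md array) to the member devices beneath it.

def shutdown_plan(holders):
    """Return ordered (action, device) steps: stop each holder,
    then wipe its members."""
    plan = []
    for dev, members in holders.items():
        plan.append(("stop", dev))        # e.g. mdadm --stop /dev/md127
        for m in members:
            plan.append(("wipe", m))      # e.g. wipe superblock on /dev/vda3
    return plan

print(shutdown_plan({"/dev/md127": ["/dev/vda3"]}))
# → [('stop', '/dev/md127'), ('wipe', '/dev/vda3')]
```

For this bug's setup, the plan is exactly the stop-md127-then-wipe-vda3 sequence that the empty "Shutdown Plan:" in journal.txt failed to produce.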

Changed in probert (Ubuntu):
importance: Undecided → High
status: New → Triaged
Changed in curtin:
importance: Undecided → High
status: New → Confirmed

I fixed subiquity to wipe disks harder, which might well have fixed this. Will check today.

So, no, neither case here is fixed in the latest subiquity :/ I'll try to make vmtest test cases for curtin.

So https://code.launchpad.net/~mwhudson/curtin/+git/curtin/+merge/369918 now has matching test cases. Will attach failures from the tests (warning, contains sparse disk images) and from my testing in KVM.

Changed in subiquity:
status: New → Triaged
importance: Undecided → Medium
tags: added: reuse
tags: added: id-5d40f920ea9865754db787bb

This bug is fixed with commit 7a22938d to curtin on branch master.
To view that commit see the following URL:
https://git.launchpad.net/curtin/commit/?id=7a22938d

Changed in curtin:
status: Confirmed → Fix Committed

This bug is believed to be fixed in curtin in version 19.3. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in curtin:
status: Fix Committed → Fix Released