Comment 49 for bug 557429

Phillip Susi (psusi) wrote : Re: [Bug 557429] --incremental should not auto-remove arbitrary segments with conflicting changes

On 4/20/2010 3:21 PM, ceg wrote:
>> If I plug in one disk and make some changes, then unplug it,
>> plug in the other disk, and make some changes to it,
>
> What would be your use-case?

I don't understand this question. The use case is described in the text
you replied to.

> In most cases the next thing one would probably want
> after conflicting changes are present in a system is to sync, in an
> easy way. (Not to keep rebooting or reattaching much. Reattaching is
> just a simple way to determine the order.)
>
> Your case does not sound like a hot-plug use-case. Probably handle
> that with --remove?

Handle what?

> No, you must prevent data-corruption or loss. But don't do things like

Of course. The question is HOW?

> --remove(ing) parts or fixing ordering in a hotplug environment
> (and mdadm --incremental is just for that), because it would break
> further management of the raid devices in a hot-plugging manner.

This is the HOW part. The removing does not break anything. It
prevents you from continuing to flip flop which disk you are using after
they have been forked, and thus making things worse.

> But your comments are a little irritating. We are actually talking
> hot-plugging here, right? Plus ubuntu's no config, no intervention
> necessary approach. Everything should just work.

Until everything goes all pear shaped, at which point "doing the right
thing" is not clear, so manual intervention is required. Once the array
has been forked, the best thing you can do is not make things any worse.
Fixing it has to be done by hand.

> Are you actually aware what that means? I am not saying it is not
> possible to create a new array from parts of an existing array without
> losing the data, but it sure isn't a trivial mdadm command. And then
> you are really breaking up the array and won't be able to just sync the
> other parts and still have the same (UUID) array.

The array is already broken up. Resyncing will destroy data. If you
want to rescue that data you must move the other disk to its own array
so you can mount it. After you have rescued any data, then you can drop
it back into the original array and it will sync.
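
To make the data-loss point concrete, here is a toy sketch in Python. It assumes each leg of the mirror can be modeled as a dict of block number to contents; the `resync` helper and the block contents are hypothetical and have nothing to do with how md actually stores or copies data. It only shows that a resync throws away whatever the target leg had that the source did not, which is why the data has to be rescued first.

```python
# Toy model: each disk is a dict mapping block number -> contents.
# Illustration only; this is not how md stores or compares data.

def resync(source, target):
    """Overwrite target with source; return the target blocks that were replaced or dropped."""
    lost = {blk: data for blk, data in target.items() if source.get(blk) != data}
    target.clear()
    target.update(source)
    return lost


# Both legs start identical, then each is modified while the other is unplugged.
disk_a = {0: "boot", 1: "home-v1"}
disk_b = dict(disk_a)
disk_a[1] = "home-edited-on-a"
disk_b[2] = "photos-added-on-b"

# Stand-in for mounting disk_b on its own array and copying the data off.
rescued = dict(disk_b)

# disk_a wins the resync; disk_b becomes an exact copy of it.
lost = resync(disk_a, disk_b)
print(lost)
```

After the resync, `disk_b` is identical to `disk_a`, and the block that was only ever written to `disk_b` survives solely in `rescued`.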

> Yes, yes and yes again, this needs to be done in *any* case of
> conflicting changes. If mdadm --incremental (the mdadm hotplug manager)
> sets up the conflicting parts on separate md devices they will both
> even appear on the desktop.

Sure, automatically splitting the array would be a nice feature, but the
minimum action required to fix the bug is to simply reject the second
disk, updating its metadata in the process.

> No, it really makes things worse! It prevents the user/admin from
> managing arrays (parts in this case) by simply plugging disks.

No it does not. What it does is prevent the damage from growing worse
without being noticed.

> And what would be the gain of auto-removing and writing metadata? If the
> disks are connected during boot the disks will almost always stay in
> the same order anyway, eliminating the gain to save that order
> to metadata. If you want a specific order from the start, you need
> to manually issue mdadm commands anyway. But now also if you need
> another order than what was written to metadata. And all those mdadm
> commands need to be issued in between an active hot-plugging
> system (interference/no map file updating), instead of just re-plugging
> your disks in order.

As I already said, the gain is to prevent continued flip-flopping back
and forth between the two divergent filesystems based only on which disk
is detected first. Almost always != always.
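
The flip-flopping can be sketched as a small simulation. This is a simplified model under stated assumptions: the disk names, the `pick_active` and `simulate` helpers, and the `rejected` set (standing in for a marker written to the rejected disk's metadata) are all hypothetical, and real detection order and metadata handling are more involved.

```python
def pick_active(detect_order, rejected):
    """Return the first detected disk that is not marked as rejected."""
    for disk in detect_order:
        if disk not in rejected:
            return disk
    return None


def simulate(boots, persist_rejection):
    """Return which disk ends up active on each boot."""
    rejected = set()  # stand-in for a marker written to the rejected disk's metadata
    history = []
    for order in boots:
        active = pick_active(order, rejected)
        history.append(active)
        if persist_rejection:
            # Record that every other leg was turned away, so it stays out next time.
            rejected.update(d for d in order if d != active)
    return history


# The second boot happens to detect the disks in the opposite order.
boots = [["sda", "sdb"], ["sdb", "sda"], ["sda", "sdb"]]
print(simulate(boots, persist_rejection=False))  # ['sda', 'sdb', 'sda']
print(simulate(boots, persist_rejection=True))   # ['sda', 'sda', 'sda']
```

Without the recorded rejection, the active copy follows whatever the detection order happens to be on each boot. With it, one odd boot order is not enough to switch to the other fork.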

> It's especially worse if the order in the metadata written does not
> conform with the sync direction you want and you are required to
> --zero-superblock, setup a new array making sure not to lose
> the data from the arbitrary --removed part etc. Because after removing
> the raid superblock blkid will report the partition to contain the
> filesystem with the UUID that the md device is containing. And this
> can cause an unsync that is not preventable by mdadm anymore, when the
> fs on the partition gets mounted instead of the one on the right md
> device!

You don't need to --zero-superblock to migrate the rejected disk to a
new array so you can mount it. You seem to be suggesting that the user
physically disconnect one disk if they wish to access data on the other
disk, rather than run mdadm. This does not sound like a good idea.

> Working with degraded arrays is not uncommon. The standard and
> documented procedure to convert a non-raid system into a raid system is
> to copy&modify the system into degraded arrays first and to sync
> afterward as desired if everything went well.

This scenario does not really have anything to do with this discussion
since you don't have both legs of the mirror being independently
activated and modified.

> And a nice and analog way to dist-upgrade systems while still being able
> to quickly revert back is to detach a mirror disk from the system arrays
> (as a backup/snapshot) prior to doing the dist-upgrade. If you want to
> revert, then you just need to boot with only the previously detached
> disk attached and plug and sync the other (already --run degraded)
> drive later.

And right now, doing this can corrupt your filesystem. If it works the
way I have suggested then if you decide to keep the new system, when you
reconnect the second disk it will be accepted and synced. If you decide
to go back to the old system, then you will need to run mdadm to throw
out the changes on the second disk and put it back in the array. Doing
this automatically is not sane because the array may have been forked by
accident, and the resync destroys data. Never destroy data automatically.

> Summary to support safe hot-pluggable segmentation of arrays:
> (arrays are only --run degraded manually or if required and incomplete
> during boot)
>
> * --incremental should stop auto re-adding "removed" members (so
> that --remove provides a manual means to turn hot-plugging off)
> * When arrays are --run degraded missing members should be marked
> "failed" but not "removed".
> * Always check for conflicting "failed" states in
> superblocks, to detect conflicting changes.
> + always report (console and --monitor event) if conflicting changes
> are detected
> + require --force with --add for a manual re-sync of conflicting
> changes (unlike with resyncing an outdated device, in this case
> changes will get lost)
> * To facilitate inspection --incremental should assemble array
> components with conflicting changes into auxiliary devices with
> mangled UUIDs (safe and easy diffing, merging, etc. even on desktop
> level)

Yes, that pretty much sums it up.
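
The policy in the summary above can be sketched in Python. This is a minimal model, not mdadm's code or its superblock format: `Superblock`, `failed_peers`, `classify` and `add_member` are hypothetical names, and the only assumption is that each member records which peers it has marked "failed" while running degraded.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    IN_SYNC = "in sync"
    OUTDATED = "outdated"
    CONFLICT = "conflict"


@dataclass
class Superblock:
    """Hypothetical, simplified view of one member's metadata."""
    name: str
    failed_peers: set = field(default_factory=set)  # peers this member marked "failed"


def classify(active, newcomer):
    """Compare the running member's view with the view of a disk being plugged in."""
    if active.name in newcomer.failed_peers:
        # The newcomer ran degraded on its own, so it holds changes the
        # active side has never seen. Treated conservatively as a conflict.
        return Status.CONFLICT
    if newcomer.name in active.failed_peers:
        # Only the active side moved on; the newcomer is merely stale.
        return Status.OUTDATED
    return Status.IN_SYNC


def add_member(active, newcomer, force=False):
    """Return the action to take, refusing a destructive resync unless forced."""
    status = classify(active, newcomer)
    if status is Status.CONFLICT:
        print(f"conflicting changes detected: {active.name} vs {newcomer.name}")
        if not force:
            raise PermissionError("conflicting changes; re-adding requires force")
    return "join" if status is Status.IN_SYNC else "resync"
```

For example, `add_member(Superblock("sda", {"sdb"}), Superblock("sdb"))` returns `"resync"` because the second disk is only outdated, while the same call with `Superblock("sdb", {"sda"})` raises unless `force=True` is passed, since that resync would discard changes.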