[FATAL] mdadm --grow adds dirty disk to RAID1 without recovery

Bug #1801555 reported by xor
This bug affects 2 people
Affects: mdadm (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

On Kubuntu 18.04.1 it is possible to make a (non-bitmap!) RAID1 accept an out-of-sync disk without recovery, as if it were in sync.
"$ cat /proc/mdstat" immediately shows the dirty disk as "U" (= up) right after it is added, WITHOUT a resync.

I was able to reproduce this twice.

This means arbitrary filesystem corruption can happen, because the RAID1 then contains two mixed filesystem states.
RAID1 balances reads across all disks, so the state of either disk will be returned at random.
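
A way to quantify the divergence at the md layer is to run md's own consistency check and read the mismatch counter afterwards (a sketch; md0 is an illustrative device, and the sysfs paths assume the in-kernel md driver):

# Trigger a comparison of the RAID1 members without repairing anything.
$ echo check > /sys/block/md0/md/sync_action
# Wait until the check finishes (sync_action reads "idle" again), then:
$ cat /sys/block/md0/md/mismatch_cnt
# A non-zero count means the members hold different data
# even though mdstat reports the array as clean.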

Steps to reproduce:

1. Install via network installer, create the following partition layout manually:
{sda1, sdb1} -> md RAID1 -> btrfs -> /boot
{sda2, sdb2} -> md RAID1 -> dm-crypt -> btrfs -> /
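
For reference, the same stack built by hand looks roughly like this (a sketch only; device names, metadata defaults and cryptsetup options are illustrative, not what the installer actually runs):

# RAID1 for /boot
$ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
$ mkfs.btrfs /dev/md0
# RAID1 + dm-crypt + btrfs for /
$ mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
$ cryptsetup luksFormat /dev/md1
$ cryptsetup open /dev/md1 md1_crypt
$ mkfs.btrfs /dev/mapper/md1_crypt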

2. After the system is installed, ensure the RAID arrays have no bitmap. I won't provide instructions for this; my 16 GiB disks were apparently small enough that no bitmap was created. Check "$ cat /proc/mdstat" to confirm there is no bitmap (a sketch for inspecting and removing one follows below).
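
If an array does have a bitmap, it can be inspected and removed like this (a sketch; md0 is an example, and --grow with --bitmap=none removes an internal write-intent bitmap):

# Show whether a write-intent bitmap is configured.
$ mdadm --detail /dev/md0 | grep -i bitmap
# Remove an internal bitmap to match the non-bitmap setup used here.
$ mdadm --grow --bitmap=none /dev/md0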

3. Boot with sdb physically disconnected. Boot will now hang at "Begin: Waiting for encrypted source device ...". That will time out after a few minutes and drop to an initramfs shell, complaining that the disk doesn't exist. This is a separate bug, filed as #1196693.
To make it bootable again, do the following workaround in the initramfs shell:
$ mdadm --run /dev/md0
$ mdadm --run /dev/md1
# Reduce the size of the array to stop initramfs-tools from waiting for sdb forever.
$ mdadm --grow -n 1 --force /dev/md0
$ mdadm --grow -n 1 --force /dev/md1
$ reboot

After "$ reboot", boot up the system fully with sdb still disconnected.
Now the state of the two disks should be out of sync - booting surely produces at least one write.
Reboot and apply the same procedure to sdb, with sda disconnected.
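
Before step 4 you can confirm that the two halves really have diverged by comparing the event counts recorded in their superblocks (a sketch; run it once both disks are attached again, before re-adding anything):

# Differing event counts mean the members no longer describe the same data.
$ mdadm --examine /dev/sda1 /dev/sdb1 | grep -E '^/dev|Events'
$ mdadm --examine /dev/sda2 /dev/sdb2 | grep -E '^/dev|Events'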

4. Boot from one of the disks and do this:
$ mdadm /dev/md0 --add /dev/sdb1
$ mdadm /dev/md1 --add /dev/sdb2
# The sdb partitions should now be listed as (S), i.e. spare
$ cat /proc/mdstat
# Grow the array to use up the spares
$ mdadm /dev/md0 --grow -n 2
$ mdadm /dev/md1 --grow -n 2
# Now the bug shows itself: mdstat will immediately report the array as in sync:
$ cat /proc/mdstat
# And the kernel log will show that a recovery was started
# - BUT completed in less than a second:
$ dmesg
[144.255918] md: recovery of RAID array md0
[144.256176] md: md0: recovery done
[151.776281] md: recovery of RAID array md1
[151.776667] md: md1: recovery done

Notice: I'm not sure whether this is a bug in mdadm or the kernel. I'm filing this as an mdadm bug for now; if you figure out that it is a kernel bug, please re-assign.
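
A possible way to avoid the silent no-op recovery when re-adding the stale disk (a sketch, not verified against this exact scenario) is to wipe its md superblock first, so it is treated as a brand-new member and gets a full resync:

# WARNING: only zero the superblock of the disk whose contents you want to discard.
$ mdadm --zero-superblock /dev/sdb1
$ mdadm --zero-superblock /dev/sdb2
$ mdadm /dev/md0 --add /dev/sdb1
$ mdadm /dev/md1 --add /dev/sdb2
$ mdadm /dev/md0 --grow -n 2
$ mdadm /dev/md1 --grow -n 2
# /proc/mdstat should now show an actual recovery in progress.
$ cat /proc/mdstat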

Revision history for this message
xor (xor) wrote :

I have reproduced this a third time and was also able to confirm that the md device returns out-of-sync data:

# Tell btrfs to check checksums of files and metadata.
$ btrfs scrub start -B /dev/mapper/md1_crypt

# It reported:
# - 106 errors
# - 2 correctable, 104 uncorrectable
# - 98 errors were checksum errors.
# If you check /var/log/kern.log you'll see all the spam about checksum errors.
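
To see the corruption from the log side as well, the btrfs checksum failures can be filtered out of the kernel log (a sketch; the exact message wording can vary between kernel versions):

# btrfs logs each detected mismatch as a "csum failed" message.
$ grep -i 'csum failed' /var/log/kern.log
# or via the journal:
$ journalctl -k | grep -i 'csum failed'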

What seems critical for reproducing it is applying step 3 to both disks, as I originally described. FYI: by "Reboot" at the end of step 3 I meant "shut down".
And you don't have to reboot at the start of step 4; it's enough to keep the already-booted session of the second disk from step 3.

Revision history for this message
xor (xor) wrote :

Package versions:

linux-(image|modules|modules-extra)-4.15.0-38-generic: version 4.15.0-38.41
mdadm: version 4.1~rc1-3~ubuntu18.04.1

If you need the versions of other stuff just ask.
In turn, please refrain from asking me to run the tool that uploads tons of data to Launchpad if possible; I would very much not enjoy having to read through the hundreds of pages of data it produces to check for private information to remove.

Revision history for this message
Cristian Aravena Romero (caravena) wrote :

https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices

Revision history for this message
xor (xor) wrote :

@caravena wrote:
> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices

This bug is NOT about btrfs!

The bug happens *below* btrfs, in the md RAID layer!
I merely used btrfs for its checksum capabilities so that I could prove the corruption happens in the md layer.

Sorry if I hadn't made that clear enough.

Besides: btrfs' own RAID is not a suitable replacement for the md RAID layer because it does not support full disk encryption yet.

Revision history for this message
xor (xor) wrote :

Confirmed on Kubuntu 18.10 amd64 as well.

Package versions:
linux-(headers|image|modules|modules-extra)-4.18.0-12-generic: version 4.18.0-12.13
mdadm: 4.1~rc1-4ubuntu1

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mdadm (Ubuntu):
status: New → Confirmed
Revision history for this message
gertigfi (gertigfi) wrote :

Also happens on Kubuntu 19.10!

uname -a: Linux 5.3.0-51-generic
mdadm --version: v4.1 - 2018-10-01
