Fails to install on fully degraded RAID10

Bug #497039 reported by Loïc Minier
This bug affects 1 person
Affects: grub2 (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: grub2

Hi,

I was reinstalling Ubuntu on LVM over mdadm RAID10. The install failed because the array was treated as too degraded (missing 2 disks out of 4).

My full device config on the previous install was:
/boot on /dev/md0 == RAID1(sda1 + sdb1)
/ on LVM LV "root" in VG vg0 backed by single PV /dev/md1 == RAID1(sda1 + sdb1)

sda and sdb were partitioned using regular PC partitions

New install was towards:
/ on LVM LV "root" in VG fox-raid10 backed by single PV /dev/md2 == RAID10(sdc2 + missing + sdd2 + missing)

sdc and sdd were partitioned using GPT with a bios_grub partition at the start for core.img

I'm not sure this is the first error I hit, but while debugging why grub-install was failing, I got errors from grub-probe which I could reproduce with:
grub-probe --device-map=/boot/grub/device.map --target=fs --device /dev/mapper/fox--raid10-root
It failed with "error: unknown filesystem".

I rebuilt grub2 from bzr tip (/grub/trunk/grub r1938) and reproduced the error. Poking around in gdb, I saw that md2 was never iterated. Looking at grub_raid_iterate(), I found that the array was skipped because is_array_readable() returns false.

is_array_readable() returns false because this test does not pass:
  if (array->nr_devs >= array->total_devs - n)
    return 1;
In my case, array->nr_devs is 2 and array->total_devs is 4. With RAID10 and my layout of n2 (the current default for RAID10), n equals 1, so the test evaluates 2 >= 3 and fails.

So it seems GRUB2 considers the array broken, but it is in fact a working degraded config: I lost one member of each mirrored pair, so every pair still has one copy of the data.
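
To make the arithmetic concrete, here is a minimal sketch of a count-only readability test like the one quoted above. The struct and the function name readable_by_count are hypothetical simplifications for illustration, not GRUB's actual definitions:

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the array state.
 * This is NOT GRUB's real struct, only the two counters the test uses. */
struct array_state {
    int total_devs;  /* members the array was created with */
    int nr_devs;     /* members actually found */
};

/* Count-only test: the array is considered readable if at most
 * `tolerated` members are missing, no matter which slots they are. */
bool readable_by_count(const struct array_state *a, int tolerated)
{
    return a->nr_devs >= a->total_devs - tolerated;
}
```

With total_devs = 4, nr_devs = 2 and tolerated = 1, readable_by_count() returns false, which matches the behaviour described above. The test only looks at how many members are missing, not at which slots they occupy.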

Thanks,

ProblemType: Bug
Architecture: amd64
Date: Tue Dec 15 17:29:40 2009
DistroRelease: Ubuntu 9.10
LiveMediaBuild: Ubuntu-Server 9.10 "Karmic Koala" - Release amd64 (20091027.2)
NonfreeKernelModules: nls_iso8859_1 hfsplus hfs raid10 raid1 xfs exportfs reiserfs jfs ntfs vfat fat intel_agp atl1 mii nls_cp437 isofs fbcon tileblit font bitblit softcursor vga16fb vgastate vesafb usb_storage usbhid floppy ohci1394 ieee1394
Package: grub-pc 1.97~beta4-1ubuntu4.1
ProcEnviron:
 PATH=(custom, no user)
 SHELL=/bin/sh
ProcVersionSignature: Ubuntu 2.6.31-14.48-generic
SourcePackage: grub2
Uname: Linux 2.6.31-14-generic x86_64

Loïc Minier (lool) wrote :

The array was created with:
mdadm --create /dev/md2 --homehost=fox --raid-devices=4 --level=10 --layout=n2 --name=raid10 /dev/sdc2 missing /dev/sdd2 missing

Loïc Minier (lool) wrote :

Result of grub-probe --verbose --verbose run with same args

Loïc Minier (lool) wrote :

As pointed out by cjwatson, mdadm's util.c gets this right:
int enough(int level, int raid_disks, int layout, int clean,
           char *avail, int avail_disks)
{
        int copies, first;
        switch (level) {
        case 10:
                /* This is the tricky one - we need to check
                 * which actual disks are present.
                 */
                copies = (layout&255)* ((layout>>8) & 255);
                first=0;
                do {
                        /* there must be one of the 'copies' form 'first' */
                        int n = copies;
                        int cnt=0;
                        while (n--) {
                                if (avail[first])
                                        cnt++;
                                first = (first+1) % raid_disks;
                        }
                        if (cnt == 0)
                                return 0;

                } while (first != 0);
                return 1;
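
For anyone who wants to try the RAID10 branch on its own, the following is a standalone sketch of the same position-aware idea, applied to the slot order from the mdadm --create command in this report. The function name raid10_enough, the LAYOUT_N2 macro and the input guard are additions for illustration. The value 0x102 assumes md's layout encoding (near copies in the low byte, far copies in the next byte):

```c
#include <stdbool.h>

/* Assumed md encoding of the "n2" layout: 2 near copies, 1 far copy. */
#define LAYOUT_N2 0x102

/* Position-aware check modelled on the RAID10 case quoted above:
 * walk the slots in groups of `copies` and require at least one
 * available member in every group. avail[i] is nonzero if slot i
 * is present. */
bool raid10_enough(int raid_disks, int layout, const char *avail)
{
    int copies = (layout & 255) * ((layout >> 8) & 255);
    int first = 0;

    /* Guard added for this sketch: reject nonsensical input. */
    if (raid_disks <= 0 || copies <= 0)
        return false;

    do {
        int cnt = 0;
        for (int n = 0; n < copies; n++) {
            if (avail[first])
                cnt++;
            first = (first + 1) % raid_disks;
        }
        if (cnt == 0)
            return false;  /* every copy of this group is gone */
    } while (first != 0);

    return true;
}
```

With four slots and avail = {1, 0, 1, 0} (sdc2, missing, sdd2, missing), raid10_enough(4, LAYOUT_N2, avail) returns true. With avail = {0, 0, 1, 1} it returns false, because both copies in the first group are missing.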

Andrew Cranwell (andrew-cranwell) wrote :

Is this actually a bug with grub, a bug with mdadm or an intended design?

Marcus Tomlinson (marcustomlinson) wrote :

This release of Ubuntu is no longer receiving maintenance updates. If this is still an issue on a maintained version of Ubuntu please let us know.

Changed in grub2 (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for grub2 (Ubuntu) because there has been no activity for 60 days.]

Changed in grub2 (Ubuntu):
status: Incomplete → Expired