mdadm 0.9 metadata at the end of the disk confuses grub

Reported by Fabian Zeindl on 2012-05-14
This bug affects 8 people
Affects: grub (Status: Unknown, Importance: Unknown)
Affects: grub2 (Ubuntu) (Importance: High, Assigned to: Unassigned)

Bug Description

When using mdadm metadata format 0.9 with a partition that ends at the end of the disk, grub cannot tell whether the partition or the whole disk is the RAID component, resulting in errors like:

error: found two disks with the index 0 for RAID md0.
error: found two disks with the index 3 for RAID md0.
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).

This can cause unexpected failure after upgrading from 10.04.
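
To make the ambiguity concrete, here is a minimal sketch of where a 0.90 superblock ends up. It assumes the placement rule as I understand it from the Linux md headers (device size rounded down to a 64 KiB boundary, minus one 64 KiB block); the function names are made up for illustration, and the sector numbers are taken from the fdisk output posted later in this report.

```python
# Sketch: where a 0.90 md superblock sits, in 512-byte sectors.
# Assumption: placement follows the rule from the Linux md headers
# (size rounded down to 64 KiB, then one 64 KiB block before that point).

RESERVED_SECTORS = 128  # 64 KiB / 512 bytes


def superblock_offset(size_sectors: int) -> int:
    """Sector offset of the 0.90 superblock inside a device of this size."""
    return (size_sectors & ~(RESERVED_SECTORS - 1)) - RESERVED_SECTORS


def partition_superblock_sector(start: int, end: int) -> int:
    """Absolute disk sector of the superblock for partition [start, end]."""
    return start + superblock_offset(end - start + 1)


DISK_SECTORS = 976773168  # "total 976773168 sectors" for every disk here
whole_disk = superblock_offset(DISK_SECTORS)
sdc1 = partition_superblock_sector(2048, 976773167)  # sdc1 before resizing

# Same sector either way, so a scanner cannot tell disk from partition.
print(whole_disk, sdc1, whole_disk == sdc1)
```

Under that assumption, both lookups land on sector 976772992, which would explain why the same member shows up twice.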

Phillip Susi (psusi) wrote :

Can you please post the output of "sudo mdadm -D /dev/md?" and "sudo fdisk -lu"?

Changed in grub2 (Ubuntu):
status: New → Incomplete
Fabian Zeindl (fabian-xover) wrote :

/dev/md0:
        Version : 0.90
  Creation Time : Sat Aug 28 16:19:54 2010
     Raid Level : raid5
     Array Size : 1465148736 (1397.27 GiB 1500.31 GB)
  Used Dev Size : 488382912 (465.76 GiB 500.10 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon May 14 18:12:08 2012
          State : active
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 9e41c020:85eb05dd:a3f83a12:f3b86535
         Events : 0.9459

    Number   Major   Minor   RaidDevice   State
       0       8      49         0        active sync   /dev/sdd1
       1       8      17         1        active sync   /dev/sdb1
       2       8      65         2        active sync   /dev/sde1
       3       8      33         3        active sync   /dev/sdc1

       4       8       1         -        spare         /dev/sda1

Disk /dev/sda: 500.1 GB, 500107862016 bytes
18 heads, 30 sectors/track, 1808839 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000bf7f

   Device Boot Start End Blocks Id System
/dev/sda1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0001391b

   Device Boot Start End Blocks Id System
/dev/sdb1 * 2048 976768064 488383008+ fd Linux RAID autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
81 heads, 63 sectors/track, 191411 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002f2f3

   Device Boot Start End Blocks Id System
/dev/sdc1 * 2048 976773167 488385560 fd Linux RAID autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
24 heads, 14 sectors/track, 2907063 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00037497

   Device Boot Start End Blocks Id System
/dev/sdd1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
81 heads, 63 sectors/track, 191411 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xe50d27b1

   Device Boot Start End Blocks Id System
/dev/sde1 * 2048 976773167 488385560 fd Linux RAID autodetect

Disk /dev/md0: 1500.3 GB, 1500312305664 bytes
2 heads, 4 sectors/track, 366287184 cylinders, total 2...

Fabian Zeindl (fabian-xover) wrote :

Right now I'm trying to resize the individual partitions to leave some space at the end of each disk, since I think my bug might be related to this one: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/776422

Phillip Susi (psusi) wrote :

That's what I was thinking. By the way, how did you find that bug?

Fabian Zeindl (fabian-xover) wrote :

By googling for a few hours; I can't reproduce my query, though. It may be worth mentioning that, since I don't want to rebuild my entire RAID, it is apparently OK for mdadm if you fail a disk, shrink the partition a bit and re-add it. There seems to be some margin for error; at least mdadm doesn't complain. I hope it doesn't break my RAID :/

Fabian Zeindl (fabian-xover) wrote :

*sigh* I resized the partitions on all my disks to be a bit smaller, but it still doesn't help. The message has changed, though; now it's only:

error: found two disks with the index 3 for RAID md0.

The index 0 message went away.

Any tips?

Fabian Zeindl (fabian-xover) wrote :

I'll post the output of

/usr/sbin/grub-probe --device-map="/boot/grub/device.map" --target=fs -v /boot/grub

I resized the partition to be from sector 2048 to 976772000 (of 976773168).

/usr/sbin/grub-probe: info: cannot open `/boot/grub/device.map'.
/usr/sbin/grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd0.
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd1.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd2.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd3.
/usr/sbin/grub-probe: info: the size of hd3 is 976773168.
/usr/sbin/grub-probe: info: the size of hd3 is 976773168.
/usr/sbin/grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd4.
/usr/sbin/grub-probe: info: the size of hd4 is 976773168.
/usr/sbin/grub-probe: info: the size of hd4 is 976773168.
/usr/sbin/grub-probe: info: scanning hd0 for LVM.
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: no LVM signature found.
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: scanning hd1 for LVM.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: no LVM signature found.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: scanning hd2 for LVM.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/grub-probe: info: no LVM signature found.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/grub-probe: info: scanning hd3 for LVM.
/usr/sbin/grub-probe: info: the size of hd3 is 976773168.
/usr/sbin/grub-probe: info: no LVM signature found.
/usr/sbin/grub-probe: info: the size of hd3 is 976773168.
/usr/sbin/grub-probe: info: scanning hd4 for LVM.
/usr/sbin/grub-probe: info: the size of hd4 is 976773168.
/usr/sbin/grub-probe: info: the size of hd4 is 976773168.
/usr/sbin/grub-probe: info: the size of hd4 is 976773168.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk lvm-data.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk lvm-swap.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk lvm-root.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk hd0.
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: Found array md0 (mdraid09).
/usr/sbin/grub-probe: info: the size of hd0 is 976773168.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk hd1.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: the size of hd1 is 976773168.
/usr/sbin/grub-probe: info: Scanning for mdraid09 RAID devices on disk hd2.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/grub-probe: info: the size of hd2 is 976773168.
/usr/sbin/gr...

Phillip Susi (psusi) wrote :

I think you left the original metadata in place so both sets are being recognized. Try using dd to zero the sectors from the end of the partition to the end of the drive.
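
A rough illustration of why the original metadata would survive a resize, under the same assumption about 0.90 placement (size rounded down to 64 KiB, minus 64 KiB). The helper name is hypothetical; the end sectors are the ones quoted in this report.

```python
# Sketch: why shrinking a partition can leave a stale 0.90 superblock behind.
# Assumption: 0.90 placement = size rounded down to 64 KiB, minus 64 KiB.

RESERVED_SECTORS = 128  # 64 KiB in 512-byte sectors


def superblock_sector(start: int, end: int) -> int:
    """Absolute sector of the 0.90 superblock for partition [start, end]."""
    size = end - start + 1
    return start + (size & ~(RESERVED_SECTORS - 1)) - RESERVED_SECTORS


OLD_END = 976773167  # sdc1 before the resize (first fdisk listing)
NEW_END = 976772000  # end sector after the resize, as stated in the report

old_sb = superblock_sector(2048, OLD_END)
new_sb = superblock_sector(2048, NEW_END)

# The old copy now lies past the end of the shrunken partition, so the
# array itself no longer touches it; only an explicit wipe removes it.
stale_copy_survives = old_sb > NEW_END

print(old_sb, new_sb, stale_copy_survives)
```

If that reasoning holds, re-adding the shrunken partition writes a fresh superblock at the new location while the old one stays in the unallocated tail, which is where a whole-disk scan would look.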

Fabian Zeindl (fabian-xover) wrote :

You, Sir, are a pleasure to work with.

I did

dd if=/dev/zero bs=512 seek=976772001 of=/dev/sdX on each disk.

Though I'm not sure whether seek should have been one less; do you know?
Anyway, grub-install now only shows:

sudo grub-install /dev/sda
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).
error: superfluous RAID member (4 found).
Installation finished. No error reported.

What does it mean? Is that bad?

Fabian Zeindl (fabian-xover) wrote :

Also: Does grub2 still need /boot/grub/device.map? The installer didn't create one.

Fabian Zeindl (fabian-xover) wrote :

Will this remove the "superfluous RAID member" warnings? Because they are the only ones left.

I'm asking because I'm logged into the machine remotely and can't stop the RAID to zero the superblock.

Phillip Susi (psusi) wrote :

device.map is deprecated, so no, you shouldn't have one. Whether the sector is correct depends on your new partition table (fdisk -lu). Those errors still don't sound good, but I'm not sure of the cause; maybe try adding the verbose/debug flags back in?

Phillip Susi (psusi) wrote :

If you shrank the partition enough and zeroed the free space following the partition, that should fix the errors. What does your current fdisk -lu show?

Fabian Zeindl (fabian-xover) wrote :

Hm. But at least it said "installation finished".

Disk /dev/sda: 500.1 GB, 500107862016 bytes
18 heads, 30 sectors/track, 1808839 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000bf7f

   Device Boot Start End Blocks Id System
/dev/sda1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
63 heads, 30 sectors/track, 516811 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0001391b

   Device Boot Start End Blocks Id System
/dev/sdb1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
27 heads, 30 sectors/track, 1205892 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002f2f3

   Device Boot Start End Blocks Id System
/dev/sdc1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
24 heads, 14 sectors/track, 2907063 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00037497

   Device Boot Start End Blocks Id System
/dev/sdd1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
27 heads, 30 sectors/track, 1205892 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xe50d27b1

   Device Boot Start End Blocks Id System
/dev/sde1 * 2048 976772000 488384976+ fd Linux RAID autodetect

Disk /dev/md0: 1500.3 GB, 1500312305664 bytes
2 heads, 4 sectors/track, 366287184 cylinders, total 2930297472 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 196608 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table
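
On the earlier question of whether seek should have been one less: fdisk's End column is inclusive, so with the listing above the first unallocated sector is End + 1. The sketch below works through that arithmetic; the 0.90 probe location is again an assumption about the 64 KiB placement rule, not something confirmed in this thread.

```python
# Sketch: the unallocated tail that the dd command has to cover.
SECTOR_BYTES = 512
DISK_SECTORS = 976773168   # "total 976773168 sectors" in the fdisk output
PART_END = 976772000       # last sector of each partition (inclusive)
RESERVED_SECTORS = 128     # assumption: 0.90 superblocks are 64 KiB aligned

first_free = PART_END + 1                  # first sector after the partition
tail_sectors = DISK_SECTORS - first_free   # sectors dd will write
tail_kib = tail_sectors * SECTOR_BYTES / 1024

# Where a whole-disk 0.90 probe would look (same assumption as above).
disk_probe = (DISK_SECTORS & ~(RESERVED_SECTORS - 1)) - RESERVED_SECTORS
probe_is_zeroed = first_free <= disk_probe < DISK_SECTORS

print(first_free, tail_sectors, tail_kib, probe_is_zeroed)
```

By this reckoning seek=976772001 starts exactly at the first free sector, and the tail is 1167 sectors (583.5 KiB), which covers the location a whole-disk probe would read.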

Phillip Susi (psusi) wrote :

I thought that 512k was enough room to prevent the partition metadata from being detected on the whole disk, but maybe it was 1 MB? Try grub-probe --target=fs -v /boot/grub again.

Fabian Zeindl (fabian-xover) wrote :

grub-probe: info: cannot open `/boot/grub/device.map'.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: Scanning for dmraid_nv RAID devices on disk hd4.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: scanning hd0 for LVM.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: scanning hd1 for LVM.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: scanning hd2 for LVM.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: scanning hd3 for LVM.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: scanning hd4 for LVM.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: no LVM signature found.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: Scanning for mdraid09 RAID devices on disk hd4.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: the size of hd4 is 976773168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd0.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: the size of hd0 is 976773168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd1.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: the size of hd1 is 976773168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd2.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: the size of hd2 is 976773168.
grub-probe: info: Scanning for mdraid1x RAID devices on disk hd3.
grub-probe: info: the size of hd3 is 976773168.
grub-probe: info: the size of...

Phillip Susi (psusi) on 2012-05-16
summary: - Upgrade vom 10.04 LTS Server to 12.04 LTS Server breaks grub
+ mdadm 0.9 metadata at the end of the disk confuses grub
Changed in grub2 (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → High
description: updated
Fabian Zeindl (fabian-xover) wrote :

I still have trouble here. I found out something interesting: I used --zero-superblock with --force on the disk and the partition, and I used dd to zero the end of the partitions and disks.

It works now, but when I add the spare drive to the RAID I get the "error: found two disks with the index 4 for RAID md0." again.
When I remove the spare and zero the superblock, it works again.

Fabian Zeindl (fabian-xover) wrote :

What can I do to get my spare working?

Phillip Susi (psusi) wrote :

Had you zeroed the spare disk? You might try switching to metadata format 1.0.

Fabian Zeindl (fabian-xover) wrote :

I rebooted without the spare (I did all this because I had a remote server without grub ;) ), and after the reboot from kernel 2.6 to kernel 3.2, adding the spare worked for some reason.

Dave (dave-nobodynet) wrote :

I just hit this exact bug after changing my sda drive. sda4 ran all the way to the end of the disk and was part of md1, and, when booting, sda got detected as being in that array.

On almost every boot, things roughly worked, albeit with error messages. On *one* boot, however, md1 got started before md0, meaning that sda became the member of that array and all the other partitions became inaccessible. I have a feeling this may have hosed the Win7 install on sda1 as well, but I had only just mirrored it the day before, so that was easy to resurrect. It also meant that sda2 (part of md0, my root file system) got booted out of its array, and sda3 failed to be available for swapping. Nasty!

Doing an mdadm --examine on sda gave identical results as doing it on sda4. Doing an mdadm --zero-superblock on sda to try and remove it, also removed it on sda4.

It took a while to work out what was going on, until we realised the metadata is at the end of the partition. Eventually I shrank sda4 by a couple of MB, did another --zero-superblock on sda, and re-added sda4 to the array. Now, everything appears to work, and I don't think the other partitions will get hosed again.

FYI, I'm running Ubuntu 10.04 LTS 64-bit. sda is a Western Digital Red (advanced format drive), and sdb is a Seagate Green, both 2 TB. Now that I've fixed it, I don't think any of that is relevant to the problem; I'm just including it for info.
