boot fails: mdadm not looking for UUIDs but hostname in superblocks

Bug #226484 reported by Jaakko Kyro
This bug affects 1 person
Affects          Status     Importance  Assigned to  Milestone
mdadm (Ubuntu)   Confirmed  Low         Unassigned

Bug Description

Binary package hint: mdadm

Release: Hardy Heron

I tried converting my Hardy system from a single disc to RAID-1 by the usual procedure of making the new second disc a degraded RAID-1 array. Everything went fine, but rebooting with the md device as root failed because no hostname is set in the initramfs environment. Once the hostname is set to the correct value in the initramfs busybox shell, the boot can resume normally.

To reproduce, make sure that there are two discs, one of which contains a working Hardy installation. Then:

1. Partition the empty disc, set the partition type to FD
2. Create an md array using mdadm:
# mdadm --create /dev/md0 --level=raid1 --raid-devices=2 missing <your_partition>
3. mkfs and mount the new raid partition, copy the system over
4. Reboot, and edit the grub command line so that root=/dev/md0

The boot fails, and in the initramfs environment:
$ mdadm --assemble --scan --verbose
says:
mdadm: /dev/md0: not built for host (none)

Once the hostname is set to the one the system used when booted up, mdadm is able to assemble /dev/md0 and mount it as root.
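
The workaround described above can be sketched as a small shell function for the initramfs busybox prompt. This is only a sketch under assumptions: the function name assemble_as_host and the DRY_RUN switch are made up for illustration, and it assumes the busybox build provides a hostname applet.

```shell
# Hypothetical helper (not part of mdadm or initramfs-tools).
# Sets the hostname that was in use when the array was created, then
# repeats the scan from the report. With DRY_RUN set to a non-empty
# value the two commands are only printed, not executed.
assemble_as_host() {
    host=$1
    ${DRY_RUN:+echo} hostname "$host" &&
    ${DRY_RUN:+echo} mdadm --assemble --scan --verbose
}
```

For example, setting DRY_RUN=1 and calling "assemble_as_host myhost" prints the two commands. Without DRY_RUN they are run for real, where myhost stands for the hostname the installed system uses.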

I don't know whether this also breaks the bootup when creating the raid-1 system upon installation.

ceg (ceg) wrote :

The installer seems to work around the issue of an unset hostname in the initramfs either by putting an ARRAY line into the mdadm.conf file, or by having an unset hostname when creating the md device. Either way, this error does not appear after RAID installations.

Simon Eisenmann (longsleep) wrote :

I can confirm this issue. The hostname is (none) in the initrd context, so mdadm fails when auto-scanning for md arrays. One workaround is to manually set HOMEHOST in mdadm.conf (to the hostname that was in use when the array was initially created) and rebuild the initial ramdisk with mkinitramfs.
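
A minimal sketch of the HOMEHOST workaround, assuming the config file lives at /etc/mdadm/mdadm.conf as elsewhere in this report. The function name set_homehost is hypothetical. It takes the config path as a parameter so that it can be tried on a copy first.

```shell
# Hypothetical helper: make sure the given mdadm.conf contains exactly
# one HOMEHOST line naming the given host.
set_homehost() {
    conf=$1
    host=$2
    tmp=$(mktemp) || return 1
    # Keep every line except an existing (uppercase) HOMEHOST entry.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v '^[[:space:]]*HOMEHOST' "$conf" > "$tmp" || true
    printf 'HOMEHOST %s\n' "$host" >> "$tmp"
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

On the real system this would be called as "set_homehost /etc/mdadm/mdadm.conf myhost" (myhost being a placeholder for the hostname used when the array was created), followed by rebuilding the initial ramdisk, for example with "update-initramfs -u", so that the changed file ends up in the initramfs.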

Dustin Kirkland  (kirkland) wrote :

I can confirm this bug as described by the reporter on the latest jaunty system.

:-Dustin

Changed in mdadm:
status: New → Confirmed
importance: Undecided → Low

Dustin Kirkland (kirkland) wrote :

I'm marking this 'confirmed' and priority 'low'. This is something that should be fixed obviously, but it's not happening on raid installs from cd.

:-Dustin

Dustin Kirkland  (kirkland) wrote : Re: boot from manually constructed raid1 root fails because of missing hostname in initramfs

Okay, I've spent a little more time on this, and I think there is a lot more involved from a configuration perspective to manually take a non-RAID system to a root-on-RAID1 system.

Here are the steps I was able to use in a Jaunty KVM to get this to work.

Install Jaunty into a KVM, and boot it with a single disk:
 HOST# qemu-img create -f qcow2 disk1.img 4G
 HOST# kvm -hda disk1.img -cdrom jaunty-server-amd64.iso

Ensure that the system is happy, hunky, and dory.

Create a new disk.
 HOST# qemu-img create -f qcow2 disk2.img 4G

Launch the guest with both disks now:
 HOST# kvm -hda disk1.img -hdb disk2.img

The rest of these commands are to be run in the guest....

Partition:
 # fdisk /dev/sdb
n
p
(enter)
(enter)
t
fd
w

Create /dev/md0:
 # mdadm --create /dev/md0 --level=raid1 --raid-devices=2 missing /dev/sdb1

Append your raid configuration file:
 # mdadm --detail --scan >> /etc/mdadm/mdadm.conf

Configure your raid. Note that this will update your initramfs with the new mdadm.conf file!!!
 # dpkg-reconfigure mdadm

Fix /etc/fstab, which is configured for /dev/sda1 to provide /. Update this to /dev/md0 (or the UUID of /dev/md0).
 # vi /etc/fstab
...
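
The fstab change can also be previewed without an editor. The following is a sketch, assuming the root entry names the device directly in its first field (if it uses UUID=..., pass that string as the old value instead). The function name retarget_root is hypothetical.

```shell
# Hypothetical helper: print the given fstab with the device of the
# "/" entry replaced. Only a line whose first field equals the old
# device and whose mount point is "/" is changed. Note that awk
# rewrites the changed line with single spaces between fields.
retarget_root() {
    fstab=$1
    old=$2
    new=$3
    awk -v old="$old" -v new="$new" \
        '$1 == old && $2 == "/" { $1 = new } { print }' "$fstab"
}
```

Calling "retarget_root /etc/fstab /dev/sda1 /dev/md0" writes the result to standard output, so it can be reviewed or compared against the original before anything is replaced.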

Add the new disk to your bootloader device map:
 # echo "(hd1) /dev/sdb" >> /boot/grub/device.map

Format the filesystem:
 # mkfs.ext3 /dev/md0

Mount it:
 # mount /dev/md0 /mnt

Copy the data over from the first disk to the second. Make sure you exclude the dynamically created kernel filesystems.
 # rsync -aP --exclude /dev/ --exclude /mnt/ --exclude /proc/ --exclude /sys / /mnt
 # mkdir /mnt/dev /mnt/mnt /mnt/proc /mnt/sys

Install the bootloader on the raid device.
 # grub-install /dev/md0

Reboot.

Hit (esc) to enter the grub menu. Hit (e) to edit your kernel line. Replace root=(*) with root=/dev/md0.

:-Dustin

Dustin Kirkland  (kirkland) wrote :

This issue is really a matter of process and configuration.

I'm converting it from a bug, to a question/answer.

Cheers,
:-Dustin

Changed in mdadm:
status: Confirmed → Invalid

Cédric Jeanneret deactivated (cjeanneret-c2c-deactivated) wrote :

For your information:
grub-install /dev/md0 just crashed my installation... It seems it's really NOT a good idea.
The server keeps looping: BIOS -> loading grub -> BIOS -> ...

ceg (ceg) wrote :

This is a bug with mdadm --incremental not doing hotplug, because it looks for permission to do so in mdadm.conf.

On hotplug systems mdadm.conf should not contain any specific references; maybe it should explicitly mention "any", like this?

DEVICE <any>
HOMEHOST <any>
ARRAY <any>

This whole business of locking down array assembly (HOMEHOST, ARRAY) may just be due to the historical (suboptimal) mdadm design of assembling RAIDs by the major/minor numbers saved in the superblocks. Those should always be considered dynamic (deprecated). With --assemble --uuid, mdadm takes a UUID and matches this unique UUID, which is the same on all member devices.

Of course we should refrain from running arrays that are not complete (avoid leading them to desync by subsequent writes), unless we are required to recover data from a specific failed array and --run it manually or by a startup script.
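
To illustrate the UUID-based assembly mentioned above, here is a sketch that pulls the UUID out of an ARRAY line as printed by "mdadm --detail --scan". The function name array_uuid is hypothetical, and the commented usage lines assume /dev/md0 and /dev/sdb1 as in the steps earlier in this report.

```shell
# Hypothetical helper: read ARRAY lines on stdin and print the UUID
# of the line that belongs to the given md device.
array_uuid() {
    dev=$1
    awk -v dev="$dev" '$1 == "ARRAY" && $2 == dev {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^UUID=/) { sub(/^UUID=/, "", $i); print $i }
    }'
}

# Possible usage (needs root and an existing array, so left commented):
#   uuid=$(mdadm --detail --scan | array_uuid /dev/md0)
#   mdadm --assemble /dev/md0 --uuid="$uuid" /dev/sdb1
```

Whether assembling by explicit UUID sidesteps the hostname check in every mdadm version is not confirmed here; the sketch only shows how the UUID can be obtained and passed.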

ceg (ceg)
summary: - boot from manually constructed raid1 root fails because of missing
- hostname in initramfs
+ boot fails: mdadm not looking for UUIDs but hostname in superblocks
ceg (ceg)
Changed in mdadm (Ubuntu):
status: Invalid → Confirmed