boot failure with raid1 array on jaunty

Bug #367934 reported by Mathias Kende on 2009-04-27
This bug affects 2 people
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Yesterday I upgraded (with the GNOME Update Manager) from Ubuntu 8.10 to 9.04, and now I can no longer boot my computer. The upgrade seemed to work properly but gave a lot of small errors (to which I did not pay attention at the time), including missing modules in the X server and a failure to connect to some configuration server. But I don't think these are related to an early boot problem.

At the grub prompt I have only two kernels: 2.6.28-11-generic and 2.6.27-11-generic. The same thing happens with both of them. After a few seconds of booting I get the following message:

-------------------------
Gave up waiting for root device. Common problem:
 - Boot arg (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough ?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/md0 does not exists. Dropping to a shell!
-------------------------

From there, if I enter "exit" I get the same message again, even if I wait for a long time (several minutes) or if I use rootdelay=90 in the boot command line (as suggested here http://www.ubuntu.com/getubuntu/releasenotes/904#Boot%20failures%20on%20systems%20with%20Intel%20D945%20motherboards ).

In the busybox shell I can enter the following commands (I am copying the output by hand, as I can't do anything else on this faulty computer):

-------- cat /proc/modules
r8169
mii
floppy
raid10
raid456
async_xor
async_memcpy
async_tx
raid1
raid0
multipath
linear
vesafb
fbcon
tileblit
font
bitblit
softcursors
--------------------------

------------- uname -a
Linux (none) 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 unknown
--------------------------

If I look in /dev, I can see all of my hard drives and partitions except for /dev/md0, which is the root device.

Booting with 2.6.27-11-generic yields the same result, but after a longer boot, during which I can see that all my hard drives are also properly recognised.

My hardware is an ASUS P5B motherboard with an Intel P965 chipset, SATA hard drives on an Intel ICH8 controller and UDMA optical drives on a JMicron JMB363 PATA controller.

My root device is /dev/md0 which is a software raid1 array (with two devices /dev/sda2 and /dev/sdb2).

This may be the same as bug #335619, which lacks a lot of information, but I think that it is not bug #290153 (or any other bug that is worked around by waiting at the busybox shell before proceeding with the boot), as adding rootdelay=90 (or a greater value) does not solve my problem. It is not bug #33269 either, as the error is different.

Tell me if I can give more information, but I can't boot that computer, so everything will have to be copied by hand.

affects: ubuntu → usplash (Ubuntu)
Mathias Kende (mathias-kende) wrote :

I am not using usplash so I doubt that it is the source of this bug. I am changing back the "affects" field to "linux" because I can't revert it to "ubuntu". But it may be a bug in mdadm rather than in the kernel.

affects: usplash (Ubuntu) → linux (Ubuntu)
Mathias Kende (mathias-kende) wrote :

I kind of solved the issue. Everything is in fact "working", but my RAID array is not assembled automatically at boot time. If I enter "mdadm --assemble /dev/md0" in the busybox shell and then exit from it, the computer boots properly (except for X errors, but those are linked to the update).

But I don't know how to make the array assemble automatically.
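
A minimal sketch of the workaround described above, written as a small shell function so that the fallback is explicit. The function name `assemble_root` is hypothetical, and the device names are the ones from this report. It assumes `mdadm` is available in the initramfs shell and treats an existing device node as a sign that the array is already assembled, which is a simplification.

```shell
# assemble_root MD [MEMBER...]
# Hypothetical helper: try to assemble the root array by hand.
assemble_root() {
    md=$1
    shift
    # Assumption: if the device node exists, the array is already assembled.
    [ -e "$md" ] && return 0
    # First let mdadm locate the members itself (what the reporter typed);
    # if that fails, name the member partitions explicitly.
    mdadm --assemble "$md" || mdadm --assemble "$md" "$@"
}

# At the busybox prompt, with the reporter's device names:
#   assemble_root /dev/md0 /dev/sda2 /dev/sdb2 && exit
```

This only gets a single boot through; it does not change what the initramfs does on the next boot.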

Rich Wales (richw) wrote :

I think I may be having the same (or a related) problem. I'm trying to convert a Jaunty server (2.6.28-11-server) to use RAID 1 using mdadm. I believe mdadm is working because I have several RAID partitions which get mounted out of /etc/fstab after the system has successfully booted. But when I tried to convert my root to a RAID partition, I got boot errors and a busybox shell.

I tried putting /boot in a regular partition, and / and /var in two separate RAID partitions, and the system started to boot, but then it couldn't mount / or /var, and I got some errors about stuff in / being missing, and then a busybox prompt.

In case it might help, I went into fdisk and marked each RAID partition with type FD ("Linux raid autodetect"), but this didn't help.

Is there something else I need to do in order for my RAID root to be mounted early enough in the boot process?

Mathias Kende (mathias-kende) wrote :

The problem solved itself when I updated to the 2.6.28-13-generic kernel.

Rich Wales (richw) wrote :

I also managed to get rid of this problem, though it wasn't trivial.

After updating to 2.6.28-15, I edited /etc/mdadm/mdadm.conf with ARRAY statements listing all my arrays (including the array for my root file system). Then, I did "dpkg-reconfigure initramfs-tools" to recreate the RAM disk image with the new mdadm.conf. After doing all this, I was able to boot straight into my root RAID array, without requiring rootdelay=, and without having to type any busybox shell commands.

The "dpkg-reconfigure initramfs-tools" was the non-obvious step for me. Until I did this, my changes to the mdadm.conf file would not show up in anything I did in the busybox shell.
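
The two steps above can be sketched as follows. This is an illustration under assumptions, not the exact procedure used here: the function name `ensure_array_lines` is hypothetical, and it relies on `mdadm --detail --scan` printing one ARRAY line per currently running array, so the root array must already be assembled (for example after a manual assembly from the busybox shell). Run it as root on the booted system.

```shell
# ensure_array_lines CONF
# Hypothetical helper: append ARRAY lines for the running arrays to CONF
# unless CONF already contains at least one ARRAY line.
ensure_array_lines() {
    conf=$1
    if grep -q '^[[:space:]]*ARRAY' "$conf" 2>/dev/null; then
        return 0   # already configured; leave the file alone
    fi
    # "mdadm --detail --scan" prints one ARRAY line per active array.
    mdadm --detail --scan >> "$conf"
}

# Typical use, followed by a rebuild of the initramfs so that the new
# mdadm.conf is copied into it:
#   ensure_array_lines /etc/mdadm/mdadm.conf && update-initramfs -u
```

The rebuild is the part that matters for booting: the initramfs carries its own copy of mdadm.conf, so edits to the file under /etc have no effect at boot until the image is regenerated.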

Aron (aron-aron) wrote :

You just have to invoke "update-initramfs -u"; "dpkg-reconfigure initramfs-tools" does the same thing: it updates your initrd image.

Aron (aron-aron) on 2010-02-01
Changed in linux (Ubuntu):
status: New → Invalid