mdadm: boot sometimes fails, no devices found

Bug #120504 reported by AUCLAIR
Affects: mdadm (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: mdadm

About one boot in ten, mdadm says:
mdadm: No devices listed in conf file were found.

I just have to press CTRL+ALT+DEL or push the reset button and it's OK; roughly the next 10 boots then work fine!?!

I am on an up-to-date Feisty with an x86_64 kernel.

The content of mdadm.conf:
$ cat /etc/mdadm/mdadm.conf
# md0>/boot ; md1>LVM ; md2>swap
DEVICE /dev/sda*
DEVICE /dev/sdb*
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=46c0fc49:6a912381:61f75e68:2a1e43ae
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=6b41f1de:db3d4408:b1f2e1cb:9f30c639
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=3aea7d28:cc86726b:d1872f7b:5ba40b17
MAILADDR root

I found the UUIDs with this command:
$ sudo /usr/share/mdadm/mkconf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=46c0fc49:6a912381:61f75e68:2a1e43ae
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=6b41f1de:db3d4408:b1f2e1cb:9f30c639
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=3aea7d28:cc86726b:d1872f7b:5ba40b17

My PC is not a server, so it's not very important, but I don't understand why it sometimes doesn't work...

Thanks for your help, and VIVA UBUNTU ;-)

AUCLAIR (frederic-auclair) wrote :

I just noticed that mkconf returns a definition for /dev/md3, whereas I define /dev/md2 in /etc/mdadm/mdadm.conf.

I don't think it causes the problem (yes/no?).

I suppose mkconf says md3 because I have a gap between the partitions, from cylinder 12223 to cylinder 30139.
To explain, I paste the output of fdisk:

$ sudo fdisk -l

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      514048+  fd  Linux raid autodetect
/dev/sda2              65       12222    97659135   fd  Linux raid autodetect
/dev/sda3           30140       30401     2097466+  fd  Linux raid autodetect

Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          64      514048+  fd  Linux raid autodetect
/dev/sdb2              65       12222    97659135   fd  Linux raid autodetect
/dev/sdb3           30140       30401     2104515   fd  Linux raid autodetect

Disk /dev/md0: 526 MB, 526319616 bytes
2 heads, 4 sectors/track, 128496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 100.0 GB, 100002824192 bytes
2 heads, 4 sectors/track, 24414752 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/md3: 2147 MB, 2147680256 bytes
2 heads, 4 sectors/track, 524336 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md3 doesn't contain a valid partition table
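
One way to check the md2/md3 mismatch mentioned at the top of this comment is to pair ARRAY lines by UUID instead of by device name. The sketch below is only illustrative: the helper names list_arrays and diff_array_names are made up for this example, and it assumes every ARRAY line carries a UUID= field, as in the files pasted in this report.

```shell
#!/bin/sh
# Hypothetical helpers, not part of mdadm.

# Print "UUID device" for each ARRAY line of an mdadm.conf-style file,
# sorted by UUID so the result can be fed to join.
list_arrays() {
    awk '$1 == "ARRAY" {
        uuid = ""
        for (i = 3; i <= NF; i++)
            if ($i ~ /^UUID=/) uuid = substr($i, 6)
        if (uuid != "") print uuid, $2
    }' "$1" | LC_ALL=C sort
}

# Print every array whose device name differs between two files,
# pairing entries by UUID.
diff_array_names() {
    a=$(mktemp) && b=$(mktemp) || return 1
    list_arrays "$1" > "$a"
    list_arrays "$2" > "$b"
    LC_ALL=C join "$a" "$b" | awk '$2 != $3 { print $1 ": " $2 " vs " $3 }'
    rm -f "$a" "$b"
}

# Example (paths as used in this report):
#   sudo /usr/share/mdadm/mkconf > /tmp/mkconf.out
#   diff_array_names /etc/mdadm/mdadm.conf /tmp/mkconf.out
```

With the two configurations shown in the bug description, this should flag only the array with UUID 3aea7d28:cc86726b:d1872f7b:5ba40b17, named /dev/md2 in one file and /dev/md3 in the other.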

reliable-robin-22 (nicolasdiogo) wrote :

hi,

I am having the very same problem here.

RAID 1 /boot
RAID 1 swap
RAID 1 /
RAID 1 /var

Running Ubuntu 7.04, installed from the alternate CD, on AMD64 (64-bit).

But mine NEVER boots correctly.

Thanks

Chuck Bridgeland (chuckbri) wrote :

I'm seeing the same thing, intermittently.

Freshly set up system,

uname -a returns "Linux pcserver2 2.6.20-16-generic #2 SMP Thu Jun 7 20:19:32 UTC 2007 i686 GNU/Linux"

mdadm --version returns "mdadm - v2.5.6 - 9 November 2006"

With the splash screen turned off, each and every time the system boots I see "mdadm: no devices listed in conf file were found". A couple lines later I will see "trying to resume from /dev/sda3" (which is swap on my system, and not mirrored). At this point it will either boot or fail.

I basically did an expert install from the alternate cd, and let the installer set up RAID 1, with /, /home and /var on separate partitions, mirrored.

fstab is:
 # /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# /dev/md0
UUID=97be724f-68a6-4d58-a65b-27323cc80c39 / ext3 defaults,errors=remount-ro 0 1
# /dev/md2
UUID=84027007-13ea-48b0-9815-2a1d8632de30 /home ext3 defaults 0 2
# /dev/md1
UUID=d0bf6dc7-5309-4155-a8f7-19210b5d5831 /var ext3 defaults 0 2
# /dev/sda3
UUID=e9a1b493-440f-4c83-8642-c06f2ce562d8 none swap sw 0 0
# /dev/sdb3
UUID=02a63f99-9514-42a8-93e2-8c75cf234a82 none swap sw 0 0
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto 0 0

mdadm.conf is
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=06a965d4:15aea309:2aa916b5:4fcc1448
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=de6bc0e6:2499b63f:bd517c7d:60c4d64a
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=ba8e6066:60110e41:1551e85f:dd92c280

# This file was auto-generated on Sun, 24 Jun 2007 22:23:56 +0000
# by mkconf $Id: mkconf 261 2006-11-09 13:32:35Z madduck $

The grub menu.lst entry is:
title Ubuntu, kernel 2.6.20-16-generic
root (hd0,0)
kernel /boot/vmlinuz-2.6.20-16-generic root=/dev/md0 ro quiet
initrd /boot/initrd.img-2.6.20-16-generic
quiet
savedefault

Do I need to maybe be setting up an unmirrored /boot partition?

AUCLAIR (frederic-auclair) wrote :

I get the same message about /dev/sda3, which is part of the (mirrored) swap on my system.

$ uname -a
Linux ubuntuAmd64alt 2.6.20-16-generic #2 SMP Thu Jun 7 19:00:28 UTC 2007 x86_64 GNU/Linux

Part of menu.lst:

title Ubuntu, kernel 2.6.20-16-generic
root (hd0,0)
kernel /vmlinuz-2.6.20-16-generic root=/dev/mapper/vg01-rootlv ro quiet splash locale=fr_FR
initrd /initrd.img-2.6.20-16-generic
quiet
savedefault

Part of /etc/fstab (ouch! not a very clean one :-|):

# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0

UUID=d4542a5e-81d5-4d46-88d8-340b839d9fd8 /boot ext3 defaults 0 0

/dev/mapper/vg01-rootlv / ext3 defaults,errors=remount-ro 0 1
/dev/mapper/vg01-homelv /home ext3 defaults 0 2
/dev/mapper/vg01-tmplv /tmp ext3 defaults 0 2
/dev/mapper/vg01-usrlv /usr ext3 defaults 0 2
/dev/mapper/vg01-varlv /var ext3 defaults 0 2
/dev/vg01/optlv /opt ext3 defaults 1 2

UUID=55be2672-97b2-425d-bbe3-b4fede1a200f none swap sw 0 0

/dev/cdrom /media/cdrom0 udf,iso9660 user,noauto 0 0
/dev/ /media/floppy0 auto rw,user,noauto 0 0
/dev/sdc1 /media/LD auto rw,user,noauto 0 0

But the real puzzle is that most of the time it works!

Thanks for your posts, I'm not alone in the world ;-)

reliable-robin-22 (nicolasdiogo) wrote :

Hi Chuck Bridgeland,

this problem seems to be related to mdadm not updating the UUIDs for devices properly.

Note that in your fstab every single UUID is different from those in the mdadm.conf file.

You will need to manually (using a live CD) amend those files to use the correct UUIDs.
You can get these values with:

sudo vol_id /dev/md0
sudo vol_id /dev/md1
sudo vol_id /dev/md2

Since you are using RAID 1, you will need to amend them on both disks (I think).
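
A note on the UUID comparison above: the UUID= values in mdadm.conf identify md arrays (they come from the RAID superblock), while the UUID= values in fstab identify the filesystems or swap areas stored on those arrays, so the two sets are expected to differ even on a healthy system. The two kinds are also printed in different formats, which the sketch below uses to tell them apart. The helper name uuid_kind is made up for this example, and the "filesystem" label is a simplification that assumes the usual 8-4-4-4-12 form seen in the fstab files in this report.

```shell
#!/bin/sh
# Hypothetical helper: classify a UUID string by its printed format.
#   md array (mdadm.conf):  four colon-separated groups of 8 hex digits
#   filesystem (fstab):     8-4-4-4-12 hex digits separated by hyphens
uuid_kind() {
    if printf '%s\n' "$1" | grep -Eiq '^[0-9a-f]{8}(:[0-9a-f]{8}){3}$'; then
        echo md-array
    elif printf '%s\n' "$1" | grep -Eiq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
        echo filesystem
    else
        echo unknown
    fi
}

# Values taken from the fstab and mdadm.conf quoted earlier in this report:
#   uuid_kind 06a965d4:15aea309:2aa916b5:4fcc1448
#   uuid_kind 97be724f-68a6-4d58-a65b-27323cc80c39
```

To compare like with like, the array UUID reported by `mdadm --detail /dev/md0` is the one to check against mdadm.conf, and the filesystem UUID reported by vol_id is the one to check against fstab.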

David McNeill (davemc) wrote :

We hit this too

 * Brand new HP ML 115 G1 Server with 2 x SATA disks, SBS Bundle ;-)
 * Feisty Alternate install with AMD64
 * On each drive: /boot partition 100 MB, RAID 1 partition huge, swap partition 1 GB
 * LVM on Raid
 * Everything else in install as default

To fix, boot to the Feisty live CD

Install mdadm with
 * sudo apt-get install mdadm

Install LVM with
 * sudo apt-get install lvm2

Start LVM
 * sudo /etc/init.d/lvm start

Make a mount point
 * sudo mkdir /mnt/newroot

Mount your new system
 * sudo mount /dev/vg0/lv0 /mnt/newroot

Edit fstab in the new system, to change from UUID to device
 * sudo nano /mnt/newroot/etc/fstab
Change the UUID to /dev/sda1 for the /boot mount point.

It will now start. It might grizzle, but at least you're up and running and can sort out the issues with a working system.
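
The fstab change in the last step can also be done non-interactively. The sketch below is only one possible way to do it: the helper name set_fstab_device is made up, it writes the result to stdout instead of editing in place, and it assumes whitespace-separated fstab fields with the mount point in the second column.

```shell
#!/bin/sh
# Hypothetical helper: print FSTAB with the device field replaced for the
# entry mounted at MOUNTPOINT.
# Usage: set_fstab_device FSTAB MOUNTPOINT DEVICE
set_fstab_device() {
    awk -v mp="$2" -v dev="$3" '
        /^[[:space:]]*#/ { print; next }   # leave comment lines untouched
        $2 == mp { $1 = dev }              # swap UUID=... for the device node
        { print }                          # a changed line is re-joined with single spaces
    ' "$1"
}

# Example, following the steps above (review the result before installing it):
#   set_fstab_device /mnt/newroot/etc/fstab /boot /dev/sda1 > /tmp/fstab.new
```

Writing to a separate file first keeps the original fstab intact until the new version has been checked.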

AUCLAIR (frederic-auclair) wrote :

Still the same problem...

If I wait a while, error messages appear after "...no devices listed in config file":

Loading multipath module
FATAL: Module dm_multipath not found
Loading multipath daemon
DM multipath kernel driver not loaded
last_lba(): I don't know how to handle files with mode 21b0
read error, sector 0
read error, sector 1
read error, sector 29

and after a few more moments, I get the initramfs prompt.

My motherboard is a Gigabyte GA-M55plus-S3G. I think there is a problem between the SATA controller and the 2 hard drives which upsets multipath... I will try a new BIOS if one exists!

bye,

AUCLAIR (frederic-auclair) wrote :

OK, I've upgraded the BIOS from f6 to f12 (GA-M55plus-S3G Rev 1: nVidia6100 + nForce 430)

I will see over the next while whether things get better...

Chuck Bridgeland (chuckbri) wrote :

I resolved this problem in my case by running "dpkg-reconfigure mdadm". I changed the default "all" to only the partition that houses "/".
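
For context, and as an assumption that nothing in this report confirms: on Debian-derived mdadm packages of this period, the answer to that dpkg-reconfigure question is, as far as I know, stored as INITRDSTART in /etc/default/mdadm and picked up when the initramfs is regenerated. A minimal sketch for checking the current value, with a made-up helper name (initrd_arrays):

```shell
#!/bin/sh
# Hypothetical helper. Assumes the given file is a shell fragment that may
# set INITRDSTART (for example INITRDSTART='all' or INITRDSTART='/dev/md0').
# The file is sourced in a subshell so it cannot alter the calling shell.
# Pass a path containing a slash, otherwise "." searches PATH.
initrd_arrays() {
    (
        unset INITRDSTART
        . "$1" && printf '%s\n' "${INITRDSTART:-<unset>}"
    )
}

# Example:
#   initrd_arrays /etc/default/mdadm
```

If the variable is not present on a given system, the helper prints `<unset>`, which would suggest the setting lives elsewhere in that package version.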

AUCLAIR (frederic-auclair) wrote :

So the BIOS update didn't change anything.

But since running dpkg-reconfigure mdadm, it has been OK for 15 days!...

Thanks Chuck :-)

Steven Harms (sharms) wrote :

dpkg-reconfigure resolves this.

Changed in mdadm:
status: New → Fix Released
Changed in mdadm (Ubuntu):
status: Fix Released → Incomplete
status: Incomplete → Fix Released
assignee: nobody → vingslagsvisvidare@gmail.com (bjerry)
ceg (ceg) wrote :

What's your business messing with bug status? Please explain.

Changed in mdadm (Ubuntu):
assignee: vingslagsvisvidare@gmail.com (bjerry) → nobody