Race condition at system-boot: md-RAID not always ready in time

Bug #610107 reported by Arno Wagner on 2010-07-26
This bug affects 5 people
Affects: udev (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: udev

Almost every time I start the system, I get error messages concerning my md-RAID devices.

Example:
"udevd-work[77]: inotify_add_watch(6, /dev/md1, 10) failed: No such file or directory"

Sometimes only one of my several RAIDs is affected, sometimes several at once.
If the error hits the RAID holding my root filesystem, the system won't boot and drops to a shell.
If only data drives are affected, the boot finishes normally and all RAID devices are up by then.
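The symptom is a classic ordering race: the root filesystem mount is attempted before mdadm has finished assembling the array. As a sketch (not a fix from this report), one defensive option is a wait loop in an initramfs script that polls for the device node; the device name and timeout below are illustrative assumptions.

```shell
# Poll until the given device node exists as a block device, up to a
# timeout in seconds.  /dev/md1 and the limits are assumptions.
wait_for_md() {
    dev="$1"
    tries="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        [ -b "$dev" ] && return 0   # block device exists: array is up
        sleep 1
        tries=$((tries - 1))
    done
    return 1                        # gave up: device never appeared
}

wait_for_md /dev/md1 2 || echo "md device still missing" >&2
```

A cruder workaround with the same effect is booting with the `rootdelay=` kernel parameter to give assembly more time before the root mount.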

There are no notable entries in the logs.

The problem occurs on all three Lucid installations I have made so far. (One was an upgrade from Karmic, the next a fresh install, both on real hardware. The third, from which this report is filed, is a fresh installation in VirtualBox.)

/etc/mdadm/mdadm.conf:
# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=cd601f2c:76c2a84f:2d20de61:3cd29610
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=5057c6bb:a8f652b9:2d20de61:3cd29610
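Since the initramfs carries its own copy of mdadm.conf, a mismatch between the ARRAY lines above and the real superblock UUIDs can produce similar boot-time failures. A hedged way to rule that out (the commented commands modify the system and are general practice, not steps from this report):

```shell
# Print ARRAY definitions derived from the currently running arrays and
# compare them with /etc/mdadm/mdadm.conf.  Guarded so it is a no-op on
# a machine without mdadm installed.
if command -v mdadm >/dev/null 2>&1; then
    mdadm --detail --scan
    # If the output disagrees with /etc/mdadm/mdadm.conf, update the file
    # and refresh the copy embedded in the initramfs (run as root):
    #   mdadm --detail --scan >> /etc/mdadm/mdadm.conf
    #   update-initramfs -u
fi
```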

blkid:
/dev/sda1: UUID="712d6c10-d9a5-4471-83b8-1e1f2749f817" TYPE="ext4"
/dev/sda5: UUID="696c8371-1612-4654-ac21-a7b08b35c950" TYPE="swap"
/dev/sdb1: UUID="cd601f2c-76c2-a84f-2d20-de613cd29610" TYPE="linux_raid_member"
/dev/sdc1: UUID="cd601f2c-76c2-a84f-2d20-de613cd29610" TYPE="linux_raid_member"
/dev/sdd1: UUID="5057c6bb-a8f6-52b9-2d20-de613cd29610" TYPE="linux_raid_member"
/dev/sde1: UUID="5057c6bb-a8f6-52b9-2d20-de613cd29610" TYPE="linux_raid_member"
/dev/md1: UUID="tM2vUv-zY1H-i1LW-wlDb-3qit-23uE-33fxP6" TYPE="LVM2_member"
/dev/md0: UUID="5MJTwi-LIkA-oxV4-328f-nqD6-nev6-fFzRHD" TYPE="LVM2_member"
/dev/mapper/vg1-test: UUID="de6e51c6-4cce-4810-9cb9-47d1a730ca6d" TYPE="jfs"

Ubuntu-Release:
Description: Ubuntu 10.04.1 LTS
Release: 10.04

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: udev 151-12
ProcVersionSignature: Ubuntu 2.6.32-24.38-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic i686
Architecture: i386
CustomUdevRuleFiles: 70-xorg-vboxmouse.rules 60-vboxadd.rules
Date: Mon Jul 26 16:18:20 2010
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta i386 (20100318)
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: innotek GmbH VirtualBox
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=712d6c10-d9a5-4471-83b8-1e1f2749f817 ro quiet splash
ProcEnviron:
 LANG=de_DE.utf8
 SHELL=/bin/bash
SourcePackage: udev
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH

Changed in udev (Ubuntu):
status: New → Confirmed
Gernot Hillier (gernot-hillier) wrote:

We also sporadically see somewhat similar issues here on a number of machines running 10.04.1. In our case we only have a data RAID, which is not needed to mount the root partition, and on some boots it stays only half-assembled.

Currently, we think it's caused by udevd being killed in the middle of its operation, see #613273.

If "mdadm --incremental" is interrupted at the wrong moment, it seems to cause a variety of strange problems: from a leftover /dev/.tmp.md.8:xx that makes mdadm bail out with "Strange error loading metadata for /dev/md0" on all further operations until reboot, to completely damaged data structures in the kernel, with wrong device numbers, half-busy devices, and the like.
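When a non-root array is left in such a half-assembled state, the usual manual recovery is to tear the device down and assemble it again, so that mdadm re-reads the on-disk metadata instead of the stale in-kernel state. The device name below is illustrative; this is a recovery sketch, not a fix for the underlying interruption.

```shell
# Recovery sketch for a half-assembled data array (not the root array).
# Guarded so it does nothing on a machine without /dev/md0.
if [ -b /dev/md0 ]; then
    mdadm --stop /dev/md0          # release the stale in-kernel device
    mdadm --assemble /dev/md0      # reassemble from mdadm.conf/superblocks
fi
# For a RAID1, both members should then show as active, i.e. "[UU]":
cat /proc/mdstat 2>/dev/null || true
```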
