[karmic] software RAID not assembled at boot - blkid hangs using 100% CPU

Bug #377395 reported by Daniel Hahler
This bug affects 8 people
Affects              Status        Importance  Assigned to  Milestone
cryptsetup (Ubuntu)  Invalid       Undecided   Unassigned
mdadm (Ubuntu)       Invalid       Undecided   Unassigned
udev (Ubuntu)        Invalid       Undecided   Unassigned
util-linux (Ubuntu)  Fix Released  High        Unassigned

Bug Description

Binary package hint: udev

Since upgrading to Karmic yesterday (through "apt-get dist-upgrade", not update-manager), I can no longer boot: the encrypted root filesystem is not found ("Waiting for encrypted source device").

After some time, I get dropped to a busybox shell, where apparently the RAID has not been scanned/assembled (/proc/mdstat does not exist).
I can manually open the device using "mdadm -A -s", "cryptsetup luksOpen /dev/md1 name" and "vgscan; vgchange -a y".
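
For anyone following along, the manual steps above could be wrapped in a small function along these lines. This is only a sketch: the mapping name "cryptroot" is a placeholder (the report just says "name"), /dev/md1 is the device from this report and may differ on other systems, and the run/DRY_RUN helper is added here purely so the sequence can be previewed without touching any devices.

```shell
#!/bin/sh
# Sketch of the manual recovery sequence from the initramfs busybox prompt.
# MD_DEV and CRYPT_NAME are placeholders (assumptions); adjust to your layout.
# Set DRY_RUN=1 to print the commands instead of running them.

MD_DEV="${MD_DEV:-/dev/md1}"
CRYPT_NAME="${CRYPT_NAME:-cryptroot}"

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

unlock_root() {
    run mdadm -A -s || return 1                                  # assemble arrays by scanning
    run cryptsetup luksOpen "$MD_DEV" "$CRYPT_NAME" || return 1  # open the LUKS container
    run vgscan || return 1                                       # look for LVM volume groups
    run vgchange -a y                                            # activate them
}
```

Calling unlock_root with DRY_RUN=1 set first shows what would be executed; without it, the commands run for real and need root.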

I guess that udev does not handle the "opening" of the raid correctly.

I've observed a lot of "/sbin/blkid -o udev -p /dev/.tmp-block-8:X" processes (see ps_aux). On one earlier occasion, when I exited busybox and the system kept booting (but with a read-only filesystem), there were 16 of those processes.
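
A quick way to count such processes might look like the following sketch. The function name count_blkid_probes is made up for illustration; it simply counts lines of ps-style output on stdin that contain the blkid invocation quoted above.

```shell
#!/bin/sh
# Count udev-spawned blkid probes in ps-style output read from stdin.
# The [b] in the pattern keeps grep from matching its own command line
# when it is used in a live "ps aux | ..." pipeline.
count_blkid_probes() {
    grep -c -- '/sbin/[b]lkid -o udev -p'
}
```

Typical use would be "ps aux | count_blkid_probes"; a count that stays high long after boot, together with high CPU usage, matches the symptom described here.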

I'm also attaching busybox-udev.tgz, which is /dev/.udev (from within the busybox), and find-dev (which is the output of "find /dev"); maybe they contain some clues.

I'll also attach the initramfs-debug output, but apparently that file got lost/not saved correctly - so this needs another round of failure-and-LiveCD booting.

Revision history for this message
Daniel Hahler (blueyed) wrote :

I've opened tasks for mdadm and cryptsetup, too - because I don't know where the problem really comes from.

I've tried hacking /scripts/local-top/cryptroot and added a call to "/sbin/mdadm -A -s" in setup_mapping after the "modprobe -q dm_crypt" call.

However, when dropped into busybox, only /dev/md2 was visible in /proc/mdstat.
Executing "mdadm -A -s" in busybox assembled all RAID arrays though.

I suspect this may have something to do with the changes in udev (e.g. using blkid instead of vol_id - the hanging blkid processes are suspect).

Revision history for this message
Daniel Hahler (blueyed) wrote :

This might be related to / fixed by:
http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commit;h=4ce48a971d322cc1f690dd036f906fc54261c657 (in util-linux-ng).

I'm in the process of rebuilding, will test it and report back.

Revision history for this message
Daniel Hahler (blueyed) wrote :

No luck, same result.
But there appear to be no more hanging blkid processes. I'm trying another patch now.

Revision history for this message
Daniel Hahler (blueyed) wrote :

The bug is in util-linux and is fixed for 2.15.1.
The following two patches fixed it for me:
  - http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=4ce48a971d322cc1f690dd036f906fc54261c657
  - http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=7103157c8bbf37a56c7c385198267120f72b7866

I've uploaded a util-linux package with those patches to my PPA, in case somebody else falls into the same trap.

Changed in cryptsetup (Ubuntu):
status: New → Invalid
Changed in mdadm (Ubuntu):
status: New → Invalid
Changed in udev (Ubuntu):
status: New → Invalid
Changed in util-linux (Ubuntu):
importance: Undecided → High
status: New → Triaged
summary: - Karmic: encrypted root device (RAID+cryptsetup+LVM) gets not found
+ Karmic: encrypted root device (RAID+cryptsetup+LVM) gets not found; might fail for RAID in general
Revision history for this message
Max Bowsher (maxb) wrote : Re: Karmic: encrypted root device (RAID+cryptsetup+LVM) gets not found; might fail for RAID in general

I confirm that this issue applies to software RAID in general, and that applying the two aforementioned upstream git changesets fixes it.

summary: - Karmic: encrypted root device (RAID+cryptsetup+LVM) gets not found; might fail for RAID in general
+ [karmic] software RAID not assembled at boot - blkid hangs using 100% CPU
Revision history for this message
Max Bowsher (maxb) wrote :

Oh, and don't forget to run update-initramfs after installing a fixed blkid! (As I did at first.)

Revision history for this message
Olav Kolbu (olav-kolbu) wrote :

This solved a similar problem for me. The kernel took 5 minutes to boot, and multiple long-running blkid processes using up all the CPU were present after that. No software RAID or crypto was involved, however. See bug #378930. Thanks!

OK

Revision history for this message
Yannis Tsop (ogiannhs) wrote :

I have the same problem. "mdadm -A -s" seems to work and really does find the md devices. But how do I boot after that?

Revision history for this message
Daniel Hahler (blueyed) wrote :

Yannis, try booting by exiting the busybox (ctrl-D). It did not work for me though (probably due to hanging processes).
Anyway, the best solution is to install the fixed packages (e.g. after booting from a Live CD and mounting your real filesystem, then chrooting into it etc).
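
Roughly, the Live CD route could look like the sketch below. Everything device-specific is an assumption: ROOT_DEV, BOOT_DEV and TARGET are placeholders, the arrays and encrypted volume are assumed to be already assembled and opened, and "apt-get install util-linux" only helps if a fixed package is actually available from the sources configured in the installed system. The run/DRY_RUN helper exists only so the steps can be previewed safely.

```shell
#!/bin/sh
# Sketch: repair the installed system from a Live CD by chrooting into it.
# ROOT_DEV, BOOT_DEV and TARGET are placeholders (assumptions).
# Set DRY_RUN=1 to print the commands instead of running them.

ROOT_DEV="${ROOT_DEV:-/dev/mapper/vg-root}"   # hypothetical root volume
BOOT_DEV="${BOOT_DEV:-}"                      # set if /boot is a separate partition
TARGET="${TARGET:-/mnt/target}"

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repair_in_chroot() {
    run mkdir -p "$TARGET" || return 1
    run mount "$ROOT_DEV" "$TARGET" || return 1
    if [ -n "$BOOT_DEV" ]; then
        run mount "$BOOT_DEV" "$TARGET/boot" || return 1
    fi
    # Make the live system's device, proc and sys trees visible in the chroot.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$TARGET/$fs" || return 1
    done
    run chroot "$TARGET" apt-get update || return 1
    run chroot "$TARGET" apt-get install util-linux || return 1
    # Rebuild the initramfs so it picks up the fixed blkid.
    run chroot "$TARGET" update-initramfs -u -k all
}
```

The final update-initramfs step matters because the initramfs carries its own copy of blkid; without regenerating it, the boot would still use the old binary.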

Revision history for this message
Max Bowsher (maxb) wrote :

I find this bug to be fixed in util-linux 2.15.1~rc1-1ubuntu1 - marking as such. If anyone has further problems, please reopen or file separate new bugs, as appropriate.

Changed in util-linux (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
darren (darrenm) wrote :

I did an upgrade from Jaunty to Karmic yesterday and it has stopped my software RAID from working and I can't boot. I have 2 md devices (/dev/md0 and /dev/md1). After about 30s it drops me to a busybox prompt. My /dev/sda2 and /dev/sdb2 partitions are there but when I try to re-assemble the array it says /dev/sda2 in use.

I'll try again and see what is using the device and then report back.

This bug seems remarkably like what I'm experiencing so from my point of view it doesn't seem to be fixed. I'll check the version of util-linux when I can get to the box also.

Revision history for this message
robegue (r087r70) wrote :

I'm also encountering this bug after upgrading to karmic (2.6.31-12).
I can still boot using an older kernel (2.6.27-14).

Revision history for this message
Max Bowsher (maxb) wrote :

robegue: Please do not follow up on a closed bug unless you are certain you are experiencing the same problem. In your case, the fact that a different kernel version avoids the problem shows that you are NOT experiencing _this_ bug. Please open a new bug.

Jim Persson (blejdfist)
Changed in util-linux (Ubuntu):
status: Fix Released → Invalid
status: Invalid → Fix Released
Revision history for this message
Jonah (jonah) wrote :

I have tried a fresh install and I get stuck at the initramfs prompt with this error message:

ALERT! /dev/mapper/nvidia_dcfadeef2 does not exist. Dropping to a shell!

Revision history for this message
Max Bowsher (maxb) wrote :

Jonah: Please do not follow up on a closed bug unless you are certain you are experiencing the same problem. Please open a new bug.
