Nested RAID levels aren't started after reboot

Bug #1171945 reported by RoyK on 2013-04-23
This bug affects 28 people
Affects: mdadm (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

When creating a RAID5+0 or similar, the lower RAID-5s are started, but not the RAID-0 on top of them. I've tested this with Lucid (works), Precise (does not work) and Raring (does not work). A subsequent 'mdadm --assemble --scan' finds the new RAID-0 and allows it to be mounted. On Lucid, however, this is found automatically. Something funky with udev?
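For anyone trying to reproduce this, the setup described above can be sketched as below. This is a dry-run sketch under assumptions: the device names /dev/sd[b-g] and the md numbers are illustrative, not taken from the report, and the helper only echoes each command so the sequence can be reviewed before touching real disks.

```shell
# Dry-run sketch of a RAID-5+0 setup like the one described above.
# ASSUMPTION: /dev/sd[b-g] and the md numbers are placeholders.
run() { echo "+ $*"; }  # swap 'echo' for the real command on a test box

# Two 3-disk RAID-5s...
run mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
run mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg
# ...and a RAID-0 striped across them (the array that fails to start on boot):
run mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md0 /dev/md1
# Record the arrays so 'mdadm --assemble --scan' can find them later:
run sh -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'
```

After a reboot of such a setup, md0 and md1 come up but md10 does not; running 'mdadm --assemble --scan' by hand then starts it, matching the behaviour reported.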

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.5-1ubuntu0.2
ProcVersionSignature: Ubuntu 3.2.0-40.64-generic-pae 3.2.40
Uname: Linux 3.2.0-40-generic-pae i686
ApportVersion: 2.0.1-0ubuntu17.2
Architecture: i386
CurrentDmesg: [ 17.144075] eth0: no IPv6 routers present
Date: Tue Apr 23 19:28:48 2013
InstallationMedia: Ubuntu-Server 12.04.1 LTS "Precise Pangolin" - Release i386 (20120817.3)
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Bochs Bochs
MarkForUpload: True
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm-color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-40-generic-pae root=/dev/mapper/hostname-root ro
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/01/2007
dmi.bios.vendor: Bochs
dmi.bios.version: Bochs
dmi.chassis.type: 1
dmi.chassis.vendor: Bochs
dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2007:svnBochs:pnBochs:pvr:cvnBochs:ct1:cvr:
dmi.product.name: Bochs
dmi.sys.vendor: Bochs
etc.blkid.tab: Error: [Errno 2] No such file or directory: '/etc/blkid.tab'

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mdadm (Ubuntu):
status: New → Confirmed

Also affecting my 3x2 8.0 GB RAID5+0 USB array in Ubuntu Server 12.10 64-bit.

variona (variona) wrote :

It crashed my RAID-1 on two SATA disks with an ext4 filesystem, including filesystem damage.
It hit me after the kernel update to Ubuntu 3.2.0-40-generic-pae (32-bit)
(with 3.2.0.23 it worked fine).

variona (variona) wrote :

All I've found so far points to a problem with AMD64 CPUs:
http://ubuntuforums.org/showthread.php?t=2140322
http://www.ubuntu-forum.de/artikel/61514/kernel-upgrade-3-2-0-39-generic-system-bootet-nicht-mehr.html

Although I use the generic 32-bit kernel, I also have an AMD64 processor!

RoyK (roysk) wrote :

I seriously doubt this is related to CPU architecture. It works on Lucid AMD64, but it's broken on Precise and later. I guess this is part of the mdadm assembly, and AFAIK this is done in the startup scripts, but then, I have no idea where…

John Weirich (weirich-j) wrote :

This also toasted my newly created raid-1 array the other day.

Details:
Ubuntu 12.04 64 bit
intel core i3 CPU

Two 2TB WDEARX20 green drives (SATA)

Upon updating to the 3.2.0-40-generic kernel, boot results in a blank purple screen and the md0 array is degraded; it ejected one of the drives.

If I choose 3.2.0-39-generic from grub, boot happens fine and my array is intact.

RoyK (roysk) wrote :

anyone working on this one?

Seth Arnold (seth-arnold) wrote :

John, I think you may have a different problem that probably deserves a new bug report -- RoyK's problem is specifically with nested md devices.

RoyK (roysk) wrote :

Just tried to reproduce the bug on Debian Wheezy, and couldn't: Wheezy assembles the nested RAID without issues. I talked to xnox on #ubuntu-bugs, and he told me to look into mdadm's udev rules. I did, compared ours with Wheezy's, and the difference is pasted below. If I change only the first differing line (both in /lib/udev/rules.d and in the initrd) to Wheezy's version, it works as before (OK with normal RAIDs, not nested ones); if I replace the whole file with Wheezy's, or just add the GOTO below, it fails to assemble any RAIDs at all.

Anyone that can help out here? I'm lost after a lot of digging…

thanks

roy

--- /lib/udev/rules.d/64-md-raid.rules 2013-05-30 14:28:58.966754000 +0200
+++ 64-md-raid.rules-debian 2013-05-30 14:13:41.850203999 +0200
@@ -3,11 +3,15 @@
 SUBSYSTEM!="block", GOTO="md_end"

 # handle potential components of arrays (the ones supported by md)
-ENV{ID_FS_TYPE}=="linux_raid_member", GOTO="md_inc"
+ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
 GOTO="md_inc_skip"

 LABEL="md_inc"

+## DISABLED: Incremental udev assembly disabled
+## ** this is a Debian-specific change **
+GOTO="md_inc_skip"
+
 # remember you can limit what gets auto/incrementally assembled by
 # mdadm.conf(5)'s 'AUTO' and selectively whitelist using 'ARRAY'
 ACTION=="add", RUN+="/sbin/mdadm --incremental $tempnode"
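Related to the AUTO/ARRAY comment in the rules above: explicitly listing every array, including the nested one, in mdadm.conf is what lets a later 'mdadm --assemble --scan' pick them all up in one pass. A sketch, with placeholder UUIDs and example device names:

```
# /etc/mdadm/mdadm.conf sketch -- UUIDs and md numbers are placeholders.
ARRAY /dev/md0  metadata=1.2 UUID=<uuid-of-first-raid5>
ARRAY /dev/md1  metadata=1.2 UUID=<uuid-of-second-raid5>
ARRAY /dev/md10 metadata=1.2 UUID=<uuid-of-nested-raid0>
```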

RoyK (roysk) wrote :

Can someone please explain where Precise and later versions assemble the RAIDs? This test raid is not part of the root.

RoyK (roysk) wrote :

Hello? Are bugs like this one ignored by Canonical etc?

I'm able to get the nested RAID to work by booting into recovery mode, using the recovery menu to drop to a shell and running 'mdadm -As' ('mdadm --assemble --scan') then exiting the shell and selecting to continue a normal boot. There does not seem to be a way to boot non-interactively and have the top level RAID0 assemble correctly (at least that I've found).
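One way to script that same recovery-shell step (untested here, and only viable when the nested array isn't needed for the root filesystem) is a late-boot hook, e.g. called from /etc/rc.local. A sketch; the function wrapper and the MDADM variable are only there so the call can be stubbed and inspected in a dry run:

```shell
# Late-boot hook sketch: retry assembly of any arrays udev missed.
# MDADM is overridable so the call can be stubbed for a dry run.
MDADM="${MDADM:-/sbin/mdadm}"

assemble_leftover_arrays() {
  # 'mdadm -As' is short for 'mdadm --assemble --scan': it assembles every
  # array listed in mdadm.conf that isn't already running.
  "$MDADM" --assemble --scan || true
}

# On a real system (as root): assemble_leftover_arrays && mount -a
```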

RoyK (roysk) wrote :

What I'm seeing is that the top level assembles correctly, but the lower one(s) do not.

RoyK, how does that work? How does the top level raid assemble when the lower level devices aren't present?

RoyK (roysk) wrote :

Perhaps we're disagreeing on top and bottom here. With a RAID 0+1, where the RAID-0s are built first from the raw devices and a RAID-1 is set across them, I call the latter the lower level.

RoyK (roysk) wrote :

Confirmed on current 14.04 as well. Aren't nested RAIDs meant to be supported on Ubuntu?

RoyK (roysk) wrote :

Still just as bad as earlier. Do any developers even read this?

François Guerraz (fguerraz) wrote :

Please fix this bug, we have to do horrible things to work around this problem!

RoyK (roysk) wrote :

The solution to this bug is to run Debian :P

tarantoga (tarantoga-2) wrote :

I am also affected by this regression.
It looks like Wheezy disables udev for md and uses an rc-script for assembling at boot; therefore it will not automatically assemble arrays after boot.
Trusty assembles real devices during and after boot, but seems to ignore virtual devices like /dev/md0.

So I added a new rule for virtual devices to the existing udev rules in Trusty. I created /etc/udev/rules.d/85-mdadm.rules with this single line rule:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid_member", DEVPATH=="*/virtual/*", RUN+="/sbin/mdadm --incremental $tempnode"

This rule (plus the existing ones) assembles my nested RAID during boot or any time later, but might not work for the root filesystem unless an initramfs-tools/hooks script is created (not tested).
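The rule above can be dropped in with a couple of commands. A sketch: the install function takes the rules directory as a parameter (normally /etc/udev/rules.d) so it can be pointed somewhere harmless for a dry run; the rule text itself is verbatim from the comment above.

```shell
# Install the workaround udev rule into the given rules directory.
install_nested_raid_rule() {
  cat > "$1/85-mdadm.rules" <<'EOF'
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid_member", DEVPATH=="*/virtual/*", RUN+="/sbin/mdadm --incremental $tempnode"
EOF
}

# On a real system (as root):
#   install_nested_raid_rule /etc/udev/rules.d
#   udevadm control --reload-rules && udevadm trigger --subsystem-match=block
```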

RoyK (roysk) wrote :

Thank you for this.

Tested in a 14.04 VM: created two 3-drive RAID-5s (md0 and md1) and a RAID-0 on top (md10), added them to mdadm.conf, added the udev rule, ran update-initramfs -u, put a VG+LV on md10, put a filesystem on the LV, filled it with some bogus data, rebooted; works. I unpacked the initrd and can confirm that the rules file isn't in it. I do get a message during bootup: "The disk drive for /raidtest is not ready yet or not present. keys:Continue to wait…" (see screenshot). However, booting continues immediately, so it shouldn't be a problem.

Please include this in the next update. I guess this fix should work for 12.04 too, although I have not tested it yet.

roy

tarantoga (tarantoga-2) wrote :

I do not see a message like "The disk drive for /... is not ready yet or not present" on my system. I'm using a "real" installation (no VM) of 14.04.1 on a spinning disk and I do not use lvm. I'm guessing that this message appears because lvm has to wait for mdadm to assemble the raid.

RoyK (roysk) wrote :

Probably just a glitch in the matrix :P

Anyway - I think this should go in, even without it being in initrd. I would guess very few use nested raids for their root…

Khee Hong Loke (khloke) wrote :

I have a hybrid nested RAID5 (3x 3TB + 2x 1.5TB (RAID0)) and this bug affects me. The solution posted by tarantoga works for me. Thanks!

Luis Alvarado (luisalvarado) wrote :

This is affecting me on 14.04 64-bit. I thought I was crazy. The workaround works but I was hoping to have a steady system.

Sasa Paporovic (melchiaros) wrote :

The trusty tag was added due to user comments.

tags: added: trusty
Greg Eginton (gregedgi) wrote :

I'm running Lubuntu 14.04 and can confirm the same issue.
Thank you for this workaround; it worked a treat. I've been searching for a solution to this issue for weeks.
No news of a proper fix for mdadm?

RoyK (roysk) wrote :

I guess the fix suggested is the proper fix. No idea why it's being ignored like this. I reported this more than two years ago, it affects an LTS, and the fix is a single udev line.

Luis Alvarado (luisalvarado) wrote :

I have not been able to test this on 16.04, but is this still happening on 16.04 and above?

Give me some time - I have a test VM somewhere to test this… It's just a wee bit late now (CEST)

Vennlig hilsen / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
Da mihi sis bubulae frustrum assae, solana tuberosa in modo Gallico fricta, ac quassum lactatum coagulatum crassum. Quod me nutrit me destruit.

RoyK (roysk) wrote :

Seems to work well on 16.04, so the bug isn't apparent there. It probably still is on older versions, so as long as those older releases are supported, the bug should not be closed. I guess this bug's root is somewhere in the upstart parts.


