dmraid broken for large drives

Bug #1054948 reported by JK
This bug affects 4 people
Affects: mdadm (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Attempting to set up a fresh Quantal system on a machine with Intel 'fakeraid' (imsm, 1.3) and a 3 TB SATA RAID1. When running the live CD, dmraid identifies the drives as being 746 GB in size (the same effect occurs with a Windows 7 setup if the corresponding Intel drivers are not loaded prior to partitioning).
dmraid -an does not release the mappings. The work-around for the installation is to boot with nodmraid, install mdadm and assemble the array manually prior to starting the installation. dmraid seems to be severely broken for a pretty much standard PC setup.
mdadm can handle Intel imsm firmware-backed RAID just fine.
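
For reference, a minimal sketch of that manual work-around from a live session, assuming the two RAID members show up as /dev/sda and /dev/sdb (the device names are placeholders):

    # boot with the "nodmraid" kernel option, then:
    apt-get install mdadm
    # either let mdadm find and start everything itself ...
    mdadm --assemble --scan
    # ... or assemble the imsm container explicitly and start the volumes inside it
    mdadm -A /dev/md/imsm0 /dev/sda /dev/sdb
    mdadm -I /dev/md/imsm0
    cat /proc/mdstat    # the RAID1 volume should now show its full size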

After installation, several measures still have to be taken when the system root is also part of that array. See #1054773 and references for major issues regarding this.

Since mdadm is actively supported by Intel and dmraid is at best in a maintenance state, deprecation of dmraid in favor of mdadm would be welcome.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

How does one migrate from dmraid to mdadm when using IMSM?

Revision history for this message
JK (j-c-k) wrote :

For new installations, mdadm should be the default. Migration is not necessary; dmraid could be post-installed. As dmraid isn't under development anymore, and neither large drives nor this kind of platform is going away, this will sooner rather than later become a widespread critical issue.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

The problem was that after upgrading mdadm, previously working systems with dmraid fail to boot because mdadm and dmraid fight over imsm RAID drives. There are still many formats supported by dmraid that are not supported by mdadm; somehow we need to make the two co-exist happily.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Hence the later upload disabling auto-assembly of imsm drives in mdadm.

Revision history for this message
JK (j-c-k) wrote :

Just for reference: by disabling auto-assembly, do you mean the changes in mdadm 3.2.5-1ubuntu0.2 (#1030292)?
Would one be required to set up custom udev rules, or which package version contains the rules for mdadm auto-assembly?

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Bug #1030292 removed the incremental, udev-rules-based auto-assembly of external-metadata RAID arrays (imsm and ddf).

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

within the mdadm package.

Revision history for this message
JK (j-c-k) wrote :

But wouldn't that simply imply that dmraid and mdadm are not supposed to co-exist? I'm a bit puzzled about this change. As far as I understand it, the auto-assembly rules have been removed just to guarantee that dmraid keeps working, whether temporarily or not.
Automated installation procedure aside, it is currently not possible to remove dmraid and install mdadm to get an imsm 1.3 (external metadata) array running out of the box, right? If dmraid is not present, cutting down mdadm does not seem necessary.

It seems this can also be user-fixed without jumping through many hoops by restoring previous rules, right?

From my perspective it then boils down to the problem that, even if auto-assembly were enabled, mdadm currently does not support the root filesystem being a member of the array itself. As I mentioned in #1028677, mdmon is missing in the initrd, which prohibits any assembly at that point. The effect is that mdadm sets up read-only devices instead of auto-read-only ones. With mdmon added, the system boots, but during shutdown the array seems to be taken down too early, so the system does not shut down gracefully. There have been attempts to fix this in Debian (#1054773). See also [1].

[1] http://www.spinics.net/lists/raid/msg37335.html
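
As a hedged illustration of the read-only vs. auto-read-only behaviour described above (the md device number is an example and may differ on a given system):

    cat /proc/mdstat                       # the volume shows up but stays read-only without mdmon
    cat /sys/block/md126/md/array_state    # "readonly" rather than the usual "clean"/"active"
    mdadm --readwrite /dev/md126           # once mdmon is running, switch the array back to read-write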

Revision history for this message
Dimitri John Ledkov (xnox) wrote : Re: [Bug 1054948] Re: dmraid broken for large drives

On 24 September 2012 11:09, Jan Kleinsorge <email address hidden> wrote:
> But wouldn't that simply imply that dmraid and mdadm are not supposed to co-exist? I'm a bit puzzled about this change. As far as I understand it, the auto-assembly rules have been removed just to guarantee that dmraid keeps working, whether temporarily or not.

They should. One may have both dmraid and mdadm arrays (none of which
are imsm), and possibly dmraid, mdadm and imsm arrays (managed by either
dmraid or mdadm). I do agree that we should now default to mdadm when
dealing with imsm RAID arrays, but we need to (i) not break existing
dmraid imsm arrays, (ii) optionally co-exist with dmraid-managed imsm
RAID arrays, and (iii) default to mdadm for imsm.

Server, net-install and alternative installer images have both dmraid
and mdadm. Desktop CDs only have dmraid. It was planned for this cycle
for Desktop CDs to also have mdadm, but this is now postponed to the
R series.

> Automated installation procedure aside, it is currently not possible to remove dmraid and install mdadm to get an imsm 1.3 (external metadata) array running out of the box, right? If dmraid is not present, cutting down mdadm does not seem necessary.
>

No, it is not currently possible, as the system will most likely fail
to boot out of the box due to the lack of initramfs integration for
imsm arrays.

> It seems this can also be user-fixed without jumping through many hoops
> by restoring previous rules, right?
>

(a) include mdmon in initramfs
(b) grab the upstream mdadm udev rules and drop them into
/etc/udev/rules.d/ with the same file name as the one in
/lib/udev/rules.d/
(c) remove dmraid

Run update-initramfs and attempt to reboot =)
Keep a dmraid-capable initramfs as a backup (a sketch of these steps is below).
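
A minimal sketch of steps (a)-(c), assuming the packaged rules file is named 64-md-raid.rules and the upstream rules have been combined into a local file mdadm.rules (both file names are assumptions):

    # (a) include mdmon in the initramfs, e.g. by adding to the mdadm initramfs hook:
    #       copy_exec /sbin/mdmon /sbin
    # (b) shadow the packaged udev rules with the upstream ones
    cp mdadm.rules /etc/udev/rules.d/64-md-raid.rules
    # (c) remove dmraid
    apt-get remove dmraid
    # rebuild the initramfs, keeping the dmraid-capable one as a backup
    cp /boot/initrd.img-$(uname -r) /boot/initrd.img-$(uname -r).bak
    update-initramfs -u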

> From my perspective it then boils down to the problem that, even if
> auto-assembly were enabled, mdadm currently does not support the root
> filesystem being a member of the array itself. As I mentioned in
> #1028677, mdmon is missing in the initrd, which prohibits any assembly
> at that point. The effect is that mdadm sets up read-only devices
> instead of auto-read-only ones. With mdmon added, the system boots,
> but during shutdown the array seems to be taken down too early, so the
> system does not shut down gracefully. There have been attempts to fix
> this in Debian (#1054773). See also [1].
>

Yes, yes, yes, yes, yes.

I am planning on reviewing the patches proposed in Debian for inclusion
in Ubuntu. In particular I'd like to include mdmon in the initramfs as
well as add the hooks to shut down more cleanly.

It would be nice to include mdmon in the initramfs conditionally, if
imsm/ddf arrays are detected.

The next step would be to see if imsm-capable mdadm can co-exist with
dmraid. If yes, all is great. If not, I would be considering
alternative routes, e.g. patching imsm support out of dmraid =/
but that looks ugly.

The other concern is that unfortunately I do not have imsm hardware
available, so I cannot test this myself =(
I'm busy adding LVM support to ubiquity right now, but I will
hopefully look into all of this towards the end of the week.

Regards,

Dmitrijs.

Revision history for this message
JK (j-c-k) wrote :

If co-existence of dmraid and mdadm is left as an incremental step for later versions, an exclusive-or setup would at least be a step forward for mdadm, and I would be happy to test that (imsm root + shutdown).

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

On 24 September 2012 15:46, Jan Kleinsorge <email address hidden> wrote:
> If co-existence of dmraid and mdadm is left as an incremental step
> for later versions, an exclusive-or setup would at least be a step
> forward for mdadm, and I would be happy to test that (imsm root + shutdown).
>

OK, I will ping you after I have something ready. Probably without
the installer though...

Revision history for this message
JK (j-c-k) wrote :

Great.

Revision history for this message
Andreas Allacher (andreas-allacher) wrote :

I am just wondering if mdadm will add support for Promise metadata, as it would be required by AMD RAID controllers?

Revision history for this message
Andreas Allacher (andreas-allacher) wrote :

Ah, forgot to mention something more related to this bug.
I am using a 990FX chipset and have connected two 3 TB drives to the RAID controller.
I am using Ubuntu 12.10, and if I now create a RAID on those 3 TB drives, it does not matter whether it only has a size of 2 TB or 1 TB: it isn't recognized by dmraid at all (I do not even get a mapper device, and if I use dmraid -ay -v it only says there are no RAID disks).

Any idea how I could fix this?

Revision history for this message
Brian Candler (b-candler) wrote :

In comment #9, xnox says:

"(b) grab upstream mdadm.udev rules and drop them into /etc/udev/rules.d/ with same file name as the one in /lib/udev/rules.d/"

Taking upstream as git://neil.brown.name/mdadm, it contains two udev rules files. Combining them in a way that matches the Ubuntu ones, I get the differences pasted below.

Some of this seems reasonable, but what about the change from $tempnode to $devnode? Would it be better just to change the ENV{ID_FS_TYPE} line and leave everything else alone?

A Google search suggests that only very recent versions of udev have $devnode:
http://bugs.funtoo.org/browse/FL-720

Regards,

Brian.

--- udev-orig 2013-10-08 14:42:32.504173249 +0100
+++ udev-combined 2013-10-08 14:54:23.958998034 +0100
@@ -3,14 +3,14 @@
 SUBSYSTEM!="block", GOTO="md_end"

 # handle potential components of arrays (the ones supported by md)
-ENV{ID_FS_TYPE}=="linux_raid_member", GOTO="md_inc"
+ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
 GOTO="md_inc_skip"

 LABEL="md_inc"

 # remember you can limit what gets auto/incrementally assembled by
 # mdadm.conf(5)'s 'AUTO' and selectively whitelist using 'ARRAY'
-ACTION=="add", RUN+="/sbin/mdadm --incremental $tempnode"
+ACTION=="add", RUN+="/sbin/mdadm --incremental $devnode --offroot"
 ACTION=="remove", ENV{ID_PATH}=="?*", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
 ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="/sbin/mdadm -If $name"

@@ -27,11 +27,11 @@
 # container devices have a metadata version of e.g. 'external:ddf' and
 # never leave state 'inactive'
 ATTR{md/metadata_version}=="external:[A-Za-z]*", ATTR{md/array_state}=="inactive", GOTO="md_ignore_state"
-TEST!="md/array_state", GOTO="md_end"
-ATTR{md/array_state}=="|clear|inactive", GOTO="md_end"
+TEST!="md/array_state", ENV{SYSTEMD_READY}="0", GOTO="md_end"
+ATTR{md/array_state}=="|clear|inactive", ENV{SYSTEMD_READY}="0", GOTO="md_end"
 LABEL="md_ignore_state"

-IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
+IMPORT{program}="/sbin/mdadm --detail --export $devnode"
 ENV{DEVTYPE}=="disk", ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}", OPTIONS+="string_escape=replace"
 ENV{DEVTYPE}=="disk", ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"
 ENV{DEVTYPE}=="disk", ENV{MD_DEVNAME}=="?*", SYMLINK+="md/$env{MD_DEVNAME}"
@@ -40,7 +40,7 @@
 ENV{DEVTYPE}=="partition", ENV{MD_DEVNAME}=="*[^0-9]", SYMLINK+="md/$env{MD_DEVNAME}%n"
 ENV{DEVTYPE}=="partition", ENV{MD_DEVNAME}=="*[0-9]", SYMLINK+="md/$env{MD_DEVNAME}p%n"

-IMPORT{program}="/sbin/blkid -o udev -p $tempnode"
+IMPORT{builtin}="blkid"
 OPTIONS+="link_priority=100"
 OPTIONS+="watch"
 ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
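
If one only wants the conservative change discussed above (widening the ENV{ID_FS_TYPE} match while leaving $tempnode untouched), a minimal sketch, assuming the packaged rules are /lib/udev/rules.d/64-md-raid.rules (the file name used later in this report):

    # copy the packaged rules so an override in /etc/udev/rules.d/ shadows them
    cp /lib/udev/rules.d/64-md-raid.rules /etc/udev/rules.d/64-md-raid.rules
    # widen only the component match to include external-metadata members
    sed -i 's/ENV{ID_FS_TYPE}=="linux_raid_member"/ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member"/' \
        /etc/udev/rules.d/64-md-raid.rules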

Revision history for this message
Brian Candler (b-candler) wrote :

As an experiment, I did the following:

* rebuild /sbin/mdadm and /sbin/mdmon from source, install in /sbin
* vi /usr/share/initramfs-tools/hooks/mdadm
    - add "copy_exec /sbin/mdmon /sbin"
* vi /lib/udev/rules.d/64-md-raid.rules
    - ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
    - ACTION=="add", RUN+="/sbin/mdadm --incremental $tempnode --offroot"
* /usr/share/mdadm/mkconf >/etc/mdadm/mdadm.conf
* apt-get remove dmraid; apt-get autoremove
* vi /etc/fstab, set root to be /dev/md/Volume0p1
* reboot and set root=/dev/md/Volume0p1 on the command line
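
A small addition to the list above: after those edits the initramfs still needs to be regenerated, and one can confirm that mdmon actually landed in the image (lsinitramfs ships with initramfs-tools; the kernel version is whichever is running):

    update-initramfs -u
    lsinitramfs /boot/initrd.img-$(uname -r) | grep -E 'mdmon|mdadm'   # both binaries should be listed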

It did actually come up, with /dev/md125p1 as the root filesystem; cat /proc/mdstat showed it resyncing.

However there were a number of problems:

(1) the following messages appear at boot time:

mdadm: CREATE user root not found
mdadm: CREATE group disk not found

(2) if I try to run update-grub:

# update-grub
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-52-generic
Found initrd image: /boot/initrd.img-3.2.0-52-generic
/usr/sbin/grub-probe: error: no such disk.
/usr/sbin/grub-probe: error: no such disk.
...
Found linux image: /boot/vmlinuz-3.2.0-29-generic
Found initrd image: /boot/initrd.img-3.2.0-29-generic
/usr/sbin/grub-probe: error: no such disk.
/usr/sbin/grub-probe: error: no such disk.
/usr/sbin/grub-probe: error: no such disk.
Found memtest86+ image: /boot/memtest86+.bin
done

(2a) A later reboot showed the kernel command line had root=/dev/md125p1, which works; however, I would have preferred the more persistent /dev/md/Volume0p1, since the md device numbers are prone to renumbering.

(3) /sbin/mdmon is still running, which suggests that the initramfs instance hasn't been replaced (--takeover)

(4) reboot hung at this point:

 * Stopping MD monitoring service mdadm --monitor [ OK ]
 * Asking all remaining processes to terminate [ OK ]
 * All processes ended within 5 seconds.... [ OK ]
<< wait >>
[ 1324.030873] INFO: task jbd2/md125p1-8:765 blocked for more than 120 seconds.
[ 1324.030957] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1324.031222] INFO: task dd:5714 blocked for more than 120 seconds.
[ 1324.031298] "echo 0 >/proc/sys/kernel/hung_task_timeout_secs" disables this message.
<< wait >>
[ 1444.030497] INFO: task jbd2/md125p1-8:765 blocked ... etc
[ 1444.030807] INFO: task flush-9:125:2405 blocked ... etc
[ 1444.031243] INFO: task dd:5714 blocked ... etc

and it needed a hard reset. But after the hard reset, it did come up OK, and md125 was still in sync.

(5) obviously, any local changes to /sbin/mdadm, /sbin/mdmon or config files will be lost if the packages are updated

So I'd say this approach is not something to recommend for production use yet, but if it makes it to 14.04 LTS that would be great.

I also tried reverting all the changes listed at the top of this message, to turn it back into dmraid. That worked, although on first boot I had to give the full root=/dev/mapper/isw_XXXXXXXXXX_Volume0p1 parameter on the kernel command line (to replace root=/dev/md125p1). After this, "update-grub" worked as usual, and subsequent reboots didn't hang.

Revision history for this message
Brian Candler (b-candler) wrote :

Aside: also just tried booting from 13.10beta2 ISO in rescue mode.

It is still not using mdadm. It sees /dev/mapper/isw_XXXXXXXXXX_Volume0 but not /dev/mapper/isw_XXXXXXXXXX_Volume0p1.

As a result it did not offer this as a root filesystem for the rescue shell, and I couldn't see how to rescan it for partitions (there is no 'rescan' entry under /sys/block/dm-0).
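
One possible workaround in that situation (untested here, and assuming the kpartx tool is available in, or installable into, the rescue environment) is to create the partition mappings by hand:

    kpartx -av /dev/mapper/isw_XXXXXXXXXX_Volume0   # should add ..._Volume0p1 etc. as device-mapper nodes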

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mdadm (Ubuntu):
status: New → Confirmed
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Re: Brian Candler

Thanks for that. Why rebuild /sbin/mdadm & /sbin/mdmon? Both are provided in the standard package.

Also, I'd expect you to revert the initramfs-tools scripts which generate, filter and install a different mdadm.conf file into the initramfs.

Revision history for this message
vvro (vvro) wrote :

Since I upgraded to 13.10, my RAID5 on the Intel software RAID doesn't mount with dmraid. I am using the older 3.8 kernel that I was using on 13.04.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1241086

It continues not to work for me, but at the initramfs prompt I'm able to manually assemble and mount the RAID with mdadm. dmraid does not work; it gives:
device-mapper: table252:0:raid45:unknown target type

Has there been any development in replacing dmraid with mdadm? I'll try migrating it later.

Revision history for this message
Brian Candler (b-candler) wrote :

> Why rebuild /sbin/mdadm & /sbin/mdmon? Both are provided in the standard package.

The versions in the old standard package are not able to handle 'BIOS RAID' (aka 'fakeraid') metadata, and therefore booting such volumes relies on the old and crufty dmraid tools.
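
A quick way to check what the installed mdadm can do with the firmware RAID (the device name is an example; --detail-platform and --examine are standard mdadm options):

    mdadm --version
    mdadm --detail-platform    # reports the platform's Intel imsm capabilities, if mdadm supports them
    mdadm --examine /dev/sda   # should identify imsm (Intel) metadata on a member disk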
