md sw raid5 not detected on boot [jaunty regression]

Bug #376984 reported by Ben Bucksch
This bug affects 5 people
Affects: mdadm (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: mdadm

Reproduction:
1. # mdadm --create /dev/md0 -l 5 -n8 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
2. # cat /proc/mdstat
3. # reboot
4. # cat /proc/mdstat

Actual result:
In step 2:
All is fine, md0 is shown as active (and syncing, and after a day, synced)
md0 : active raid5 sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      3418705472 blocks level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
unused devices: <none>

In step 4:
It wrongly detects an "md_d0", which I never created, containing only sdd.
md_d0 : inactive sdd[3](S)
      488386496 blocks

I have no idea how it comes up with that md_d0, nor why it singles out sdd.

Expected result:
Step 4 matches step 2.

Version:
Works in 8.10 Intrepid. The problem appeared after a do-release-upgrade to 9.04 Jaunty.
An array created in either 8.10 or 9.04 is detected correctly when I boot 8.10, but when I boot 9.04, I get the above problem. (I reinstalled an old backup of 8.10 into a different partition to try this out.)

Cost:
5? hours

Detailed information:

Creation:
# mdadm --stop /dev/md_d0 (clean up problem, to be able to re-try)

# mdadm --create /dev/md0 -l 5 -n8 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
# echo "10000000" > /proc/sys/dev/raid/speed_limit_max

# pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created
# vgcreate array2 /dev/md0
  Volume group "array2" successfully created
# lvcreate array2 --name backupall --size 2T
  Logical volume "backupall" created
# lvdisplay
  --- Logical volume ---
  LV Name /dev/array2/backupall
  VG Name array2
  LV UUID oWSHAO-cdFK-d7nw-yRzf-OytB-j1Z7-MprUOk
  LV Write Access read/write
  LV Status available
  # open 0
  LV Size 2,00 TB
  Current LE 524288
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 252:0

# mkfs.xfs /dev/array2/backupall -L backup-all
# df -h
...
/dev/mapper/array2-backupall
                      2,0T 4,2M 2,0T 1% /backupall

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdh[8] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      3418705472 blocks level 5, 64k chunk, algorithm 2 [8/7] [UUUUUUU_]
      [============>........] recovery = 60.6% (296171520/488386496) finish=501.0min speed=6392K/sec

unused devices: <none>

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Thu May 14 06:06:31 2009
     Raid Level : raid5
     Array Size : 3418705472 (3260.33 GiB 3500.75 GB)
  Used Dev Size : 488386496 (465.76 GiB 500.11 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu May 14 06:21:40 2009
          State : clean, degraded, recovering
 Active Devices : 7
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 1
...

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      3418705472 blocks level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>

-------------------
After reboot:

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md_d0 : inactive sdd[3](S)
      488386496 blocks

# mdadm -E /dev/sda
/dev/sda:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 13e3f335:5cec114d:49feb062:8f17c717 (local to host volte)
  Creation Time : Thu May 14 06:06:31 2009
     Raid Level : raid5
  Used Dev Size : 488386496 (465.76 GiB 500.11 GB)
     Array Size : 3418705472 (3260.33 GiB 3500.75 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0

    Update Time : Fri May 15 15:20:47 2009
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : a44802d3 - correct
         Events : 22

         Layout : left-symmetric
     Chunk Size : 64K

      Number Major Minor RaidDevice State
this 0 8 0 0 active sync /dev/sda

   0 0 8 0 0 active sync /dev/sda
   1 1 8 16 1 active sync /dev/sdb
   2 2 8 32 2 active sync /dev/sdc
   3 3 8 48 3 active sync /dev/sdd
   4 4 8 64 4 active sync /dev/sde
   5 5 8 80 5 active sync /dev/sdf
   6 6 8 96 6 active sync /dev/sdg
   7 7 8 112 7 active sync /dev/sdh

# mdadm -E /dev/sdd
/dev/sdd:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 13e3f335:5cec114d:49feb062:8f17c717 (local to host volte)
  Creation Time : Thu May 14 06:06:31 2009
     Raid Level : raid5
  Used Dev Size : 488386496 (465.76 GiB 500.11 GB)
     Array Size : 3418705472 (3260.33 GiB 3500.75 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0

    Update Time : Fri May 15 15:20:47 2009
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : a4480309 - correct
         Events : 22

         Layout : left-symmetric
     Chunk Size : 64K

      Number Major Minor RaidDevice State
this 3 8 48 3 active sync /dev/sdd

   0 0 8 0 0 active sync /dev/sda
   1 1 8 16 1 active sync /dev/sdb
   2 2 8 32 2 active sync /dev/sdc
   3 3 8 48 3 active sync /dev/sdd
   4 4 8 64 4 active sync /dev/sde
   5 5 8 80 5 active sync /dev/sdf
   6 6 8 96 6 active sync /dev/sdg
   7 7 8 112 7 active sync /dev/sdh

# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
#DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# cfdisk /dev/sda (same for sda-sdh):

                         cfdisk (util-linux-ng 2.14.2)

                              Disk Drive: /dev/sdd
                       Size: 500107862016 bytes, 500.1 GB
             Heads: 255 Sectors per Track: 63 Cylinders: 60801

    Name Flags Part Type FS Type [Label] Size (MB)
 ------------------------------------------------------------------------------
                            Pri/Log Free Space 500105,25

# fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdh: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00034265

   Device Boot Start End Blocks Id System
/dev/sdi1 * 1 243 1951866 83 Linux
/dev/sdi2 244 1459 9767520 83 Linux
/dev/sdi3 1460 2675 9767520 83 Linux
/dev/sdi4 2676 38913 291081735 5 Extended
/dev/sdi5 2676 3161 3903763+ 83 Linux
/dev/sdi6 3162 38913 287177908+ 83 Linux

Tags: regression
Bryn Hughes (linux-nashira) wrote :

I'm getting this too!!

nealda (theword) wrote :

Here's the workaround I came up with.
I encountered this problem after a do-release-upgrade from Intrepid to Jaunty on my Mythbuntu box (RAID1 array).
<code>
no-array@mythbuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md_d0 : inactive sdc1[1](S)
      488383936 blocks

unused devices: <none>
</code>

First, stop the bogus array:
<code>
no-array@mythbuntu:~$ sudo mdadm --misc -S /dev/md_d0
mdadm: stopped /dev/md_d0
</code>

Reassemble the array properly:
<code>
no-array@mythbuntu:~$ sudo mdadm --assemble -v /dev/md0 /dev/sdc1 /dev/sdb1
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 0.
mdadm: added /dev/sdc1 to /dev/md0 as 1
mdadm: added /dev/sdb1 to /dev/md0 as 0
mdadm: /dev/md0 has been started with 2 drives.
</code>

Now you shouldn't be bummin':
<code>
no-array@mythbuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[0] sdc1[1]
      488383936 blocks [2/2] [UU]

unused devices: <none>
</code>

If you have an LVM volume on the array you can start it manually:
<code>
no-array@mythbuntu:~$ sudo vgchange -a y mythtv_vg
  1 logical volume(s) in volume group "mythtv_vg" now active
</code>

Then make sure your volumes are mounted as per fstab:
<code>
no-array@mythbuntu:~$ sudo mount -a
</code>

Now, if you reboot, you STILL won't have an array (that's the bug; I guess mdadm isn't scanning the superblocks properly).
To get your system to activate the array on bootup, issue this command AFTER you've activated the array manually:
<code>
no-array@mythbuntu:~$ sudo mdadm --detail --scan --verbose
mdadm: metadata format 00.90 unknown, ignored.
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=8363b3a4:7d3241a4:826d159e:91e99981
   devices=/dev/sdb1,/dev/sdc1
</code>

Copy the last two lines of your output (beginning with "ARRAY") and paste them into your /etc/mdadm/mdadm.conf file.
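
For example, a minimal sketch (it assumes the scan output shown above and that /etc/mdadm/mdadm.conf has no stale ARRAY lines already): append the scan output, then rebuild the initramfs so the copy of mdadm.conf used at early boot picks up the change.
<code>
# sketch: append the ARRAY line(s) mdadm reports, then refresh the initramfs copy
no-array@mythbuntu:~$ sudo sh -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'
no-array@mythbuntu:~$ sudo update-initramfs -u
</code>
On Ubuntu the initramfs normally carries its own copy of mdadm.conf, so the update-initramfs step makes sure the early boot environment sees the new ARRAY line.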

Now if you reboot, your array should be active.

Another possible source of the problem is the "00.90" metadata format. According to the post at http://ubuntuforums.org/showthread.php?p=6345457#post6345457, the format of the version string changed in the new release, so you might want to change "metadata=00.90" to "metadata=0.90" in your mdadm.conf file. After you do that, the unknown-format warning goes away in the mdadm detail scan (though the scan itself still reports "metadata=00.90").
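
If you'd rather make that edit from the command line, here's a minimal sketch (it assumes the string only occurs on the ARRAY line you just added):
<code>
# sketch: assumes "metadata=00.90" only appears on the ARRAY line added above
no-array@mythbuntu:~$ sudo sed -i 's/metadata=00\.90/metadata=0.90/' /etc/mdadm/mdadm.conf
</code>
If you rebuilt the initramfs earlier, re-run sudo update-initramfs -u afterwards so its copy of mdadm.conf stays in sync.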

ceg (ceg) wrote :

A quick workaround to apply before rebooting is in Bug #252345.
