md: detects stale members ahead of in-sync members

Bug #1568426 reported by Peter Cordes
This bug affects 1 person
Affects          Status   Importance   Assigned to   Milestone
grub2 (Ubuntu)   New      Undecided    Unassigned

Bug Description

My system boots from XFS on RAID10 on GPT partitions (no LVM). The RAID10 uses the "far2" layout, and has three component devices. I use grub-pc for non-EFI booting, because this system is old and doesn't support EFI (Intel DG965WH from 2008).

I added a fourth hard drive and shuffled my data around so I could re-partition the existing drives. (http://unix.stackexchange.com/questions/74924/how-to-safely-replace-a-not-yet-failed-disk-in-a-linux-raid5-array)

A few weeks after the final `mdadm /dev/md0 --replace /dev/sda1 --with /dev/sdd1`, grub failed to boot. Error messages included "invalid arch-independent ELF magic", and `insmod linux` gave "not a regular file". Booting an Ubuntu live USB showed no problem with the FS, and none of `dpkg-reconfigure grub-pc`, `grub-install /dev/sda`, or `update-grub` helped. Before those attempts to fix it, grub was loading a messed-up menu but not quite booting Linux. After re-running grub-install, it stopped at the `grub rescue>` prompt.

sda is the first BIOS disk, but even having my BIOS boot a different disk didn't help. Presumably that doesn't affect the order GRUB detects them in.

I eventually solved the problem by swapping the SATA cables so that the drive without a current member of the boot array (it held only the stale one) was no longer the first BIOS drive. Now everything works perfectly.

I think GRUB's md code is including the first N members it sees, whether they're stale or not. Linux's MD code finds all candidates, and then picks N in-sync ones if available.
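The difference between the two strategies can be sketched as follows. This only illustrates the hypothesis above. It is not GRUB's or the kernel's actual code, and the real kernel logic is more involved. The `Member` type, both function names, and the member labels are hypothetical; the event counts (2708 for the stale member, 2740 for the in-sync ones) are taken from the mdadm output in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str    # label for the component device (hypothetical)
    role: int    # slot in the array ("Device Role" in mdadm --examine)
    events: int  # superblock event counter ("Events" in mdadm --examine)


def assemble_first_seen(scan_order, raid_devices):
    """Hypothesized GRUB behaviour: the first member seen for each role wins."""
    slots = {}
    for m in scan_order:
        if m.role < raid_devices and m.role not in slots:
            slots[m.role] = m
    return [slots.get(r) for r in range(raid_devices)]


def assemble_freshest(scan_order, raid_devices):
    """Look at every candidate first, then keep only members whose
    event counter matches the newest one found."""
    newest = max(m.events for m in scan_order)
    slots = {}
    for m in scan_order:
        if m.role < raid_devices and m.events == newest and m.role not in slots:
            slots[m.role] = m
    return [slots.get(r) for r in range(raid_devices)]


# The stale member sits on the first disk scanned, as it did before the
# cable swap, and still claims role 2.
scan = [
    Member("stale", role=2, events=2708),
    Member("in-sync-a", role=0, events=2740),
    Member("in-sync-b", role=1, events=2740),
    Member("in-sync-c", role=2, events=2740),
]

print([m.name for m in assemble_first_seen(scan, 3)])
print([m.name for m in assemble_freshest(scan, 3)])
```

With this scan order, the first-seen strategy puts the stale member in slot 2, while the freshest-first strategy picks the three in-sync members. Moving the stale member to the end of `scan` makes both strategies agree, which matches what the cable swap achieved.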

This was really hard to diagnose, because disk churn hadn't got the data so far out of sync that there were XFS errors. Directory listings of /boot/grub/i386-pc worked from the grub rescue shell, but the actual data in some of the files didn't match. (Even some of the inode contents were different, hence the "not a regular file".)

I think wiping the RAID signature would have solved the problem as well. (mdadm --zero-superblock /dev/sda2, after making sure that was actually the stale device in the live-USB environment)
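Before zeroing a superblock, it helps to confirm which component is actually the stale one. A minimal sketch, assuming the text of `mdadm --examine` has already been captured for each candidate device and that all candidates belong to the same array: it compares the `Events` counters and reports the devices that are behind. The function names are hypothetical, and the sample text is trimmed from the output in this report.

```python
def parse_examine(text):
    """Extract 'Key : value' fields from `mdadm --examine` text output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def stale_members(examined):
    """Return the device names whose event counter is behind the newest one.

    `examined` maps a device name to its captured --examine text.
    """
    events = {dev: int(parse_examine(text)["Events"])
              for dev, text in examined.items()}
    newest = max(events.values())
    return sorted(dev for dev, count in events.items() if count < newest)


examined = {
    "/dev/sdd2": """
     Array UUID : e0ad8202:4c270099:9f28ddd6:b597231d
    Update Time : Wed Mar 16 02:49:17 2016
         Events : 2708
""",
    "/dev/sda2": """
     Array UUID : e0ad8202:4c270099:9f28ddd6:b597231d
    Update Time : Sat Apr 9 16:48:18 2016
         Events : 2740
""",
}

print(stale_members(examined))
```

Only a device reported here, and cross-checked against `mdadm --detail` for the running array, would be a candidate for `--zero-superblock`.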

Here's `mdadm -E` from the stale component (which was sda2 before swapping cables; now it's sdd2).
This is what a component looks like after `--replace` and `--remove` are done with it. It is followed by the same output for an in-sync component, and then by `mdadm --detail /dev/md/root`.

peter@tesla:~$ sudo mdadm --examine /dev/sdd2
/dev/sdd2: ####### The stale component
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0ad8202:4c270099:9f28ddd6:b597231d
           Name : tesla:root (local to host tesla)
  Creation Time : Thu Apr 16 14:26:50 2015 ### note that's 2015, last year.
     Raid Level : raid10
   Raid Devices : 3

 Avail Dev Size : 30703616 (14.64 GiB 15.72 GB)
     Array Size : 23027712 (21.96 GiB 23.58 GB)
    Data Offset : 16384 sectors
   Super Offset : 8 sectors
   Unused Space : before=16296 sectors, after=0 sectors
          State : clean
    Device UUID : 8ae879d7:b5c6b0ad:f2d6c787:49284d4b

    Update Time : Wed Mar 16 02:49:17 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1c62e134 - correct
         Events : 2708

         Layout : far=2
     Chunk Size : 1024K

   Device Role : Active device 2
   Array State : AAR ('A' == active, '.' == missing, 'R' == replacing)

/dev/sda2: ##### An in-sync component
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0ad8202:4c270099:9f28ddd6:b597231d
           Name : tesla:root (local to host tesla)
  Creation Time : Thu Apr 16 14:26:50 2015
     Raid Level : raid10
   Raid Devices : 3

 Avail Dev Size : 30703616 (14.64 GiB 15.72 GB)
     Array Size : 23027712 (21.96 GiB 23.58 GB)
    Data Offset : 16384 sectors
   Super Offset : 8 sectors
   Unused Space : before=16296 sectors, after=0 sectors
          State : clean
    Device UUID : 5d6bb778:1700264b:bd7aadba:11336f0b

    Update Time : Sat Apr 9 16:48:18 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4e39b4c0 - correct
         Events : 2740

         Layout : far=2
     Chunk Size : 1024K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

peter@tesla:~$ sudo mdadm --detail /dev/md/root
/dev/md/root:
        Version : 1.2
  Creation Time : Thu Apr 16 14:26:50 2015
     Raid Level : raid10
     Array Size : 23027712 (21.96 GiB 23.58 GB)
  Used Dev Size : 15351808 (14.64 GiB 15.72 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Sat Apr 9 21:19:32 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : far=2
     Chunk Size : 1024K

           Name : tesla:root (local to host tesla)
           UUID : e0ad8202:4c270099:9f28ddd6:b597231d
         Events : 2740

    Number   Major   Minor   RaidDevice   State
       3       8      18         0        active sync   /dev/sdb2
       4       8       2         1        active sync   /dev/sda2
       6       8      34         2        active sync   /dev/sdc2

ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: grub-pc 2.02~beta2-29ubuntu0.3
ProcVersionSignature: Ubuntu 4.2.0-35.40-generic 4.2.8-ckt5
Uname: Linux 4.2.0-35-generic x86_64
ApportVersion: 2.19.1-0ubuntu5
Architecture: amd64
CurrentDesktop: KDE
Date: Sat Apr 9 20:53:19 2016
SourcePackage: grub2
UpgradeStatus: Upgraded to wily on 2015-11-12 (149 days ago)

Peter Cordes (peter-cordes) wrote :
description: updated