IMSM fakeraid handled by mdadm: unclean mounted volumes on shutdown/reboot

Bug #1608495 reported by Stefan Bader
This bug affects 4 people
Affects          Status         Importance  Assigned to           Milestone
mdadm (Ubuntu)   Fix Committed  Critical    Dimitri John Ledkov
Xenial           Fix Committed  Critical    Dimitri John Ledkov
Yakkety          Won't Fix      Critical    Dimitri John Ledkov
Zesty            Won't Fix      Critical    Dimitri John Ledkov

Bug Description

Opening this report for Xenial and later, as this problem surfaces again due to the move to systemd.

Background:

mdadm is used to create md raid volumes based on Intel Matrix Storage Manager (IMSM) fakeraid metadata. The setup usually consists of a container set that holds one or more raid volumes, which is the reason those fakeraid volumes are more affected by timing issues on shutdown/reboot.

In my specific setup I am using one of the IMSM raid volumes as an LVM PV, and one LV of that is mounted as /home. The problem is that unmounting /home on shutdown/reboot updates the filesystem superblock, which causes the raid state to become dirty for a small period of time. For that reason, with sysvinit scripts there is a mdadm-waitidle script which *must* run after the umountroot (or, for /home, at least after the umountfs) script.

With Xenial both umountroot and umountfs are symlinks to /dev/null in /lib/systemd/system (i.e. masked), so I am not sure they can still be used to delay mdadm-waitidle until *after* all filesystems are unmounted.
In practice I see that if /home is still mounted on shutdown/reboot, the raid set goes into a full resync on the next boot (an additional pain, but a different problem: the resync appears to be much more aggressive than in the past, delaying boot a lot and rendering the system barely usable until it finishes). If I manually unmount /home before the reboot, the raid set stays clean.
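
For reference, a quick way to check whether the array came through a reboot cleanly (a sketch; md126 is the device name in this setup and may differ elsewhere):

# Overall md status; an ongoing resync shows up here as a progress bar
cat /proc/mdstat

# Kernel view of the array state ("clean" after a reboot is the goal)
cat /sys/block/md126/md/array_state
sudo mdadm --detail /dev/md126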

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

annoying.

/lib/systemd/system-shutdown/mdadm.shutdown should be syncing everything up, but of course that's only in unstable / yakkety.
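
For reference, a minimal sketch of what such a system-shutdown hook could look like, assuming it only needs to wait for the arrays to be marked clean (an assumption, not necessarily the exact contents of the yakkety script):

#!/bin/sh
# /lib/systemd/system-shutdown/mdadm.shutdown (sketch)
# systemd runs executables from this directory very late in shutdown;
# waiting here gives the IMSM container time to mark itself clean.
exec /sbin/mdadm --wait-clean --scan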

I have spare drives finally again, so I will set up IMSM on my xenial desktop and play around with it. Hopefully the package from yakkety is "good enough" for SRU into xenial.

Changed in mdadm (Ubuntu):
assignee: nobody → Dimitri John Ledkov (xnox)
Revision history for this message
Stefan Bader (smb) wrote :

I can do some testing as well. Admittedly any unsuccessful run is a bit of a pain as the re-sync takes 1-2hrs.

Revision history for this message
Stefan Bader (smb) wrote :

I did a quick test and created an mdadm.shutdown in /lib/systemd/system-shutdown like the template in Yakkety (adapted the path to the binary and made it executable). Unfortunately that did not help. However, there might be some step missing to activate it in systemd. I guess I need to make the output more verbose and add some delay to be sure that this indeed is executed, and when.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mdadm (Ubuntu):
status: New → Confirmed
Revision history for this message
Sergio Callegari (callegar) wrote :

See also 1320402 and 1587142

Changed in mdadm (Ubuntu):
importance: Undecided → Critical
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

I have a reproducer test case, and a partial fix for xenial I think.
I will start a bileto PPA with test packages for this issue.

Changed in mdadm (Ubuntu Xenial):
status: New → Confirmed
Changed in mdadm (Ubuntu Yakkety):
status: New → Confirmed
Changed in mdadm (Ubuntu Xenial):
importance: Undecided → Critical
Changed in mdadm (Ubuntu Yakkety):
importance: Undecided → Critical
Changed in mdadm (Ubuntu Xenial):
milestone: none → xenial-updates
Changed in mdadm (Ubuntu Yakkety):
milestone: none → yakkety-updates
Changed in mdadm (Ubuntu Zesty):
milestone: none → ubuntu-17.03
Changed in mdadm (Ubuntu Yakkety):
assignee: nobody → Dimitri John Ledkov (xnox)
Changed in mdadm (Ubuntu Xenial):
assignee: nobody → Dimitri John Ledkov (xnox)
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

As part of https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1722491 there are currently mdadm updates available from xenial-proposed and zesty-proposed that might resolve this issue.

To test that solution please perform the following:

1) Install mdadm from xenial-proposed/zesty-proposed
   - See https://wiki.ubuntu.com/Testing/EnableProposed
   - Or download & install the packages from:
     xenial: https://launchpad.net/ubuntu/+source/mdadm/3.4-4ubuntu0.1/+build/13596415
     zesty: https://launchpad.net/ubuntu/+source/mdadm/3.3-2ubuntu7.5/+build/13596431

2) $ sudo apt install dracut-core

3) $ sudo systemctl enable mdadm-shutdown.service

4) $ sudo systemctl start mdadm-shutdown.service

After this the expectation is for shutdown/reboot to perform a clean shutdown, maintaining the raid array in a synced state, such that it comes up clean.
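
A quick sanity check before and after testing a reboot (a sketch; replace md126 with your array name):

# The unit should be enabled and have run once without errors
systemctl status mdadm-shutdown.service

# After the next reboot, confirm that no resync was triggered
cat /proc/mdstat
sudo mdadm --detail /dev/md126 | grep -i state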

Please let me know if above resolves shutdown/reboot issues for you.

Regards,

Dimitri.

Changed in mdadm (Ubuntu):
status: Confirmed → Fix Committed
Changed in mdadm (Ubuntu Xenial):
status: Confirmed → Fix Committed
Changed in mdadm (Ubuntu Yakkety):
status: Confirmed → Won't Fix
Changed in mdadm (Ubuntu Zesty):
status: Confirmed → Fix Committed
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Just a quick clarification, first boot/shutdown will still not be clean, but subsequent ones (those that are booted with the updated mdadm package) should be clean.

Revision history for this message
Stefan Bader (smb) wrote :

Followed the instructions from comment #7:
Base installation: Xenial/16.04
ii mdadm 3.4-4ubuntu0.1
ii dracut-core 044+3-3

Using an IMSM-based mdadm array (via LVM) mounted on /home:
/dev/mapper/datavg01-home 197G 121G 67G 65% /home

PV VG Fmt Attr PSize PFree
/dev/md126p6 datavg01 lvm2 a-- 831,50g 353,85g

/dev/md126p6:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid5
     Array Size : 871895713 (831.50 GiB 892.82 GB)
  Used Dev Size : unknown
   Raid Devices : 3
  Total Devices : 3

          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-asymmetric
     Chunk Size : 64K

           UUID : cc707f7d:77869bd6:de8d52a2:ca21e329
    Number Major Minor RaidDevice State
       2 8 16 0 active sync /dev/sdb
       1 8 32 1 active sync /dev/sdc
       0 8 48 2 active sync /dev/sdd

I have now done 2 reboots, one from a text console directly after installing the new packages and one from the GUI. In both cases the array came up in sync.

Revision history for this message
John Center (john-center) wrote :

I saw this fix & installed it before your update went out. I did it by hand, so I didn't know about the dracut-core package or enabling/starting the mdadm-shutdown.service. Even without dracut-core, it worked for me.

I did make some changes to 64-md-raid-assembly.rules. I commented out the ENV{ANACONDA}=="?*", GOTO="md_inc_end" line, because I didn't think this applied to Ubuntu. I also made the following change:

ACTION=="add|change", IMPORT{program}="/sbin/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}"

I saw this was updated on the linux-raid list:
[PATCH 2/4] Use correct syntax for passing DEVLINKS to mdadm from udev
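
If you edit the rules file in place, the change can be picked up without a reboot (standard udev handling, shown here as a sketch):

# Reload the udev rules and re-trigger block device events so the
# modified 64-md-raid-assembly.rules takes effect
sudo udevadm control --reload-rules
sudo udevadm trigger --subsystem-match=block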

I also did 2 reboots & everything seems to be working correctly. One question: Does this mean that Ubuntu 18.04 will use mdadm by default during installation instead of dmraid?

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

@john-center

I believe mdadm is already used by default during installation. However, when reinstalling my desktop it was not smooth at all. It seems like the underlying devices got probed / mounted by something before mdadm-udeb was available in the d-i, and thus I had to manually unmount things and assemble the array from the d-i shell before continuing the installation. Apart from that, fstab / bootloaders / initramfs / etc. were all set up correctly.

W.r.t. changing / updating the udev rules: things are odd on xenial vs. later releases, due to not using the mdadm units correctly. =/ Thus using --offroot is currently broken on xenial where the rootfs is on the Intel Matrix raid, but I can double-check whether anything else can be improved.

Revision history for this message
John Center (john-center) wrote :

I don't think it's used by default for IMSM raid installation. That's why your devices were likely set up by dmraid first. When you mentioned installing dracut-core in a previous comment: that automatically pulls in dmraid along with kpartx, and mdadm never gets involved. When I've manually installed mdadm for IMSM raid, I first remove dmraid. That's the only way I was able to reliably install Ubuntu 16.04 on top of IMSM raid1 via mdadm.
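
Roughly, the manual swap described above looks like this (a sketch only, and risky on a system that already boots from the dmraid-assembled device; exact package handling is assumed from this comment):

# Remove dmraid so it stops claiming the IMSM metadata, then let mdadm
# assemble the container and rebuild the initramfs
sudo apt purge dmraid
sudo apt install mdadm
sudo mdadm --assemble --scan
sudo update-initramfs -u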

Steve Langasek (vorlon)
Changed in mdadm (Ubuntu Zesty):
status: Fix Committed → Won't Fix
Revision history for this message
Joshua Diamant (joshdi) wrote :

I am having this issue using IMSM / VROC 6.2 on Ubuntu 18.04 LTS, Kernel 5.3.0-28, mdadm - v4.1-rc1 - 2018-03-22.

I am running a bcache cache device on one of the RAID 1 IMSM (VROC) arrays. At a minimum, the device needs to resync on every reboot (it is not shutting down clean).

What other information is required to help us debug this issue?

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

To correctly shut down IMSM/VROC Intel Matrix raid devices, I expect you will need to install finalrd & add the mdadm finalrd hook.

$ sudo apt install finalrd

Download the attached hook, place it into /usr/share/finalrd/mdadm.finalrd, mark it executable (sudo chmod +x /usr/share/finalrd/mdadm.finalrd)

Then reboot twice.

After the second reboot, I expect the raid to come up clean. Please report back if that helps, and then I should schedule an SRU of this change to bionic.

This is how clean shutdown for IMSM/VROC is implemented in later releases.
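
To confirm the hook is in place (a sketch; running finalrd by hand simply populates /run/initramfs so you can inspect what will be available at shutdown):

# The hook must be executable for finalrd to pick it up
ls -l /usr/share/finalrd/mdadm.finalrd

# Populate /run/initramfs now and inspect the result
sudo /usr/bin/finalrd
ls /run/initramfs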

Revision history for this message
Joshua Diamant (joshdi) wrote :

Thank you so much, Dimitri. I will test and confirm.

Revision history for this message
Joshua Diamant (joshdi) wrote :

Anything special needed for bcache for finalrd?

Revision history for this message
Dimitri John Ledkov (xnox) wrote : Re: [Bug 1608495] Re: IMSM fakeraid handled by mdadm: unclean mounted volumes on shutdown/reboot

On Fri, 7 Feb 2020, 17:25 Joshua Diamant, <email address hidden>
wrote:

> Anything special needed for bcache for finalrd?

I don't know.

Revision history for this message
Joshua Diamant (joshdi) wrote :

First reboot: IMSM/VROC RAID1 resynced (as you stated)

I waited for the RAID volume to become clean and then rebooted.
Second reboot: Raid did not resync and bcache volume successfully mounted!
Third reboot: Raid did not resync and bcache volume successfully mounted!

Thank you so much, Dimitri! You fixed my issue

Revision history for this message
Joshua Diamant (joshdi) wrote :

It seems like the fix stopped working: on every reboot now, the IMSM VROC raid array is being resynced.

Is this related to a bug that produces this error? I see it a few times right before reboot:

systemd-shutdown[1]: Failed to wait for process: Protocol error

Revision history for this message
Joshua Diamant (joshdi) wrote :

I find that the issue is resolved if I run "/usr/bin/finalrd" after I boot so that the /run/initramfs directory is populated.

Do I need to run "/usr/bin/finalrd" on start (I see it runs on shutdown)?

Do I need any "/etc/initramfs-tools" hooks or scripts?
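
One possible workaround (purely an assumption on my side, not a confirmed fix) would be to run finalrd once per boot from a small oneshot unit so /run/initramfs is always populated before shutdown; the unit name below is made up for this sketch:

# Hypothetical oneshot unit that runs /usr/bin/finalrd once per boot
sudo tee /etc/systemd/system/finalrd-populate.service >/dev/null <<'EOF'
[Unit]
Description=Populate /run/initramfs at boot via finalrd

[Service]
Type=oneshot
ExecStart=/usr/bin/finalrd

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now finalrd-populate.service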

Revision history for this message
Joshua Diamant (joshdi) wrote :

I think this issue is occurring because I am running a bcache cache device on a VROC IMSM 'fake raid' device. Once I followed Dimitri's finalrd steps in post #14 (this is required), I also had to add the following script as '/lib/systemd/system-shutdown/bcache_stop'.

Please make sure you run 'chmod +x /lib/systemd/system-shutdown/bcache_stop' after creating the file with the contents below:

#!/bin/bash
# Stop all bcache devices and cache sets before the md arrays are torn
# down, so the backing IMSM volume is left in a clean state.

# Stop the registered bcache block devices
for stop in /sys/block/bcache[0-9]*/bcache/stop
do
        [ -f "$stop" ] || continue
        #echo "Stopping $stop"
        echo 1 > "$stop"
        echo 1 > "$stop"
done

# Stop the cache sets themselves
for stop in /sys/fs/bcache/*/stop
do
        [ -f "$stop" ] || continue
        #echo "Stopping $stop"
        echo 1 > "$stop"
        echo 1 > "$stop"
done
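
A final sanity check that both pieces are in place (a sketch):

# Both hooks need to exist and be executable
ls -l /usr/share/finalrd/mdadm.finalrd /lib/systemd/system-shutdown/bcache_stop

# After the next reboot, the backing array should come up clean
cat /proc/mdstat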
