degraded NON-root raids never --run on boot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
mdadm (Ubuntu) | | Undecided | Unassigned |
mountall (Ubuntu) | Invalid | Wishlist | Unassigned |
Bug Description
Systems with, say, /home on a RAID array won't come up at boot if the array became degraded during the downtime.
An init script like /etc/init.
Because the proper mdadm --incremental mode command is not available (Bug #251646), a workaround needs to be used:
mdadm --remove <incomplete-
mdadm --incremental --run <arbitrary-
(See https:/
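The exact arguments are truncated in the report above. As a rough, hedged illustration only (the device names are placeholders, and --stop is my substitution for whatever exact removal step the report meant), the general shape of the workaround is:

# Illustrative only: /dev/md1 and /dev/sdb2 are placeholder names,
# not the devices from the original report.
mdadm --stop /dev/md1                 # tear down the half-assembled array
mdadm --incremental --run /dev/sdb2   # re-feed one member and let it run degraded

On later mdadm releases, the documented way to start every partially assembled array in one go is "mdadm --incremental --run --scan"; whether that fully replaces the workaround here is not something the report states.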
---
The fact that large server RAIDs may take minutes to come up, while regular ones are quick, can be handled nicely with a message like:
"NOTICE: /dev/mdX didn't come up within the first 10 seconds.
We continue to wait up to a total of xxx seconds, complying with the ATA
spec, before attempting to run the array degraded.
(You can lower this timeout by setting the rootdelay= parameter.)
<counter> seconds to go.
Press escape to stop waiting and to enter a rescue shell."
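A minimal shell sketch of such a wait-then-run-degraded countdown, assuming it runs from a boot script with /proc/mdstat available; the array name, timeout and messages are illustrative assumptions, not an existing Ubuntu script:

MD_DEV=md0             # placeholder array name, not from the report
TIMEOUT=30             # total wait; could instead be derived from rootdelay=
waited=0
until grep -q "^${MD_DEV} : active" /proc/mdstat; do
    if [ "$waited" -ge "$TIMEOUT" ]; then
        echo "NOTICE: /dev/${MD_DEV} didn't come up within ${TIMEOUT}s, running it degraded."
        mdadm --run "/dev/${MD_DEV}" || exec sh -i   # fall back to a rescue shell
        break
    fi
    echo "$((TIMEOUT - waited)) seconds to go before /dev/${MD_DEV} is run degraded."
    sleep 1
    waited=$((waited + 1))
done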
summary:
- home array not run when degraded on boot
+ non-root raids fail to run degraded on boot
present in 9.10
The Debian init script has been removed, but no upstart job has been created to start/run the necessary regular (non-rootfs) arrays degraded.
description: updated
summary:
- non-root raids fail to run degraded on boot
+ degraded non-root raids are not run on boot
description: updated
description: updated
summary:
- degraded non-root raids are not run on boot
+ degraded non-root raids don't appear on boot
description: updated
summary:
- degraded non-root raids don't appear on boot
+ degraded NON-root raids never --run on boot
description: updated
Sorry about the spam there, hit the wrong button.
The problem with this approach is that we generally don't *know* that a given filesystem is on a degraded RAID, because the RAID is not activated - so we can't see the filesystem UUID inside it.
mountall already provides the ability to drop to a shell, where the user can run mdadm --run.
Does this not suffice?
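For reference, the manual recovery from that shell amounts to roughly the following; the array name is a placeholder, and that boot simply continues after leaving the shell is my assumption:

mdadm --run /dev/md1   # placeholder: start the partially assembled array degraded
mount /home            # retry the mount that failed
exit                   # leave the shell so boot can continue (assumption)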
Changed in mountall (Ubuntu):
status: New → Invalid
status: Invalid → Triaged
importance: Undecided → Wishlist
status: Triaged → Incomplete
RAID systems introduce redundancy so that they can keep working even if parts of the system fail.
I think the init.d/mdadm scripts (early/late or similar) that Debian uses to assemble and run degraded arrays on boot have been removed because all arrays are now set up via udev. But we don't have any replacement functionality to run degraded non-root arrays on boot.
If we fail and drop to a recovery console on boot, the system isn't really failure tolerant.
Auto-running *only selected* arrays if they are found degraded on boot probably requires a watchlist:
* For each filesystem mentioned in fstab that depends on an array, the watchlist file needs to describe its dependency tree of RAID devices (see the sketch after this list). The file needs to be (auto)recreated during update-initramfs.
* initramfs should only watch out for and run rootfs dependencies if necessary.
* Later in the boot process, mountall watches for and runs the arrays of the other (bootwait) filesystems mentioned in the watchlist.
* Is there a way to nicely auto-update the RAID dependency trees of non-rootfs filesystems in the watchlist upon changes?
* The file could be updated/validated on every shutdown.
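As a rough sketch of how such a watchlist could be (re)generated on the running system; the /etc/mdadm/watchlist path and the matching logic are my assumptions, not part of any existing tool, and stacked setups such as LVM-on-RAID are not handled:

watchlist=/etc/mdadm/watchlist
: > "$watchlist"
# For every active md array, record which fstab mount points currently sit on it.
for sysdir in /sys/block/md*; do
    [ -d "$sysdir/md" ] || continue
    dev=/dev/${sysdir##*/}
    while read -r source mountpoint _; do
        [ "$source" = "$dev" ] || continue
        # Only watch filesystems that are actually listed in fstab.
        if grep -q "[[:space:]]$mountpoint[[:space:]]" /etc/fstab; then
            echo "$mountpoint $dev" >> "$watchlist"
        fi
    done < /proc/mounts
done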
For more context look for MD_COMPLETION_WAIT and "How would you decide what devices are needed?" at https:/
You said:
* For each filesystem mentioned in fstab that depends on an array
This is the problem; fstab only gives us a filesystem UUID or LABEL in many cases, so we simply *DO NOT KNOW* that it's going to turn out to be on a RAID array.
I understand that the md device dependencies are not available in fstab; I am just not sure what should prevent us from going through the filesystems and identifying their dependencies on the running system.
In the suggestion, fstab is merely used as a starting point: a list of the filesystems that get set up on boot. Prior to rebooting, a list/tree is derived from it that contains the md devices required to boot and to set up all filesystems mentioned in fstab.
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 259145] Re: degraded NON-root raids never --run on boot | #9 |
On Mon, 2010-04-19 at 09:39 +0000, ceg wrote:
> I understand the md device dependencies are not available in fstab, I am
> not sure though what should prevent going through the filesystems and
> identifying their dependencies in the running system?
>
Because they might change after a reboot?
We *explicitly* support people doing that.
Scott
--
Scott James Remnant
<email address hidden>
> Because they might change after a reboot?
> We *explicitly* support people doing that.
Dropping to a rescue shell is the support for the case where a new RAID, set up with another (rescue) system, comes up degraded upon reboot. But I don't see why supporting that should prevent a proper RAID setup: one that will --run unchanged arrays that come up degraded on reboot and are needed for a clean boot.
On Tue, 2010-04-20 at 16:22 +0000, ceg wrote:
> > Because they might change after a reboot?
> > We *explicitly* support people doing that.
>
> Dropping to a rescue shell is the support for the case where a new RAID,
> set up with another (rescue) system, comes up degraded upon reboot. But I
> don't see why supporting that should prevent a proper RAID setup: one that
> will --run unchanged arrays that come up degraded on reboot and are needed
> for a clean boot.
>
Patches Welcome.
Scott
--
Scott James Remnant
<email address hidden>
Changed in mountall (Ubuntu):
status: Incomplete → Triaged
Steve Langasek (vorlon) wrote : | #12 |
This is an mdadm issue, not a mountall one.
Changed in mountall (Ubuntu):
status: Triaged → Invalid
Dimitri John Ledkov (xnox) wrote : | #13 |
We don't know which RAIDs are required for the rootfs, but by deduction, once the rootfs has appeared we know which RAIDs were not required.
So for each incomplete non-rootfs RAID we could do:
mdadm --remove <incomplete-
mdadm --incremental --run <arbitrary-
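Since the exact arguments are truncated above, here is only a hedged sketch of that deduction step, using documented mdadm behaviour (running every array that is still inactive once the rootfs is up); the device handling details are my assumptions:

# Once the root filesystem is mounted, any md array still listed as inactive
# in /proc/mdstat cannot have been needed for the rootfs, so try to run it.
awk '$2 == ":" && $3 == "inactive" { print $1 }' /proc/mdstat |
while read -r md; do
    echo "Running incomplete array /dev/$md degraded"
    mdadm --run "/dev/$md"
done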
Dimitri John Ledkov (xnox) wrote : | #14 |
But this needs re-testing with a recent RAID package.
This issue has been separated out from Bug #120375 in order to track it separately.
(Don't mark this as a duplicate, like 4 others before.)