grub-install breaks when ESP is on raid

Bug #1466150 reported by Tony Middleton
This bug affects 29 people
Affects: grub-installer (Ubuntu)
Status: Won't Fix
Importance: High
Assigned to: Unassigned

Bug Description

I run a server with mirrored (RAID1) disks using grub-efi.

Root and /boot and /boot/grub are on mirrored partitions.

I have EFI partitions on both disks, but it is not possible to RAID1 these as they are FAT32. On an EFI system grub-install will only install to one of the EFI partitions, so after running grub-install you have to remember to copy the EFI file across.

Could grub configuration and grub-install be amended to automatically install to multiple disks?

Searching around there seem to be many people asking this question without any elegant solution.

Revision history for this message
Phillip Susi (psusi) wrote :

Choice of filesystem has nothing to do with raid. You can put any FAT32 partition, including the ESP, on a raid1 if you want. You just need to make sure to use md format 0.9 or 1.0 instead of 1.1 or 1.2 so the firmware will still recognize it.
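
For illustration only, here is a minimal sketch of that setup; the device names and mount point are assumptions, adjust them to your own disks:

# Mirror two ESPs with mdadm 1.0 metadata (superblock at the end of the
# partition, so the firmware still sees a plain FAT32 filesystem).
# /dev/sda1 and /dev/sdb1 are hypothetical example partitions.
mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=2 \
      /dev/sda1 /dev/sdb1
mkfs.vfat -F 32 /dev/md0    # format the mirror as FAT32
mount /dev/md0 /boot/efi    # mount it as the ESP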

Changed in grub2 (Ubuntu):
status: New → Invalid
Revision history for this message
Tony Middleton (ximera) wrote :

Thank you for the reply. I had read a number of articles on this which put me off trying that option and implied rather clumsy solutions, which was why I raised the request. However, I have now tried it and, barring one problem, it works and I can boot off either disk.

The problem occurred when I ran grub-install. It failed at the efibootmgr stage.

Here is the end of the log when installing to a non-RAID ESP:

grub-install: info: copying `/boot/grub/x86_64-efi/core.efi' -> `/boot/efi/EFI/ubuntu/grubx64.efi'.
grub-install: info: Registering with EFI: distributor = `ubuntu', path = `\EFI\ubuntu\grubx64.efi', ESP at hostdisk//dev/sda,gpt1.
grub-install: info: executing efibootmgr --version </dev/null >/dev/null.
grub-install: info: executing modprobe -q efivars.
grub-install: info: executing efibootmgr -b 0000 -B.
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0005,0006,0002,0003,0004
Boot0002* Hard Drive
Boot0003* CD/DVD Drive
Boot0004* Removable Drive
Boot0005* UEFI: SanDisk Cruzer Edge 1.26
Boot0006* UEFI: ST31500341AS
grub-install: info: executing efibootmgr -c -d /dev/sda -p 1 -w -L ubuntu -l \EFI\ubuntu\grubx64.efi.
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0005,0006,0002,0003,0004
Boot0002* Hard Drive
Boot0003* CD/DVD Drive
Boot0004* Removable Drive
Boot0005* UEFI: SanDisk Cruzer Edge 1.26
Boot0006* UEFI: ST31500341AS
Boot0000* ubuntu
Installation finished. No error reported.

And here is the log for a RAID1 ESP:

grub-install: info: copying `/boot/grub/x86_64-efi/core.efi' -> `/boot/efi/EFI/ubuntu/grubx64.efi'.
grub-install: info: Registering with EFI: distributor = `ubuntu', path = `\EFI\ubuntu\grubx64.efi', ESP at mduuid/f1f50fa9a6d7d446dddc9c93a8fc41a3.
grub-install: info: executing efibootmgr --version </dev/null >/dev/null.
grub-install: info: executing modprobe -q efivars.
grub-install: info: executing efibootmgr -c -d.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 0.11.0
usage: efibootmgr [options]
        -a | --active sets bootnum active
        -A | --inactive sets bootnum inactive
        -b | --bootnum XXXX modify BootXXXX (hex)
        -B | --delete-bootnum delete bootnum (hex)
        -c | --create create new variable bootnum and add to bootorder
        -D | --remove-dups remove duplicate values from BootOrder
        -d | --disk disk (defaults to /dev/sda) containing loader
        -e | --edd [1|3|-1] force EDD 1.0 or 3.0 creation variables, or guess
        -E | --device num EDD 1.0 device number (defaults to 0x80)
        -g | --gpt force disk with invalid PMBR to be treated as GPT
        -H | --acpi_hid XXXX set the ACPI HID (used with -i)
        -i | --iface name create a netboot entry for the named interface
        -l | --loader name (defaults to \EFI\redhat\grub.efi)
        -L | --label label Boot manager display label (defaults to "Linux")
        -n | --bootnext XXXX set BootNext to XXXX (hex)
        -N | --delete-bootnext delete BootNext
        -o | --bootorder XXXX,YYYY,ZZZZ,... explicitly set BootOrder (hex)
        -O | --delete-bootorder delete BootOrder
        -p | --part part (defaults to 1...


Revision history for this message
Phillip Susi (psusi) wrote :

Looks like grub-install got confused by the raid and failed to pass the proper device to efibootmgr.

Changed in grub2 (Ubuntu):
importance: Undecided → Low
status: Invalid → Triaged
summary: - Feature request: For EFI system grub-install should be able to install
- to multiple disks
+ grub-install breaks when ESP is on raid
Revision history for this message
Alek_A (ackbeat) wrote :

Hi! I have the same issue. We are running Ubuntu 16.04 on our servers;
/boot/efi is on /dev/md0 (a RAID1 array with 0.90 metadata, made of 4 small partitions, each at the beginning of one of the 4 disks).
The system boots normally, but I believe that is because the EFI entries were created earlier, when the partitions were not yet in the mirror. Or maybe the BIOS detects them somehow!

# grub-install
Installing for x86_64-efi platform.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 0.12
usage: efibootmgr [options]
        -a | --active sets bootnum active
        -A | --inactive sets bootnum inactive
        -b | --bootnum XXXX modify BootXXXX (hex)
        -B | --delete-bootnum delete bootnum (hex)
        -c | --create create new variable bootnum and add to bootorder
        -C | --create-only create new variable bootnum and do not add to bootorder
        -D | --remove-dups remove duplicate values from BootOrder
        -d | --disk disk (defaults to /dev/sda) containing loader
        -e | --edd [1|3|-1] force EDD 1.0 or 3.0 creation variables, or guess
        -E | --device num EDD 1.0 device number (defaults to 0x80)
        -g | --gpt force disk with invalid PMBR to be treated as GPT
        -i | --iface name create a netboot entry for the named interface
        -l | --loader name (defaults to \EFI\redhat\grub.efi)
        -L | --label label Boot manager display label (defaults to "Linux")
        -n | --bootnext XXXX set BootNext to XXXX (hex)
        -N | --delete-bootnext delete BootNext
        -o | --bootorder XXXX,YYYY,ZZZZ,... explicitly set BootOrder (hex)
        -O | --delete-bootorder delete BootOrder
        -p | --part part (defaults to 1) containing loader
        -q | --quiet be quiet
        -t | --timeout seconds set boot manager timeout waiting for user input.
        -T | --delete-timeout delete Timeout.
        -u | --unicode | --UCS-2 pass extra args as UCS-2 (default is ASCII)
        -v | --verbose print additional information
        -V | --version return version and exit
        -w | --write-signature write unique sig to MBR if needed
        -@ | --append-binary-args file append extra args from file (use "-" for stdin)
        -h | --help show help/usage
Installation finished. No error reported.

Revision history for this message
AnrDaemon (anrdaemon) wrote :

Setting up linux-signed-generic-lts-xenial (4.4.0.47.34) ...
Setting up linux-libc-dev:amd64 (3.13.0-101.148) ...
Setting up shim-signed (1.19~14.04.1+0.8-0ubuntu2) ...
Installing for x86_64-efi platform.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 0.5.4
usage: efibootmgr [options]
        -a | --active sets bootnum active
        -A | --inactive sets bootnum inactive
        -b | --bootnum XXXX modify BootXXXX (hex)
        -B | --delete-bootnum delete bootnum (hex)
        -c | --create create new variable bootnum and add to bootorder
        -d | --disk disk (defaults to /dev/sda) containing loader
        -e | --edd [1|3|-1] force EDD 1.0 or 3.0 creation variables, or guess
        -E | --device num EDD 1.0 device number (defaults to 0x80)
        -g | --gpt force disk with invalid PMBR to be treated as GPT
        -H | --acpi_hid XXXX set the ACPI HID (used with -i)
        -i | --iface name create a netboot entry for the named interface
        -l | --loader name (defaults to \elilo.efi)
        -L | --label label Boot manager display label (defaults to "Linux")
        -n | --bootnext XXXX set BootNext to XXXX (hex)
        -N | --delete-bootnext delete BootNext
        -o | --bootorder XXXX,YYYY,ZZZZ,... explicitly set BootOrder (hex)
        -O | --delete-bootorder delete BootOrder
        -p | --part part (defaults to 1) containing loader
        -q | --quiet be quiet
           | --test filename don't write to NVRAM, write to filename.
        -t | --timeout seconds set boot manager timeout waiting for user input.
        -T | --delete-timeout delete Timeout.
        -u | --unicode | --UCS-2 pass extra args as UCS-2 (default is ASCII)
        -U | --acpi_uid XXXX set the ACPI UID (used with -i)
        -v | --verbose print additional information
        -V | --version return version and exit
        -w | --write-signature write unique sig to MBR if needed
        -@ | --append-binary-args file append extra args from file (use "-" for stdin)
Installation finished. No error reported.

Revision history for this message
AnrDaemon (anrdaemon) wrote :

ESP is also on RAID1 with 0.90 meta.

# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Aug 16 11:08:49 2016
     Raid Level : raid1
     Array Size : 266176 (259.98 MiB 272.56 MB)
  Used Dev Size : 266176 (259.98 MiB 272.56 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Nov 18 15:10:03 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 8738f3e7:91896e1d:e368bf24:bd0fce41
         Events : 0.23

    Number   Major   Minor   RaidDevice   State
       0       8        1        0        active sync   /dev/sda1
       1       8       17        1        active sync   /dev/sdb1

Revision history for this message
Alek_A (ackbeat) wrote :

Thanks for reporting this, AnrDaemon! I'm not sure where exactly the issue is; probably the grub script should be modified so that, if it detects that the ESP is on RAID, it looks into /proc/mdstat and repeats the process for each of the component devices. Or it should pass some extra args to efibootmgr, as reported :)

Revision history for this message
Seva Gluschenko (gvs-ya) wrote :

The bug is still reproducible in 16.04.2 LTS. It is particularly funny that grub-install reports no errors at the end:

...
grub-install: info: copying `/usr/lib/shim/shimx64.efi.signed' -> `/boot/efi/EFI/Ubuntu/shimx64.efi'.
grub-install: info: copying `/usr/lib/grub/x86_64-efi-signed/grubx64.efi.signed' -> `/boot/efi/EFI/Ubuntu/grubx64.efi'.
grub-install: info: copying `/usr/lib/shim/mmx64.efi.signed' -> `/boot/efi/EFI/Ubuntu/mmx64.efi'.
grub-install: info: copying `/usr/lib/shim/fbx64.efi.signed' -> `/boot/efi/EFI/Ubuntu/fbx64.efi'.
grub-install: info: copying `/boot/grub/x86_64-efi/load.cfg' -> `/boot/efi/EFI/Ubuntu/grub.cfg'.
grub-install: info: Registering with EFI: distributor = `Ubuntu', path = `\EFI\Ubuntu\shimx64.efi', ESP at mduuid/37a190814f3ecd3b3eba8b653989e9c1.
grub-install: info: executing efibootmgr --version </dev/null >/dev/null.
grub-install: info: executing modprobe -q efivars.
grub-install: info: executing efibootmgr -c -d.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 0.12
usage: efibootmgr [options]
 -a | --active sets bootnum active
 -A | --inactive sets bootnum inactive
 -b | --bootnum XXXX modify BootXXXX (hex)
 -B | --delete-bootnum delete bootnum (hex)
 -c | --create create new variable bootnum and add to bootorder
 -C | --create-only create new variable bootnum and do not add to bootorder
 -D | --remove-dups remove duplicate values from BootOrder
 -d | --disk disk (defaults to /dev/sda) containing loader
 -e | --edd [1|3|-1] force EDD 1.0 or 3.0 creation variables, or guess
 -E | --device num EDD 1.0 device number (defaults to 0x80)
 -g | --gpt force disk with invalid PMBR to be treated as GPT
 -i | --iface name create a netboot entry for the named interface
 -l | --loader name (defaults to \EFI\redhat\grub.efi)
 -L | --label label Boot manager display label (defaults to "Linux")
 -n | --bootnext XXXX set BootNext to XXXX (hex)
 -N | --delete-bootnext delete BootNext
 -o | --bootorder XXXX,YYYY,ZZZZ,... explicitly set BootOrder (hex)
 -O | --delete-bootorder delete BootOrder
 -p | --part part (defaults to 1) containing loader
 -q | --quiet be quiet
 -t | --timeout seconds set boot manager timeout waiting for user input.
 -T | --delete-timeout delete Timeout.
 -u | --unicode | --UCS-2 pass extra args as UCS-2 (default is ASCII)
 -v | --verbose print additional information
 -V | --version return version and exit
 -w | --write-signature write unique sig to MBR if needed
 -@ | --append-binary-args file append extra args from file (use "-" for stdin)
 -h | --help show help/usage
Installation finished. No errors reported.

Revision history for this message
John Robinson (john.robinson) wrote :

Just an FYI, this is still present in 16.04.4, which I just `apt-get upgrade`d to. I have /boot/efi on an md RAID1 with 1.0 metadata.

I'm not sure whether the severity is correct; as far as I could tell when hacking about with the efibootmgr command, the process had removed the 'ubuntu' boot entry before failing to add the new entry, so I suspect my system was unbootable.

I have now used efibootmgr to add a new boot entry, but I'm working remotely and I'm not going to attempt to reboot until I'm sitting in front of the machine with a rescue disc/stick to hand! In particular, I am not clear what level the efibootmgr command is operating at: when I did the add on /dev/sda, subsequently listing it suggested it was also present on /dev/sdb. I'm also not sure whether I ought to have two separate boot entries, one for each disc, so that if sda is a bit screwed but still present, booting can proceed with sdb.

Revision history for this message
Tony Middleton (ximera) wrote :

I agree that the severity is incorrect. Whenever grub-install is run (i.e. whenever the version of grub is upgraded) I end up with an unbootable system. Yes, my trusty rescue disk sorts it out, but that's not a solution.

Referring to the comment above, I have two EFI entries, one for each disk. They both get trashed by grub-install, so I recreate them by hand, either before rebooting or afterwards using the rescue disk.
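
For reference, a hedged sketch of recreating those two entries by hand; the disk names, partition numbers and loader path are assumptions, so check yours with efibootmgr -v first:

# Recreate one NVRAM boot entry per mirror member after grub-install has
# clobbered them (assumes the ESP is partition 1 on each disk and that the
# loader path matches what grub-install copied into /boot/efi).
efibootmgr -c -d /dev/sda -p 1 -L "ubuntu (sda)" -l '\EFI\ubuntu\grubx64.efi'
efibootmgr -c -d /dev/sdb -p 1 -L "ubuntu (sdb)" -l '\EFI\ubuntu\grubx64.efi'
efibootmgr -v   # verify both entries exist and check BootOrder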

Phillip Susi (psusi)
affects: grub2 (Ubuntu) → grub-installer (Ubuntu)
Changed in grub-installer (Ubuntu):
importance: Low → High
tags: added: id-5b16b1664f4c7a0f1fb8839f
Revision history for this message
Wladimir Mutel (mwg) wrote :

More and more people stumble upon this - https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1720572

Revision history for this message
Wladimir Mutel (mwg) wrote :

There is a patch at https://savannah.gnu.org/bugs/?46805; why not try it?

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

All that patch would do is show a pretty error message earlier, not actually make sure the problem is fixed -- upgrades would still fail; installs would still fail on RAID.

This has to do with grub not grokking the metadata format on disk, which is avoidable by using metadata 0.90.

Revision history for this message
Phillip Susi (psusi) wrote : Re: [Bug 1466150] Re: grub-install breaks when ESP is on raid

On 8/21/2018 12:29 PM, Mathieu Trudel-Lapierre wrote:
> This has to do with grub not grokking the metadata format on disk, which
> is avoidable by using metadata 0.90.

What? Grub understands all of the metadata formats.

Revision history for this message
AnrDaemon (anrdaemon) wrote :

Then why does it fail to install the bootloader?… Even with 0.90 metadata?

Revision history for this message
AnrDaemon (anrdaemon) wrote :

# grub-install --recheck --no-floppy /dev/md0
Installing for x86_64-efi platform.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 0.5.4
usage: efibootmgr [options]

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

This might be the fix?

Handle partition name parsing and formatting for partitioned md
https://github.com/rhboot/efivar/commit/576f55b02d9ec478bd5157352c884e3543bcca58

Revision history for this message
Alejandro Mery (amery) wrote :

I wrote a not-so-little wrapper for efibootmgr to pretend grub-install isn't broken:

# cd /bin
# mv efibootmgr efibootmgr.real
# ln -s efibootmgr.sh efibootmgr

efibootmgr.sh
---
#!/bin/sh

die() {
        echo "$*" >&2
        exit 1
}

# Register one boot entry for a single block device: derive -d (disk) and
# -p (partition) from the given sysfs block directory, append the device
# name to the label, then call the real efibootmgr.
run_device() {
        local devdir="$1" label= label_set=
        local devname= dev= partition=
        shift

        if [ "x$1" = "x-L" ]; then
                label_set=true
                label="$2"
                shift 2
        fi

        devdir="$(cd "$devdir" && pwd -P)"

        if [ -s "$devdir/partition" ]; then
                read partition < "$devdir/partition"
                devname="${devdir##*/}"
                devdir="${devdir%/*}"
        fi
        dev="/dev/${devdir##*/}"

        if [ -n "$label_set" -a -z "$label" ]; then
                label=$devname
        else
                [ -n "$label" ] || label="$(lsb_release -si)"

                label="$label ($devname)"
        fi

        set -x
        "${0%.sh}.real" "$@" -L "$label" -d "$dev" ${partition:+-p $partition}
}

# Find the device backing /boot/efi; if it is an md RAID1, call run_device
# once per member disk, otherwise call it once for the plain device.
run_raid() {
        local x= argv=
        local label= label_set= label_next=
        local device= devdir=
        local md_level= md_disks=

        # extract label
        for x; do
                if [ "$x" = "-L" ]; then
                        label_next=true
                        label_set=
                        label=
                elif [ -n "$label_next" ]; then
                        label_next=
                        label_set=true
                        label="$x"
                else
                        x=$(echo -n "$x" | sed -e 's|"|\\"|g')
                        argv="$argv \"$x\""
                fi
        done

        if [ -n "$label_set" ]; then
                x=$(echo -n "$label" | sed -e 's|"|\\"|g')
                argv="-L \"$x\" $argv"
        fi

        device="$(grep ' /boot/efi ' /proc/mounts | cut -d' ' -f1)"
        [ -b "$device" ] || die "ESP not mounted"
        device="$(readlink -f "$device")"
        devdir=/sys/class/block/${device##*/}

        if read md_level < $devdir/md/level 2> /dev/null; then
                if [ "$md_level" = raid1 ]; then
                        read md_disks < $devdir/md/raid_disks
                        for i in `seq $md_disks`; do
                                set +x
                                eval "run_device '$devdir/md/rd$(($i - 1))/block' $argv"
                        done
                else
                        die "RAID $md_level not supported"
                fi
        else
                # not RAID
                set -x
                eval "run_device '$devdir' $argv"
        fi
        exit 0
}

# Anything else: pass the arguments straight through to the real efibootmgr.
run_normal() {
        exec "${0%.sh}.real" "$@"
}

set -eu

argv=
i=1
for x; do
        if [ "$x" = "-d" -a $i -eq $# ]; then
                # /boot/efi is /dev/md and grub-install can't handle it yet
                eval "run_raid $argv"
                die "never reached"
        fi

        : $((i = i+1))
        x=$(echo -n "$x" | sed -e 's|"|\\"|g')
        argv="$argv \"$x\""
done

set -x
eval "run_normal $argv"
---

Revision history for this message
AnrDaemon (anrdaemon) wrote :

I had to configure my BIOS manually to point to both ESP partitions.
Installing a new kernel is still a thrilling experience, but so far reboots have been smooth.

Revision history for this message
John Robinson (john.robinson) wrote :

I just upgraded my system yesterday, which meant an updated grub, which meant grub-install was run again, which meant this happened to me again. Now on Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-150-generic x86_64) with grub2 2.02~beta2-36ubuntu3.22 and its dependencies. Then I found that Alejandro Mery's script in #17 worked for me.

Revision history for this message
John Robinson (john.robinson) wrote :

I meant #18, sorry.

Revision history for this message
Jack Ostroff (ostroffjh) wrote :

I'm having the same problem with a new Gentoo install, but with a twist. I am using lvm2 with a VG of two whole-disk PVs and RAID1 LVs for /, /boot, and /boot/efi. Since I'm not using mdadm, the script in comment #18 doesn't work for me, and I haven't been able to modify it for my case, although I'll keep trying.

Revision history for this message
Ben Stanley (benstanley) wrote :

> I have EFI partitions on both disks but it is not possible to RAID1 these
> as they are FAT32.

Actually it is possible to RAID1 FAT32. The problem is that the default RAID format places a header at the front of the partition, which breaks the boot loader. But if you use the 0.90 format with mdadm, the header is placed at the end of the partition instead. This allows the boot loader to function as normal while the partition is RAID1. Just don't expect Windows to respect the RAID1.

However, even using this trick my system won't grub-install to a RAID1 partition; it reports problems with EFI.

Sorry, I'm not next to my system as I write this, so the details are from memory rather than tested on the system.

Revision history for this message
John Robinson (john.robinson) wrote :

You can use 0.90 or 1.0 metadata. Both leave the underlying partitions looking like a simple (non-RAID) filesystem on each disc of the mirror when the EFI or BIOS looks at the drives. I think that's what they need to boot from, so I don't think there's any chance of booting from a /boot/efi which is an LV in a VG on several PVs. Maybe LinuxBIOS?

Revision history for this message
Jack Ostroff (ostroffjh) wrote :

I'm beginning to think the problem is simply that the EFI System Partition really does need to be a real partition, and not a logical volume or part of a RAID. It is not an issue with RAID and FAT32; it is just that the ESP (/boot/efi) needs to be reachable from the EFI firmware, which doesn't know about anything except real GPT partitions. I'd really love to find otherwise, but that seems to be the conclusion of all my searching. Grub can boot a kernel and initramfs on a logical or RAID volume, but it seems that grubx64.efi (for example) needs to be on a physical partition.

Revision history for this message
John Robinson (john.robinson) wrote :

Yes, Jack, it does. That's why you can use a "trick" with md RAID1 and metadata 0.90 or 1.0: both result in having two (or more) discs with ESP partitions which are readable and bootable by the EFI BIOS.

This bug thread is more about making grub-install call efibootmgr correctly to install two (or more) EFI boot entries, one for each of the ESP partition mirrors, than anything else.

Revision history for this message
Ben Stanley (benstanley) wrote :

The way to make it work is to place the ESP on a separate RAIDed GPT partition, outside the LVM PV, which I forgot about when I wrote the idea earlier.

Revision history for this message
Tom Reynolds (tomreyn) wrote :

While installing the ESP on top of an mdadm (metadata version <= 1.0) RAID-1 is practically possible, supporting this is not: the UEFI specification (version 2.8, sections 13.3.1.1, 13.3.3) defines the ESP as a FAT32 file system which is located (directly) on a GPT partition. And while this does not seem to be part of the specification, some UEFI implementations expect to be able to write to an ESP.

Having UEFI write to what it must assume is just a FAT32 file system, but which is really an mdadm RAID member in a RAID-1 array hosting a FAT32 file system, is problematic. While technically possible, allowing direct writes to a single RAID member violates RAID-1 and mdadm's expectations, and will put the RAID array into an inconsistent state, requiring a re-sync, which can only happen after mdadm detects that the array is inconsistent.

This conclusion has also been drawn in bug 1817066, which therefore suggests an alternative approach for improved data security for ESP: to support installing to multiple separate ESPs (located directly on GPT partitions).

This said, the current bug report seems to be about several things:

* grub-install cannot gracefully handle the (unsupported / unsupportable) situation where the active ESP (FAT32 file system) is not located directly on a GPT partition (on a GPT-partitioned disk): it invokes efibootmgr with an empty -d (target device) argument. This can make grub-install fail, can make the grub package installation fail (pending dpkg configuration), and can leave apt in a bad state, breaking installations

* Some Ubuntu installers may allow creating partition schemes where ESP file systems are not located directly on a partition on a GPT-partitioned disk, but such layouts should not be supported by these installers

* Release upgrades may fail to handle (unsupported / unsupportable) partition schemes which contain an ESP that is not located directly on a partition on a GPT-partitioned disk (an error)

Revision history for this message
AnrDaemon (anrdaemon) wrote :

Five years later: …

Revision history for this message
Osmin Lazo (osminlazo) wrote :

Adding --no-nvram to grub-install

grub-install --efi-directory=$mntpoint --target=x86_64-efi --no-nvram

fixes this; --no-nvram bypasses updating the EFI variables with efibootmgr, but these entries don't need to be modified after install or on every grub package update.

Modifying this line in /usr/lib/grub/grub-multi-install will allow the grub package update to complete successfully...

The error seems to be caused by the efibootmgr command using the mduuid of the RAID array.
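
A minimal sketch of how that might be used on a mirrored ESP; the mount point is an assumption, and the NVRAM entries themselves still have to exist (for example created by hand as in the efibootmgr sketch earlier in the thread):

# Update the GRUB/shim binaries on the already-mounted mirrored ESP, but skip
# the efibootmgr call that chokes on the mduuid device:
grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-nvram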

Revision history for this message
AnrDaemon (anrdaemon) wrote :

> Modifying this line in /usr/lib/grub/grub-multi-install will allow the grub pkg update to complete succesfully...

Where can I find this in Trusty (14.04)?
I need to upgrade my host to move forward, but with grub-install failing it's nontrivial.

Revision history for this message
Georg Sauthoff (g-sauthoff) wrote :

Tom Reynolds (tomreyn) wrote (https://bugs.launchpad.net/ubuntu/+source/grub-installer/+bug/1466150/comments/28):

> While installing ESP on top of mdadm (metadata version <= 1.0) RAID-1 is practically possible, supporting this is not: The UEFI specification (version 2.8, sections 13.3.1.1, 13.3.3) defines the ESP as a FAT32 file system which is located (directly) on a GPT partition. While this does not seem to be part of the specification, some UEFI implementations expect to be able to write to an ESP.

Can you name _one_ UEFI implementation that actually does write to an ESP?

FWIW, when installing Fedora with the default installer and selecting a RAID-1 scheme there, the EFI system partition is created on a superblock 1.0 RAID-1. Thus, the Fedora developers seem to be pretty sure that there isn't any UEFI implementation that expects to be able to write to an ESP ...

Revision history for this message
Kees Cook (kees) wrote :

https://outflux.net/blog/archives/2018/04/19/uefi-booting-and-raid1/

The UEFI on the Dell T30 I was testing on would write a "boot variable cache" file to the ESP. :(

Revision history for this message
Kees Cook (kees) wrote :

The only reference I could find was https://github.com/tianocore/tianocore.github.io/wiki/UEFI-Variable-Runtime-Cache which hints at a "device storage" for variables...

Revision history for this message
Kees Cook (kees) wrote :

(This may have only been present on older firmware versions, though, as I no longer see the behavior on a newer T30.)

Revision history for this message
Tony Middleton (ximera) wrote :

Since focal, the change I originally asked for at the beginning of this thread has now been made. If you set up multiple EFI partitions, not RAIDed, then when you run "dpkg-reconfigure grub-efi-amd64" the system asks which EFI partitions you wish to use. You can specify more than one and it then installs to each of them. It remembers the choice the next time grub is updated via apt.
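
For anyone looking for it, a short sketch of that flow; the debconf key name below is an assumption, so verify it with debconf-show on your own system:

dpkg-reconfigure grub-efi-amd64      # prompts for the ESPs to install to
debconf-show grub-efi-amd64 | grep -i install_devices   # the remembered choice (assumed key name)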

Revision history for this message
Rod Smith (rodsmith) wrote :

@Georg Sauthoff wrote:

> Can you name _one_ UEFI implementation that actually does write to an ESP?

Offhand, I don't know of an EFI that will write to the ESP automatically; HOWEVER, there are cases when it can happen because of user actions and/or EFI applications' operation. For instance:

* The EFI shell includes multiple commands that can write to the
  ESP when the user executes them -- file copies, file renames,
  a built-in text editor, etc.
* The rEFInd boot manager includes options to store rEFInd-specific
  variables on the ESP, as well as to create screen shots. The older
  rEFIt can also create screen shots.
* There's a whole Python implementation for EFI, so in theory,
  users could run Python scripts that write to the ESP.

Assuming that the user knows enough to not do these things seems like an unwise assumption, IMHO.

Revision history for this message
John Robinson (john.robinson) wrote :

@Rod Smith wrote:
> Assuming that the user knows enough to not do these things seems like an
> unwise assumption, IMHO.

If the user knows she's building her ESP on RAID, she probably knows enough not to be writing to the RAID's constituent devices independently.

Revision history for this message
Seth Arnold (seth-arnold) wrote :

On Mon, Nov 30, 2020 at 05:38:10PM -0000, John Robinson wrote:
> If the user knows she's building her ESP on RAID, she probably knows
> enough not to be writing to the RAID's constituent devices
> independently.

Given how many broken grub configurations there are in the world that
come out of the woodwork when we do grub package updates, we shouldn't
automatically assume users know what they're doing. They're often
following (bad) advice they found online without understanding the
consequences of what they did.

The tooling now making it easy to install to multiple partitions is a
definite improvement over the previous state.

Thanks

Revision history for this message
Rod Smith (rodsmith) wrote :

I agree with Seth Arnold; assuming that the user won't make a mistake is a recipe for disaster.

Revision history for this message
Georg Sauthoff (g-sauthoff) wrote :

Tony Middleton (@ximera) wrote (https://bugs.launchpad.net/ubuntu/+source/grub-installer/+bug/1466150/comments/36):

> Since focal the change I originally asked for at the beginning of this thread has now been made.
> If you set up multi efi partitions, not raided, when you run "dpkg-reconfigure grub-efi-amd64" the system asks which efi partitions you wish to use. You can specify more than one and it then installs to each of them. It remembers the choice next time grub is updated via apt.

I've just tested this and it works as described.

However, grub-efi-amd64 wasn't installed on my freshly installed 20.04 Ubuntu system; grub-efi-amd64-signed was installed instead.

And 'dpkg-reconfigure grub-efi-amd64-signed' didn't prompt for anything!

So, I installed grub-efi-amd64 and removed grub-efi-amd64-signed for the test.

---

Seth Arnold @seth-arnold wrote (https://bugs.launchpad.net/ubuntu/+source/grub-installer/+bug/1466150/comments/39):

> The tooling now making it easy to install to multiple partitions is a
> definite improvement over the previous state.

I agree, it's an improvement. However, while searching for Ubuntu's solution to making the ESP somehow redundant, I couldn't find any documentation that points to this tooling/approach.

Revision history for this message
Seth Arnold (seth-arnold) wrote :

On Sat, Dec 05, 2020 at 06:05:29PM -0000, Georg Sauthoff wrote:
> I've just tested this and it works as described.
>
> However, grub-efi-amd64 wasn't installed on my freshly installed 20.04
> Ubuntu system.
>
> Installed was grub-efi-amd64-signed.
>
> And 'dpkg-reconfigure grub-efi-amd64-signed' didn't prompt for anything!
>
> So, I installed grub-efi-amd64 and removed grub-efi-amd64-signed for the
> test.

I believe the preferred approach (for focal and newer?) is for
grub-efi-amd64-signed and grub-pc to be installed everywhere, and grub-pc
then 'owns' the debconf questions.

Thanks

Revision history for this message
Tony Middleton (ximera) wrote :

I had some problems when using grub-efi-amd64-signed rather than grub-efi-amd64, though unfortunately not consistently enough to create a coherent bug report. A number of times I ended up dropping into the grub command line. That seemed not to happen with the unsigned version.

Revision history for this message
fmyhr (fmyhr-u) wrote :

Corresponding bug report on Debian:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765740

This Ubuntu support for multiple ESPs has not been upstreamed to Debian Bullseye. Is someone working on upstreaming to Debian? I hope so!

Jeff Lane  (bladernr)
tags: added: hwcert-server
Revision history for this message
Tony Middleton (ximera) wrote :

Just experimented with a "clean" Ubuntu system that had grub-efi-amd64-signed. To get the prompt for the EFI partitions I had to run "dpkg-reconfigure shim-signed". After that it seemed to work OK.

I wish I knew where all this was documented.

Revision history for this message
Tony Middleton (ximera) wrote :

Shouldn't this be moved to "Won't Fix"? The strategy seems to be that you don't use RAID for the ESP; instead you have multiple ESP partitions.

Revision history for this message
Steve Langasek (vorlon) wrote :

> Shouldn't this be moved to "Wont Fix"? The strategy seems to be
> that you don't use RAID for ESP, instead you have multiple ESP
> partitions.

Yes: this is won't fix for the stated reason.

Changed in grub-installer (Ubuntu):
status: Triaged → Won't Fix