grub-pc needs to detect when debconf points to an invalid drive and stop in preinst, before unpacking files, and also treat this as a failure in postinst

Bug #1891680 reported by Steve Langasek on 2020-08-14
This bug affects 19 people
Affects           Importance  Assigned to
grub2 (Ubuntu)    (status tracked in Groovy)
  Xenial          Critical    Unassigned
  Bionic          Critical    Unassigned
  Focal           Critical    Unassigned
  Groovy          Critical    Dimitri John Ledkov

Bug Description

[Impact]

 * grub-pc currently installs the new core to the MBR and new modules to /boot in an unsafe manner, which may lead to an incompatible combination of MBR and modules and a failure to boot.

[Test Case]

 * Install in BIOS mode from old point-release media of an old release, e.g. 16.04.(p-1) for testing upgrades to the 18.04 SRU.

 * backup the contents of /boot

 * First we will test a case where the target boot device exists but writes to it are denied, so modules can be updated but the MBR cannot.

 * Install the following profile as /etc/apparmor.d/usr.sbin.grub-install:

"/usr/sbin/grub-install" {
  capability,
  mount,
  ptrace,
  signal,
  unix,
  file,
  deny /dev/* w,
}

   and load it with

  sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.grub-install

 * Upgrade to the package from the next series-proposed, non-interactively

 * Observe that the package installation has failed and the grub-pc package is in a broken state.

 * Compare the backup of /boot with the current /boot: it should have remained the same, and should differ from the modules in /usr/lib/grub/i386-pc

 * Remove the apparmor profile /etc/apparmor.d/usr.sbin.grub-install

 * Reboot; the reboot should be successful. If possible, observe the version number in the grub menu; it should still be the old one.

 * Now we will test a case where a non-existent device ended up being configured in debconf, for example because an old, buggy cloud-init was used during first boot, or because the VM was migrated from one hardware configuration to another (e.g. an offline switch from SCSI sda to VIRTIO vda).

 * Set grub-pc/install_devices to a non-existent device (e.g. /dev/sdk)

 * Attempt non-interactive configuration of the grub-pc package

 * Observe that configuration fails and the grub-pc package remains in a broken state.

 * Compare the backup of /boot with the current /boot: it should have remained the same, and should differ from the modules in /usr/lib/grub/i386-pc

 * Reboot; the reboot should be successful. If possible, observe the version number in the grub menu; it should still be the old one.

 * Try to configure all the packages interactively (i.e. using $ sudo dpkg --configure -a or $ sudo apt install -f), making sure to select the right drive when offered the grub installation prompt

 * Observe that /boot now matches the /usr/lib/grub/i386-pc contents and differs from the backup taken at the start.

 * The reboot should be successful, and the grub menu should finally show the new version number
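
The /boot comparisons in the steps above could be scripted roughly as follows. This is only a sketch: modules_match is a hypothetical helper name, and restricting the comparison to *.mod files is an assumption made here, since /boot/grub/i386-pc also holds generated files such as core.img that never appear in /usr/lib/grub/i386-pc.

```shell
# Hypothetical helper (not part of grub or dpkg): succeed only if every *.mod
# file in directory $1 is byte-identical to the same-named file in directory $2.
modules_match() (
    for mod in "$1"/*.mod; do
        # An unmatched glob stays literal, so this also catches "no modules".
        [ -e "$mod" ] || exit 1
        cmp -s "$mod" "$2/$(basename "$mod")" 2>/dev/null || exit 1
    done
    exit 0
)
```

With the backup copied to, say, /root/boot-backup (an example path), the failed-upgrade steps would expect modules_match /boot/grub/i386-pc /root/boot-backup/grub/i386-pc to succeed and modules_match /boot/grub/i386-pc /usr/lib/grub/i386-pc to fail; after the final interactive configuration the results should be reversed.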

[Regression Potential]

 * The existing call to grub-install is now split into two. When any
   device fails to configure non-interactively, an error is reported,
   just as was already done in the interactive case.

   This means that configuration of the package will now fail where
   previously it would report success. However, it is now safer and
   keeps the system bootable, whilst leaving packages
   unconfigured. This mostly affects non-interactive upgrades, as the
   interactive ones have always shown critical errors when trying to
   correct grub-pc installation problems.

   The first stage of grub-install only tries to update the MBR,
   whilst using a temporary directory to create the core image. This
   is a slight increase in disk space usage, as previously the core was
   created in-place in /boot. Then, whilst the temporary directory is
   still populated, the /boot modules and core are upgraded.

   These changes do not address multi-MBR systems, or cases where
   updating modules fails. For example, it is possible that the MBR
   update is successful, yet writing the updated modules fails (out of
   disk space); in such a scenario the MBR is not rolled back to the
   previous one. Another case is where MBR updates have succeeded, but
   only on some devices. A choice has been made to update modules in
   /boot if at least one device has a successful MBR update. No backup
   or rollback of the MBR is performed if module updates fail. This is
   tricky to do, as it is uncertain whether the current MBR matches the
   core.img & boot.img from /boot, or whether some other boot sector
   code was in use before. Ideally, in the future, grub-install itself
   will be able to stage module updates, and commit them or roll them
   back depending on the outcome of the MBR update.
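
The control flow described above (try the MBR on each device first, update /boot modules only if at least one device succeeded, and still report failure if any device failed) might be sketched like this. install_bootcode and install_modules are hypothetical placeholders that the caller must define; they are not real grub-install options, and this is not the code from the actual patch.

```shell
# Sketch only. install_bootcode DEV and install_modules are hypothetical
# placeholders supplied by the caller for the two stages.
two_stage_install() (
    # No devices configured at all is treated as an error.
    [ "$#" -gt 0 ] || exit 1
    ok=0
    failed=0
    # Stage 1: attempt the MBR/bootcode update on every device.
    for dev in "$@"; do
        if install_bootcode "$dev"; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
    done
    # Stage 2: touch /boot only if at least one MBR update succeeded.
    if [ "$ok" -gt 0 ]; then
        install_modules || exit 1
    fi
    # Any failed device still makes the overall result a failure.
    [ "$failed" -eq 0 ]
)
```

Note that, as in the description, nothing here rolls the MBR back if the module stage fails.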

[Other Info]

 * Original bug report description

Currently, on upgrade, if the debconf variable for the drive to install grub-pc to points to a non-existent drive, the grub package will nevertheless happily carry on and the postinst will exit 0, leaving the /boot/grub contents and the MBR in an inconsistent state, which due to recent ABI changes will leave the system unbootable on reboot.

Three changes are required in order to make grub upgrades more resilient:

- exit non-zero from the postinst when the drive targets are invalid, so that we signal to the user that there is a problem BEFORE they reboot and give them the opportunity to deal with it. This is addressed by https://code.launchpad.net/~xnox/grub/+git/grub/+merge/388383
- include a check for target drive validity in the grub preinst, not just in the postinst, so that we avoid unpacking boot assets onto disk that might be incorrectly used by another package (despite grub-pc being in an unconfigured state) and still render the system unbootable; this will in general break release upgrades for affected users, but a failing postinst would do the same anyway, and failing early should leave the package manager in a more consistent state overall. This is addressed by https://code.launchpad.net/~ubuntu-core-dev/grub/+git/ubuntu/+merge/388423
- modify grub-install so that it handles the flaky part of the install - updating the BIOS disks - FIRST, and aborts if this fails; instead of the current behavior, which is that /boot/grub is updated on disk first, then it attempts to install to the BIOS disk, and if this part fails, no rollback of the contents of /boot/grub is possible.
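
As an illustration of what a target-drive validity check could look like, here is a minimal sketch. It assumes the configured devices are handed in as one comma-separated string, and devices_valid is a hypothetical name; the real maintainer scripts may do this quite differently.

```shell
# Hypothetical check: succeed only if the comma-separated list in $1 names at
# least one device and every entry exists as a block device.
devices_valid() (
    list=$1
    IFS=', '   # split on commas and spaces
    set -f     # no glob expansion while splitting
    count=0
    for dev in $list; do
        if [ ! -b "$dev" ]; then
            echo "invalid install device: $dev" >&2
            exit 1
        fi
        count=$((count + 1))
    done
    [ "$count" -gt 0 ]
)
```

A maintainer script could then exit non-zero when devices_valid fails, instead of carrying on and exiting 0.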


Steve Langasek (vorlon) on 2020-08-14
Changed in grub2 (Ubuntu):
importance: Undecided → Critical
Changed in grub2 (Ubuntu Focal):
importance: Undecided → Critical
Changed in grub2 (Ubuntu Bionic):
importance: Undecided → Critical
Changed in grub2 (Ubuntu Xenial):
importance: Undecided → Critical
description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu Bionic):
status: New → Confirmed
Changed in grub2 (Ubuntu Focal):
status: New → Confirmed
Changed in grub2 (Ubuntu Xenial):
status: New → Confirmed
Changed in grub2 (Ubuntu):
status: New → Confirmed
Dimitri John Ledkov (xnox) wrote :

First point => yes

Second point => mostly yes, but I am not sold on having it a must in preinst. dpkg-reconfigure does not call preinst, it only calls "prerm config postinst", and we must have the same checks and error recovery in postinst. (Because things can change between preinst and postinst, and/or fail to apply).

Third point => it is slightly more elaborate. Even when asking grub-install not to install any modules or bootcode, the following files are still updated nonetheless:

grub-install: info: copying `/usr/lib/grub/i386-pc/efiemu32.o' -> `/mnt/boot/grub/i386-pc/efiemu32.o'.
grub-install: info: copying `/usr/lib/grub/i386-pc/efiemu64.o' -> `/mnt/boot/grub/i386-pc/efiemu64.o'.
grub-install: info: copying `/usr/lib/grub/i386-pc/moddep.lst' -> `/mnt/boot/grub/i386-pc/moddep.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/command.lst' -> `/mnt/boot/grub/i386-pc/command.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/fs.lst' -> `/mnt/boot/grub/i386-pc/fs.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/partmap.lst' -> `/mnt/boot/grub/i386-pc/partmap.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/parttool.lst' -> `/mnt/boot/grub/i386-pc/parttool.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/video.lst' -> `/mnt/boot/grub/i386-pc/video.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/crypto.lst' -> `/mnt/boot/grub/i386-pc/crypto.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/terminal.lst' -> `/mnt/boot/grub/i386-pc/terminal.lst'.
grub-install: info: copying `/usr/lib/grub/i386-pc/modinfo.sh' -> `/mnt/boot/grub/i386-pc/modinfo.sh'.
grub-install: info: copying `/usr/share/grub/unicode.pf2' -> `/mnt/boot/grub/fonts/unicode.pf2'.
grub-install: info: grub-mkimage --directory '/usr/lib/grub/i386-pc' --prefix '(,msdos1)/boot/grub' --output '/mnt/boot/grub/i386-pc/core.img' --dtb '' --format 'i386-pc' --compression 'auto' 'ext2' 'part_msdos' 'biosdisk'
grub-install: info: copying `/usr/lib/grub/i386-pc/boot.img' -> `/mnt/boot/grub/i386-pc/boot.img'

And, for example, since we do not provide a prebuilt core.img, it needs to be generated somewhere first before applying it to the MBR.

grub-install only takes one device at a time. So I'm not sure what to do about updating modules if multiple devices need MBR updates, all exist, yet some of them fail to apply the MBR. I feel like erring on the optimistic side: if at least one MBR update was successful, update the modules.

What can be achieved with a very small amount of code is to separate bootcode updates from the /boot/ module updates, and to attempt to install modules only if any bootcode update is successful.

Changed in grub2 (Ubuntu Groovy):
status: Confirmed → In Progress
assignee: nobody → Dimitri John Ledkov (xnox)
description: updated
Dimitri John Ledkov (xnox) wrote :

I disagree that preinst changes are needed.

I have fixed error reporting in the non-interactive postinst case.

And made grub-install resilient when operating against non-existent devices or devices that refuse writes. In all such cases, a backup is created and restored. Thus there is no need to abort at preinst. Consistency between the MBR & /boot is required and is fixed here, without reimplementing postinst in preinst.
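
The backup-and-restore idea can be pictured with a small wrapper like the one below. This is a sketch of the general technique only, not the cherry-picked grub-install patch; run_with_rollback is a hypothetical name, and it relies on cp -a and find -delete as found on an Ubuntu system.

```shell
# Hypothetical wrapper: snapshot directory $1, run the remaining arguments as
# a command, and put the snapshot back if that command fails.
run_with_rollback() (
    dir=$1
    shift
    backup=$(mktemp -d) || exit 1
    cp -a "$dir/." "$backup/" || exit 1
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Discard whatever the failed command left behind, then restore.
        find "$dir" -mindepth 1 -delete
        cp -a "$backup/." "$dir/"
    fi
    rm -rf "$backup"
    exit "$status"
)
```

The wrapper returns the command's own exit status, so a failed install still surfaces as a failure to the caller while the directory contents stay as they were.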

tags: added: id-5f36bab45785997ba0092e8a
Steve Langasek (vorlon) on 2020-09-02
Changed in grub2 (Ubuntu Groovy):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.04-1ubuntu29

---------------
grub2 (2.04-1ubuntu29) groovy; urgency=medium

  * grub-install: cherry-pick patch from grub-devel to make grub-install
    fault tolerant. Create backup of files in /boot/grub, and restore them
    on failure to complete grub-install. LP: #1891680
  * postinst.in: do not exit successfully when failing to show critical
    grub-pc/install_devices_failed and grub-pc/install_devices_empty
    prompts in non-interactive mode. This enables surfacing upgrade errors
    to the users and/or automation. LP: #1891680
  * postinst.in: Fixup postinst.in, to attempt grub-install upon explicit
    dpkg-reconfigure grub-pc. LP: #1892526

 -- Dimitri John Ledkov <email address hidden> Tue, 01 Sep 2020 20:04:44 +0100

Changed in grub2 (Ubuntu Groovy):
status: Fix Committed → Fix Released
Andrei Shevchuk (shevchuk) wrote :

Will this fix be backported to Focal? This is the only bug blocking upgrades from 18.04 to 20.04, afaik.

Tero Gusto (tero-gusto) wrote :

Yes, it looks that way:

"Focal Fossa (20.04.1 LTS) Point-Release Status Tracking"
https://discourse.ubuntu.com/t/focal-fossa-20-04-1-lts-point-release-status-tracking/17604

Changed in grub2 (Ubuntu Focal):
status: Confirmed → In Progress
ded (ded-launchpad) wrote :

Is there an ETA when we can finally do-release-upgrade from 18.04 to 20.04?

costinel (costinel) wrote :

@ded, the bug says for groovy
Started work: 2020-08-18
Completed: 2020-09-02

for focal,
Started work: 2020-09-08

so one can only guess it would take another two weeks, if the same work conditions apply

Hello Steve, or anyone else affected,

Accepted grub2 into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.04-1ubuntu26.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in grub2 (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
costinel (costinel) wrote :

@vorlon,
some of us here are innocent 18.04LTS users waiting for canonical to enable do-release-upgrade to focal, because this issue is blocking the LTS to LTS upgrade process (link in comment #9)

with that in mind, how can we help testing without access to focal?

On Wed, Sep 09, 2020 at 08:51:49PM -0000, costinel wrote:
> some of us here are innocent 18.04LTS users waiting for canonical to
> enable do-release-upgrade to focal, because this issue is blocking the LTS
> to LTS upgrade process (link in comment #9)

There is a test case in the description that explains how this SRU will be
verified.

If you are not sure how to access focal, it is best not to attempt to test.
The testing will be handled by the development team as a matter of course,
and the packages released when they are ready for general consumption.

All autopkgtests for the newly accepted grub2 (2.04-1ubuntu26.4) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

grubzfs-testsuite/0.4.10 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#grub2

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Dimitri John Ledkov (xnox) wrote :

Installed from 18.04.4 media, in bios mode.

Added and loaded apparmor profile.

Enabled focal, focal-updates, focal-proposed archives.

Exported DEBIAN_FRONTEND=noninteractive to ensure non-interactive updates.

Started upgrade of grub-pc grub-pc-bin grub2-common grub-common.

Configuration of grub-pc 2.04-1ubuntu26.4 failed (GOOD); the package is in the dpkg iF state.

in /boot/grub git diff shows only one change => fonts/unicode.pf2 has changed. (not ideal, but nothing else important has changed, i.e. all modules are the same and modinfo.sh still says 2.02-2ubuntu8.18 as the package version) (GOOD)

executed
echo grub-pc grub-pc/install_devices multiselect /dev/sda | debconf-set-selections
to set install_devices to invalid /dev/sda, instead of the correct /dev/vda for the next test case. And removed the apparmor profile.

Reboot. Still worked (good)

exported DEBIAN_FRONTEND=noninteractive, and attempted to configure all the packages with dpkg --configure -a

configuration/installing onto /dev/sda failed (GOOD). No further changes in /boot (GOOD). Reboot works. (GOOD)

Finally, performing dpkg --configure -a worked: I got asked which drive to update, selected the correct /dev/vda, and all modules got updated in the /boot/grub dir. Reboot works. (GOOD)

Just to be sure, executed dpkg-reconfigure grub-pc and all questions got asked and grub got installed onto /dev/vda again. (GOOD)

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Louis Fall (yoshi2602) on 2020-09-15
Changed in grub2 (Ubuntu Focal):
status: Fix Committed → Fix Released
Louis Fall (yoshi2602) wrote :

Sorry, I didn't mean to change the status, and I cannot revert the change.

Changed in grub2 (Ubuntu Focal):
status: Fix Released → Fix Committed

Hi, Louis.

You made me happy for a few minutes.

Regards.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.04-1ubuntu26.4

---------------
grub2 (2.04-1ubuntu26.4) focal; urgency=medium

  * grub-install: cherry-pick patch from grub-devel to make grub-install
    fault tolerant. Create backup of files in /boot/grub, and restore them
    on failure to complete grub-install. LP: #1891680
  * postinst.in: do not exit successfully when failing to show critical
    grub-pc/install_devices_failed and grub-pc/install_devices_empty
    prompts in non-interactive mode. This enables surfacing upgrade errors
    to the users and/or automation. LP: #1891680
  * postinst.in: do not attempt to call grub-install upon fresh install of
    grub-pc because it is a job of installers to do that after fresh
    install. Fixup for the issue unmasked by above. LP: #1891680
  * grub-multi-install: fix non-interactive failures for grub-efi like it
    was fixed in postinst for grub-pc. LP: #1891680
  * postinst.in: Fixup postinst.in, to attempt grub-install upon explicit
    dpkg-reconfigure grub-pc. LP: #1892526

 -- Dimitri John Ledkov <email address hidden> Tue, 08 Sep 2020 11:24:35 +0100

Changed in grub2 (Ubuntu Focal):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for grub2 has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Saurabh Minni (the100rabh) wrote :

Does this not need to be backported all the way to Xenial and Bionic ?

costinel (costinel) wrote :

ok, so now that this is fixed, why is ubuntu 18.04 to 20.04.1 still blocked?

Crashbit (crashbit-gmail) wrote :

As you can see, Bionic status is Confirmed, not fix released!

Hadmut Danisch (hadmut) wrote :

So for upgrading from release X to Y both X and Y must not have critical bugs?

On September 20, 2020 5:09:37 AM PDT, Hadmut Danisch <email address hidden> wrote:
>So for upgrading from release X to Y both X and Y must not have
>critical bugs?

Nothing of the sort. This bug is no longer a blocker now that it's fixed in focal, and I believe the intent is to turn on upgrades this coming week.

--
Steve Langasek

Chris (iversc) wrote :

Since this is the final blocker, I think they're just leaving that status alone until they actually publish 20.04.1 for upgrade, so they don't get a deluge of people asking why it's not released yet if there's no blockers.

Probably running final tests on the upgrade path.

Brian Murray (brian-murray) wrote :

On Fri, Sep 25, 2020 at 05:57:15PM -0000, Chris wrote:
> Since this is the final blocker, I think they're just leaving that
> status alone until they actually publish 20.04.1 for upgrade, so they
> don't get a deluge of people asking why it's not released yet if there's
> no blockers.
>
> Probably running final tests on the upgrade path.

That is correct, our intent is to turn on upgrades next week.

--
Brian Murray

BDisp (bdisp) wrote :

The strange thing is that Ubuntu 20.04 LTS is already available from the Windows 10 store.
Is it possible to install this version and transfer all the configuration, applications and repositories from Ubuntu 18.04.5 LTS to 20.04 LTS?

Roger Peris (rugeps) wrote :

Upgrade (from v18.04 to 20.04) is already released!
Thanks to everybody for your debugging and support!

BDisp (bdisp) wrote :

I was unable to upgrade and this link (https://bugs.launchpad.net/ubuntu/+source/ubuntu-release-upgrader/+bug/1769446) solved the problem.

regenpfeifer (regenpfeifer) wrote :

Unfortunately, this bug does not seem to be fixed for me.

I am using a remote server at 1&1 Hosting with a RAID 1 configuration and logical volumes (LVM). I am starting with a fresh image of Ubuntu 18.04 LTS. With do-release-upgrade I can upgrade to Ubuntu 20.04 LTS, but I end up with this error:

Errors were encountered while processing:
 grub-pc
Exception during pm.DoInstall(): E:Sub-process /usr/bin/dpkg returned an error code (1)

After that my system is unusable. It cannot be updated, no packages can be installed. I can reboot, but when I try to fix the error with re-installing grub-pc it becomes unbootable.
