[regression-in-stable] grub-multi-install fails with exit 1 when question grub-efi/install_devices_empty is skipped

Bug #1896608 reported by Michael Vogt on 2020-09-22
This bug affects 18 people
Affects: grub2 (Ubuntu), status tracked in Groovy
 * Bionic: Importance Undecided, Unassigned
 * Focal: Importance Undecided, Unassigned
 * Groovy: Importance Undecided, Unassigned

Bug Description

[Impact]

 * When both grub-pc and grub-efi-*-signed are installed, it is fine for one of them not to install onto any device and to mark install_devices_empty as true, so that the machine boots with just one boot loader type rather than both.
 * In such cases, the recently SRUed postinst behaves incorrectly on reconfigure: it demands that the user install both types of bootloader somewhere.
 * The error can only be triggered by an abandoned grub-pc install, as grub-multi-install validates install_devices values and resets configuration logic.

[Test Case]

 * Install in efi mode. Ensure that grub-pc is installed, configure debconf to have seen grub-pc/install_devices_empty question and have it as true, and have grub-pc/install_devices seen and empty.

 * Attempt reconfigure of grub-pc, it should be successful.

All this can be done in chroot as follows:

mkdir -p /boot/grub/i386-pc
touch /boot/grub/i386-pc/core.img
dpkg-reconfigure grub-pc
(answer defaults, and choose to not install GRUB to any devices)

Then:
dpkg-reconfigure -u grub-pc
should succeed, with exit code 0, once this bug is fixed. The current broken packages return exit code 1.

 * Populate grub-efi/install_devices & grub-pc/install_devices with invalid devices and set install_devices_empty to false, attempt non-interactive reconfigure. It should still fail. (A check that previous bug-fix does not regress).
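
The debconf state described in the first test step could also be preseeded instead of answered interactively. A minimal sketch, assuming the grub-pc templates declare install_devices as a multiselect and install_devices_empty as a boolean; the function name grub_pc_preseed is hypothetical, and it only prints the selections so they can be inspected before use:

```shell
#!/bin/sh
# Hypothetical helper: print debconf selections describing
# "grub-pc is installed, but not onto any device".
# Line format: <owner> <question> <type> [<value>]
# The template types used here are an assumption; check the installed templates.
grub_pc_preseed() {
  printf '%s\n' \
    'grub-pc grub-pc/install_devices multiselect' \
    'grub-pc grub-pc/install_devices_empty boolean true'
}
```

Piping the output into debconf-set-selections as root (grub_pc_preseed | debconf-set-selections) should store both answers and, unless --unseen is given, mark them as seen.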

[Regression Potential]

 * This partially reverts the previous update, which made that code path error out. One too many error paths got added by xnox. The other bug fix, which ensures that package configuration fails when installing onto missing devices noninteractively, is still being checked for (test #3).

[Other Info]

 * original bug report

A 20.04 machine with grub-efi-amd64-signed that recently got upgraded fails with:
"""
dpkg: error processing package grub-efi-amd64-signed (--configure):
 installed grub-efi-amd64-signed package post-installation script subprocess returned error exit status 1
"""

Unfortunately, not much more context was provided. After some debugging, it turns out that /usr/lib/grub/grub-multi-install contains the following lines:

"""
  db_get "$question"
  if [ -z "$RET" ]; then
    # Reset the seen flag if the current answer is false, since
    # otherwise we'll loop with no indication of why.
    db_get grub-efi/install_devices_empty
    if [ "$RET" = false ]; then
      db_fset grub-efi/install_devices_empty seen false
    fi
    if db_input critical grub-efi/install_devices_empty; then
      db_go
      db_get grub-efi/install_devices_empty
      if [ "$RET" = true ]; then
        break
      else
        db_fset "$question" seen false
        db_fset grub-efi/install_devices_empty seen false
      fi
    else
      exit 1 # noninteractive
    fi
"""
When grub-efi/install_devices_empty has the "seen" flag set, the line

    if db_input critical grub-efi/install_devices_empty; then

will return "30" if the question was already shown, but the reset of the seen flag happens only if the answer was "false" earlier. So the code jumps to "exit 1".

So either the code needs to always reset the seen flag (db_fset grub-efi/install_devices_empty seen false) or it needs to deal with exit code 30 from "db_input critical grub-efi/install_devices_empty" more gracefully.

I can provide a /var/cache/debconf/config.dat to reproduce. I'm not entirely sure how to reproduce from a clean install, the bug will only happen after the first upgrade.
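
The second option (treating return code 30 as "question skipped" instead of as a failure) could look roughly like the sketch below. This is an illustration only, not the fix that was uploaded. The db_* functions here are stubs standing in for the real ones from /usr/share/debconf/confmodule, and confirm_empty_install, STUB_ANSWER and STUB_INPUT_RC are hypothetical names.

```shell
#!/bin/sh
# Stubs so the sketch runs without a debconf frontend (assumptions, not debconf itself).
STUB_ANSWER=true     # stored answer for grub-efi/install_devices_empty
STUB_INPUT_RC=30     # 30 = question skipped (already seen, or noninteractive)
db_get()   { RET="$STUB_ANSWER"; }
db_input() { return "$STUB_INPUT_RC"; }
db_go()    { return 0; }

# Succeeds when "install onto no device" is the accepted answer.
# A skipped question (rc 30) is not an error by itself: the stored
# answer decides. Any other non-zero rc is passed through.
confirm_empty_install() {
  rc=0
  db_input critical grub-efi/install_devices_empty || rc=$?
  case "$rc" in
    0)  db_go ;;            # question queued: show it
    30) ;;                  # question skipped: fall back to the stored answer
    *)  return "$rc" ;;     # genuine error
  esac
  db_get grub-efi/install_devices_empty
  [ "$RET" = true ]
}
```

With the stored answer "true" and the question skipped, the function succeeds; with the stored answer "false" and the question skipped, it fails, which keeps the noninteractive failure for the case where no valid choice exists.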


Michael Vogt (mvo) wrote :

Some debug lines from setting "sh -x" inside the grub-multi-install https://paste.ubuntu.com/p/9CBGGvJjz9/

and the debconf db grub entries: https://paste.ubuntu.com/p/j2DwtQHzSb/

I think the only relevant part here is that grub-efi/install_devices_empty was already seen *and* is set to "True" as this leads to the issue.

Michael Vogt (mvo) wrote :

Looking at the diff in http://launchpadlibrarian.net/497631450/grub2_2.04-1ubuntu26_2.04-1ubuntu26.4.diff.gz I see that this appears to be a new change during the SRU cycle, specifically:

"""
diff -Nru grub2-2.04/debian/grub-multi-install grub2-2.04/debian/grub-multi-install
--- grub2-2.04/debian/grub-multi-install 2020-04-15 11:31:27.000000000 +0000
+++ grub2-2.04/debian/grub-multi-install 2020-09-08 10:23:00.000000000 +0000
@@ -8,6 +8,13 @@

 . /usr/share/debconf/confmodule

+# shamelessly stolen from ucf:
+#
+# Load our templates, just in case our template has
+# not been loaded or the Debconf DB lost or corrupted
+# since then.
+db_x_loadtemplatefile "$(dpkg-query --control-path grub-common templates)" grub-common
+
 ###############################################################################
 # COPY FROM POSTINST
 ###############################################################################
@@ -373,7 +380,7 @@
         continue
       fi
     else
- break # noninteractive
+ exit 1 # noninteractive
     fi
   fi

@@ -395,7 +402,7 @@
         db_fset grub-efi/install_devices_empty seen false
       fi
     else
- break # noninteractive
+ exit 1 # noninteractive
     fi
   else
     break
"""

The new "exit 1" will break stable users who have already answered the "grub-efi/install_devices_empty" question with "yes": for them the question is skipped (as it was already asked), so the if falls into the "exit 1" branch and the postinst fails.

summary: - grub-multi-install fails with exit 1 when question grub-
- efi/install_devices_empty is skipped
+ [regression-in-stable] grub-multi-install fails with exit 1 when
+ question grub-efi/install_devices_empty is skipped
Michael Vogt (mvo) wrote :

One more idea: maybe if we really need to exit, then instead of two "exit 1" (which is super generic) have "exit 22" and "exit 33" (or similar). And maybe a message to stderr. I think this would help the automatic errortracker reports too.
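
A rough sketch of that idea, assuming a small helper in the script; the name fail, the message prefix and the mapping of codes to branches are all hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: say on stderr why we are bailing out, then exit
# with a code that identifies the failing branch.
fail() {
  code="$1"
  shift
  echo "grub-multi-install: $*" >&2
  exit "$code"
}

# Possible call sites (which code goes with which branch is an assumption):
#   fail 22 "no install devices selected and the question could not be asked"
#   fail 33 "install_devices_empty declined and the question could not be asked"
```

Distinct codes plus a one-line message would let a log or an error report show which of the two branches was taken.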

Łukasz Zemczak (sil2100) wrote :

We'll make sure this gets looked at and prioritize accordingly. Thank you for reporting it!

tags: added: regression-update rls-ff-incoming
Dimitri John Ledkov (xnox) wrote :

Name: grub-efi/install_devices
Template: grub-efi/install_devices
Value:
Owners: grub-common, grub-efi-amd64
Flags: seen
Variables:
 CHOICES =
 RAW_CHOICES =

Name: grub-efi/install_devices_empty
Template: grub-efi/install_devices_empty
Value: true
Owners: grub-common, grub-efi-amd64
Flags: seen

question is grub-efi/install_devices

  db_get "$question"
  if [ -z "$RET" ]; then
    # Reset the seen flag if the current answer is false, since
    # otherwise we'll loop with no indication of why.
    db_get grub-efi/install_devices_empty
    if [ "$RET" = false ]; then
      db_fset grub-efi/install_devices_empty seen false
    fi
    if db_input critical grub-efi/install_devices_empty; then
      db_go
      db_get grub-efi/install_devices_empty
      if [ "$RET" = true ]; then
        break
      else
        db_fset "$question" seen false
        db_fset grub-efi/install_devices_empty seen false
      fi
    else
      exit 1 # noninteractive
    fi
  else
    break
  fi

So, if install_devices is empty and install_devices_empty is true and seen, asking grub-efi/install_devices_empty will be skipped, and that shouldn't error out.

I wonder if I caused this mistake in both postinst and grub-multi-install.

description: updated
Dimitri John Ledkov (xnox) wrote :

Hm, so before calling grub_multi_install we check if we have been installed into target as efi with

        if test -e /boot/grub/$target/core.efi; then
          db_get grub2/no_efi_extra_removable
          if [ "$RET" = true ]; then
            NO_EXTRA_REMOVABLE="--no-extra-removable"
          fi
          NO_NVRAM="$(no_nvram_arg)"
          run_grub_multi_install --target="$target" "$NO_EXTRA_REMOVABLE" "$NO_NVRAM"
        fi

So it would seem that /boot/grub/x86_64-efi/core.efi exists, yet the user chose not to install grub-efi.

Dimitri John Ledkov (xnox) wrote :

We do a similar check for grub-pc, checking for /core.img to be present.
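
Both presence checks can be tried outside the postinst with a sketch like the following. It assumes core.efi under the *-efi target directories and core.img under i386-pc, as in the test case; the function name grub_image_present and the root prefix argument are hypothetical and exist only so the check can be pointed at a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: does <root>/boot/grub/<target>/ hold a core image?
# Returns 0 if present, 1 if absent, 2 for a target this sketch does not know.
grub_image_present() {
  root="$1"
  target="$2"
  case "$target" in
    i386-pc) image=core.img ;;
    *-efi)   image=core.efi ;;
    *)       return 2 ;;
  esac
  test -e "$root/boot/grub/$target/$image"
}
```

For example, after mkdir -p and touch of boot/grub/i386-pc/core.img under a temporary directory, grub_image_present "$tmpdir" i386-pc succeeds while grub_image_present "$tmpdir" x86_64-efi does not.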

description: updated
tags: removed: rls-ff-incoming
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu Bionic):
status: New → Confirmed
Changed in grub2 (Ubuntu Focal):
status: New → Confirmed
Changed in grub2 (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.04-1ubuntu35

---------------
grub2 (2.04-1ubuntu35) groovy; urgency=medium

  * postinst.in, grub-multi-install: fix logic of skipping installing onto
    any device, if one chose to not install bootloader on any device. LP:
    #1896608
  * Do not finalize params twice on arm64. LP: #1897819

 -- Dimitri John Ledkov <email address hidden> Thu, 01 Oct 2020 22:59:51 +0800

Changed in grub2 (Ubuntu Groovy):
status: Confirmed → Fix Released

Hello Michael, or anyone else affected,

Accepted grub2 into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.04-1ubuntu26.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in grub2 (Ubuntu Focal):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-focal
costinel (costinel) wrote :

I have upgraded the packages from the link provided (manual download, since they're not yet accepted into focal-proposed) and the grub-pc postinst script exits cleanly now. Thanks.

Scott Talbert (swt-techie) wrote :

The build page for this update for focal seems to indicate that the packages for amd64 and arm64 are in an UNAPPROVED queue. Does something need to be triggered there?

https://launchpad.net/ubuntu/+source/grub2/2.04-1ubuntu26.6/+build/20118987

Scott Talbert (swt-techie) wrote :

Never mind my last comment - that issue with the UNAPPROVED builds seems to have been resolved.

The 2.04-1ubuntu26.6 packages from focal-proposed seem to have resolved the issue for me as well.

All autopkgtests for the newly accepted grub2 (2.04-1ubuntu26.6) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

ubuntu-image/1.10+20.04ubuntu1 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#grub2

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

comment 13 suggests the fix works.

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal