Boot failure with efi shims from 20180913.0

Bug #1792575 reported by Peter Sabaini on 2018-09-14
This bug affects 1 person
Affects                Importance   Assigned to
MAAS                   Undecided    Unassigned
grub2 (Ubuntu)         Undecided    Unassigned
grub2-signed (Ubuntu)  Undecided    Unassigned
shim (Ubuntu)          Undecided    Canonical Foundations Team

Bug Description

Several nodes that were deployed on Sept. 12 and had been booting correctly are now failing to boot.

On the console and during tracing we could see they were getting DHCP and PXE information, but they then errored out with "relocation failed" and dropped into a fallback grub menu with a Local boot option.

After copying over bootx64.efi and grubx64.efi from https://images.maas.io/ephemeral-v3/daily/bootloaders/uefi/amd64/20180906.0/ instead of 20180913.0/ and rebooting, the nodes booted successfully.
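
The workaround above amounts to fetching the two bootloader files from the older stream directory. A minimal sketch that builds the download URLs, assuming both files sit directly under the version directory as the URL in this report suggests. The helper name is hypothetical, and where the files have to be placed on the MAAS server is not stated in this report, so that step is left out.

```python
# Base URL taken from this report; the helper name is illustrative only.
BASE_URL = "https://images.maas.io/ephemeral-v3/daily/bootloaders/uefi/amd64"
BOOTLOADER_FILES = ("bootx64.efi", "grubx64.efi")


def bootloader_urls(version):
    """Return the download URL for each bootloader file in one stream version."""
    return {name: f"{BASE_URL}/{version}/{name}" for name in BOOTLOADER_FILES}


if __name__ == "__main__":
    # 20180906.0 is the version that booted correctly in this report.
    for name, url in bootloader_urls("20180906.0").items():
        print(name, url)
```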

Hardware: Dell R640
maas 2.3.5-6511-gf466fdb-0ubuntu1~16.04.1

Peter Sabaini (peter-sabaini) wrote :

Subscribed field-critical

description: updated
Peter Sabaini (peter-sabaini) wrote :

Ftr., same effect on Dell R730

James Troup (elmo) wrote :

https://bugzilla.redhat.com/show_bug.cgi?id=1347291 may or may not be relevant.

I've visually verified the patch in https://bugzilla.redhat.com/attachment.cgi?id=1222471&action=diff would apply to 16.04's grub2 source code.

Andres Rodriguez (andreserl) wrote :

Just to provide some more context

1. Machines were deployed with Xenial and bootloaders from image streams version 20180906, which was using an older shim (versions as per "a" below).

2. The shim has been updated in the Ubuntu archive for cosmic and bionic, but for xenial it is still in -proposed (new version as per "b" below).

3. MAAS automatically updated to newer bootloaders in 20180913 streams, using the latest from bionic-proposed (see "b" below for details)

4. The machine was rebooted, and upon reboot the boot failed because the new bootloaders were unable to chain into the installed system's files. The installed system (Xenial) had an older version of the shim, whereas MAAS had a newer version of the shim (from Bionic).

Looking at the MAAS streams that provide the bootloaders [1] I can see that:

 a) 20180906.0 is using bionic's grub2-signed (1.93.5+2.02-2ubuntu8.4) and shim-signed (1.34.9.2+13-0ubuntu2)
 b) 20180913.0 is using bionic's grub2-signed (1.93.5+2.02-2ubuntu8.4) but it is using a different shim-signed (1.37~18.04.1+15+1533136590.3beb971-0ubuntu1)

[1]: https://images.maas.io/ephemeral-v3/daily/streams/v1/com.ubuntu.maas:daily:1:bootloader-download.json
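
To make the difference between (a) and (b) explicit, here is a small sketch that compares the two sets of package versions. The version strings are copied from the two items above; the dictionary layout and the function name are assumptions for illustration and do not reflect the actual format of the streams JSON.

```python
# Package versions copied from items (a) and (b) above.
STREAMS = {
    "20180906.0": {
        "grub2-signed": "1.93.5+2.02-2ubuntu8.4",
        "shim-signed": "1.34.9.2+13-0ubuntu2",
    },
    "20180913.0": {
        "grub2-signed": "1.93.5+2.02-2ubuntu8.4",
        "shim-signed": "1.37~18.04.1+15+1533136590.3beb971-0ubuntu1",
    },
}


def changed_packages(old, new):
    """Return {package: (old_version, new_version)} where the versions differ."""
    return {
        pkg: (old.get(pkg), new.get(pkg))
        for pkg in sorted(set(old) | set(new))
        if old.get(pkg) != new.get(pkg)
    }


if __name__ == "__main__":
    diff = changed_packages(STREAMS["20180906.0"], STREAMS["20180913.0"])
    for pkg, (before, after) in diff.items():
        print(f"{pkg}: {before} -> {after}")
```

With the versions listed here, only shim-signed differs between the two streams; grub2-signed is identical.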

That said, I would have thought that:

1. A newer shim/grub2-signed should be able to chainload into an older shim/grub2-signed?

I'm adding a task for the 'shim' package to get this question answered and find out whether a shim update should be able to chain into an older shim version (or vice versa).

So, the patch would apply, but you're not seeing the error messages that I expect you should be seeing if that was the problem. We can certainly try to apply the patch (I will make it available in a PPA), but I'm not sure that will help.

For one thing, you are using the same grub in both cases, and the errors you are seeing are post-shim (they are grub errors, but I'm not sure if they occur at the point where it tries to load the grub from disk or the kernel).

What versions of grub2 and grub2-signed are installed on the affected system (the client being deployed)?

The package versions we need are for grub-efi-amd64 and grub-efi-amd64-signed.

Peter Sabaini (peter-sabaini) wrote :

Hi, that would be those:

$ dpkg -l | grep grub-efi
ii grub-efi-amd64 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 binaries)
ii grub-efi-amd64-signed 1.66.18+2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)
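
For collecting the same information from several affected nodes, a minimal sketch that extracts package names and versions from `dpkg -l` output like the above. It assumes the usual whitespace-separated layout (status, name, version, architecture, description); the function name is hypothetical.

```python
def parse_dpkg_lines(text):
    """Map package name to version for each installed ('ii') line of `dpkg -l` output."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: status, name, version, architecture, description...
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


# Sample input copied from the output in this comment.
SAMPLE = """\
ii grub-efi-amd64 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 binaries)
ii grub-efi-amd64-signed 1.66.18+2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)
"""

if __name__ == "__main__":
    for name, version in parse_dpkg_lines(SAMPLE).items():
        print(f"{name}\t{version}")
```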

Andres Rodriguez (andreserl) wrote :

Marking this as invalid for MAAS. Once this issue is fixed and SRU'd to Xenial, it will be rolled out automatically.

In the meantime, the bootloaders have been reverted from the archive, and from the MAAS image repositories.

Changed in maas:
status: New → Invalid
Changed in shim (Ubuntu):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
James Troup (elmo) wrote :

Since the shim change has been reverted, I'm unsubscribing field-critical.

tags: added: id-5b9c1dc8ce84408c14c66e7e
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2-signed - 1.106

---------------
grub2-signed (1.106) cosmic; urgency=medium

  * Rebuild against grub2 2.02+dfsg1-5ubuntu4. (LP: #1792575)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 17 Sep 2018 08:47:45 -0400

Changed in grub2-signed (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02+dfsg1-5ubuntu4

---------------
grub2 (2.02+dfsg1-5ubuntu4) cosmic; urgency=medium

  * debian/patches/linuxefi_fix_relocate_coff.patch: fix typo in
    relocate_coff() causing issues with relocation of code in chainload.
    (LP: #1792575)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 17 Sep 2018 07:45:49 -0400

Changed in grub2 (Ubuntu):
status: New → Fix Released