Boot failure with efi shims from 20180913.0

Bug #1792575 reported by Peter Sabaini on 2018-09-14
This bug affects 3 people
Affects                  Status  Importance  Assigned to  Milestone
MAAS                             Undecided   Unassigned
grub2 (Ubuntu)                   Undecided   Unassigned
  Xenial                         Undecided   Unassigned
  Bionic                         Undecided   Unassigned
grub2-signed (Ubuntu)            Undecided   Unassigned
  Xenial                         Undecided   Unassigned
  Bionic                         Undecided   Unassigned
shim (Ubuntu)                    Undecided   Unassigned
  Xenial                         Undecided   Unassigned
  Bionic                         Undecided   Unassigned
shim-signed (Ubuntu)             Undecided   Unassigned
  Xenial                         Undecided   Unassigned
  Bionic                         Undecided   Unassigned

Bug Description

[Impact]
Affects chainloading grub via grub in a netboot context, as done by MAAS's Boot to local disk feature.

[Test cases]
1) Deploy UEFI system using MAAS
2) After deployment, have the system reboot to local disk (via netboot).

[Regression potential]
It is possible that the changes to the chainloading logic, which evaluates the sizes of the various sections of code that get copied to memory to load the next bootloader, might fail to evaluate the sections correctly or otherwise copy sections incorrectly. However, this regression scenario is indistinguishable from the current case, where the system fails to load the next bootloader anyway. Error messages may vary, but the net result of a regression would be an incorrectly loaded bootloader, and thus error messages from grub at boot.

---

Several nodes that had been deployed on Sept. 12 and were booting correctly have since failed to boot.

On the console and during tracing we could see they were getting DHCP and PXE information, but they then errored out with "relocation failed", dropping into a fallback grub menu with a Local boot option.

After copying over bootx64.efi and grubx64.efi from https://images.maas.io/ephemeral-v3/daily/bootloaders/uefi/amd64/20180906.0/ instead of 20180913.0/ and rebooting, boot would commence successfully.

Hardware: Dell R640
maas 2.3.5-6511-gf466fdb-0ubuntu1~16.04.1

Peter Sabaini (peter-sabaini) wrote :

Subscribed field-critical

description: updated
Peter Sabaini (peter-sabaini) wrote :

For the record, the same effect occurs on a Dell R730.

James Troup (elmo) wrote :

https://bugzilla.redhat.com/show_bug.cgi?id=1347291 may or may not be relevant.

I've visually verified the patch in https://bugzilla.redhat.com/attachment.cgi?id=1222471&action=diff would apply to 16.04's grub2 source code.

Andres Rodriguez (andreserl) wrote :

Just to provide some more context

1. Machines were deployed with Xenial and bootloaders from image streams version 20180906, which was using an older shim (versions as per "a" below)

2. The shim has been updated in the Ubuntu archive for cosmic and bionic, but for xenial it is still in -proposed (new version as per "b" below).

3. MAAS automatically updated to newer bootloaders in 20180913 streams, using the latest from bionic-proposed (see "b" below for details)

4. The machine was rebooted; upon reboot, the new bootloaders were unable to chain into the installed system's files. The installed system (Xenial) had an older version of the shim, whereas MAAS had a newer version of the shim (from Bionic).

Looking at the MAAS streams that provide the bootloaders [1], I can see that:

 a) 20180906.0 is using bionic's grub2-signed (1.93.5+2.02-2ubuntu8.4) and shim-signed (1.34.9.2+13-0ubuntu2)
 b) 20180913.0 is using bionic's grub2-signed (1.93.5+2.02-2ubuntu8.4) but it is using a different shim-signed (1.37~18.04.1+15+1533136590.3beb971-0ubuntu1)

[1]: https://images.maas.io/ephemeral-v3/daily/streams/v1/com.ubuntu.maas:daily:1:bootloader-download.json
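
The comparison in (a) and (b) can be repeated for any stream version with a few lines of Python. This is only a sketch: it assumes the stream file follows the usual simplestreams layout (products, then versions, then items), and the `src_package` / `src_version` field names and the product key in the sample are assumptions, not something confirmed in this report. The sample data is hand-written from the versions quoted above rather than fetched from [1].

```python
def package_versions_by_stream(products_doc):
    """Map each stream version (e.g. "20180913.0") to {package: version}.

    Assumes a simplestreams-style document; the field names used here
    (src_package, src_version) are assumptions about the JSON layout.
    """
    result = {}
    for product in products_doc.get("products", {}).values():
        for stream_version, entry in product.get("versions", {}).items():
            packages = result.setdefault(stream_version, {})
            for item in entry.get("items", {}).values():
                name = item.get("src_package")
                if name:
                    packages[name] = item.get("src_version")
    return result


# Hand-written sample mirroring points (a) and (b); the product key is hypothetical.
SAMPLE = {
    "products": {
        "hypothetical:uefi:amd64": {
            "versions": {
                "20180906.0": {"items": {
                    "grub2-signed": {"src_package": "grub2-signed",
                                     "src_version": "1.93.5+2.02-2ubuntu8.4"},
                    "shim-signed": {"src_package": "shim-signed",
                                    "src_version": "1.34.9.2+13-0ubuntu2"},
                }},
                "20180913.0": {"items": {
                    "grub2-signed": {"src_package": "grub2-signed",
                                     "src_version": "1.93.5+2.02-2ubuntu8.4"},
                    "shim-signed": {"src_package": "shim-signed",
                                    "src_version": "1.37~18.04.1+15+1533136590.3beb971-0ubuntu1"},
                }},
            }
        }
    }
}

for stream, packages in sorted(package_versions_by_stream(SAMPLE).items()):
    print(stream, packages)
```

To run it against the real stream, the JSON from [1] would be loaded with `json.load` and passed in place of `SAMPLE`, after checking that the field names match.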

That said, I would have thought that:

1. a newer shim/grub2-signed should be able to chainload into an older shim/grub2-signed?

I'm adding a task for the 'shim' package to get this question answered and find out whether an updated shim should be able to chain into an older shim version (or vice versa).

So, the patch would apply, but you're not seeing the error messages that I expect you should be seeing if that was the problem. We can certainly try to apply the patch (I will make it available in a PPA), but I'm not sure that will help.

For one thing, you are using the same grub in both cases, and the errors you are seeing are post-shim (they are grub errors, but I'm not sure whether they occur at the point where it tries to load the grub from disk or the kernel).

What versions of grub2 and grub2-signed are installed on the affected system (the client being deployed)?

The package versions we need are for grub-efi-amd64 and grub-efi-amd64-signed.

Peter Sabaini (peter-sabaini) wrote :

Hi, that would be those:

$ dpkg -l | grep grub-efi
ii grub-efi-amd64 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 binaries)
ii grub-efi-amd64-signed 1.66.18+2.02~beta2-36ubuntu3.18 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)

Andres Rodriguez (andreserl) wrote :

Marking this as invalid for MAAS. Once this issue is fixed and SRU'd to Xenial, it will be rolled out automatically.

In the meantime, the bootloaders have been reverted from the archive, and from the MAAS image repositories.

Changed in maas:
status: New → Invalid
Changed in shim (Ubuntu):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
James Troup (elmo) wrote :

Since the shim change has been reverted, I'm unsubscribing field-critical.

tags: added: id-5b9c1dc8ce84408c14c66e7e
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2-signed - 1.106

---------------
grub2-signed (1.106) cosmic; urgency=medium

  * Rebuild against grub2 2.02+dfsg1-5ubuntu4. (LP: #1792575)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 17 Sep 2018 08:47:45 -0400

Changed in grub2-signed (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02+dfsg1-5ubuntu4

---------------
grub2 (2.02+dfsg1-5ubuntu4) cosmic; urgency=medium

  * debian/patches/linuxefi_fix_relocate_coff.patch: fix typo in
    relocate_coff() causing issues with relocation of code in chainload.
    (LP: #1792575)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 17 Sep 2018 07:45:49 -0400

Changed in grub2 (Ubuntu):
status: New → Fix Released
description: updated

Hello Peter, or anyone else affected,

Accepted grub2 into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02-2ubuntu8.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in grub2 (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed verification-needed-bionic
Changed in grub2-signed (Ubuntu Bionic):
status: New → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted grub2-signed into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.93.7 in a few hours, and then in the -proposed repository.

Łukasz Zemczak (sil2100) wrote :

Hello Peter, or anyone else affected,

Accepted shim-signed into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/shim-signed/1.37~18.04.2 in a few hours, and then in the -proposed repository.

This does not appear to be a bug in shim at all, closing the shim tasks as Invalid.

Changed in shim (Ubuntu):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
status: New → Invalid
Changed in shim (Ubuntu Xenial):
status: New → Invalid
Changed in shim (Ubuntu Bionic):
status: New → Invalid

Not a bug in shim, but we are adding a Breaks in shim-signed for the purposes of SRUs to avoid people upgrading into a broken state. As such, the task for cosmic is Invalid, but it is in progress / committed for other releases.

Changed in shim-signed (Ubuntu):
status: New → Invalid
Changed in shim-signed (Ubuntu Bionic):
status: New → Fix Committed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu Xenial):
status: New → Confirmed
Changed in grub2-signed (Ubuntu Xenial):
status: New → Confirmed
Changed in shim-signed (Ubuntu Xenial):
status: New → Confirmed

I can't seem to reproduce this problem on a VM (after testing again), so I'll need help to reproduce the bug, and then to validate the fix.

Peter Sabaini (peter-sabaini) wrote :

I could reproduce the original issue and verify that the fix works. Thanks!

Lee Trager (ltrager) wrote :

I'm also unable to verify the fix using the MAAS CI. However, I did confirm the new shim works.

Peter checked (I asked him, since they have hardware that clearly exhibits the issue) and it looks like this is a verification-done:

grub2 2.02-2ubuntu8.6
grub2-signed 1.93.7

I still have to do a quick check with Windows 10 to make sure that aspect of the fix also works correctly, and that shim-signed has the right Breaks.

Looks good to me here; I was able to have Windows 10 chainload from grub2 2.02-2ubuntu8.6 / grub2-signed 1.93.7.

tags: added: verification-done-bionic
removed: verification-needed verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package shim-signed - 1.37~18.04.2

---------------
shim-signed (1.37~18.04.2) bionic; urgency=medium

  * debian/control: add Breaks: grub-efi-amd64-signed (<< 1.93.7), as the new
    version of shim exercises a bug in relocation code for chainload that was
    fixed in that upload of grub, affecting Windows 7, Windows 10, and some
    netboot scenarios where chainloading is required. (LP: #1792575)

shim-signed (1.37~18.04.1) bionic; urgency=medium

  * Backport shim-signed 1.37 to Ubuntu 18.04. (LP: #1790724)

shim-signed (1.37) cosmic; urgency=medium

  * Update to the signed 15+1533136590.3beb971-0ubuntu1 binary from Microsoft.
  * debian/real-po: replace debian/po to make sure things are translatable
    via Launchpad.

 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 28 Sep 2018 11:02:56 -0400
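
What the `Breaks: grub-efi-amd64-signed (<< 1.93.7)` relationship means for a given installed version can be checked with a small sketch. The code below is a simplified reimplementation of Debian version ordering written for illustration only; on a real system `dpkg --compare-versions` is the authoritative tool. The function names are hypothetical, and the second sample version (1.93.7+2.02-2ubuntu8.6) is an assumption based on the naming pattern of the other grub2-signed binaries in this report.

```python
import re


def _order(ch):
    """Sort key for one non-digit character, per Debian ordering rules."""
    if ch == "~":
        return -1          # '~' sorts before everything, even end of string
    if ch == "":
        return 0           # end of the non-digit run
    if ch.isalpha():
        return ord(ch)     # letters sort before other characters
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings; return -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the leading non-digit runs character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ca = a[i] if i < len(a) and not a[i].isdigit() else ""
            cb = b[j] if j < len(b) and not b[j].isdigit() else ""
            if _order(ca) != _order(cb):
                return -1 if _order(ca) < _order(cb) else 1
            i += 1
            j += 1
        # Compare the following digit runs numerically (empty run counts as 0).
        da = re.match(r"\d*", a[i:]).group()
        db = re.match(r"\d*", b[j:]).group()
        na, nb = int(da or 0), int(db or 0)
        if na != nb:
            return -1 if na < nb else 1
        i += len(da)
        j += len(db)
    return 0


def _split(version):
    """Split [epoch:]upstream[-revision] into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 as version a sorts before, equal to, or after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


def is_broken_by(installed, bound="1.93.7"):
    """True if `installed` falls under a `Breaks: pkg (<< bound)` clause."""
    return compare_versions(installed, bound) < 0


for version in ("1.93.5+2.02-2ubuntu8.4", "1.93.7+2.02-2ubuntu8.6"):
    print(version, "broken" if is_broken_by(version) else "ok")
```

With this ordering, the bionic grub2-signed that shipped in both stream versions (1.93.5+2.02-2ubuntu8.4) falls under the Breaks clause, so apt would not leave it installed next to the new shim-signed.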

Changed in shim-signed (Ubuntu Bionic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for shim-signed has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2-signed - 1.93.7

---------------
grub2-signed (1.93.7) bionic; urgency=medium

  * Rebuild against grub2 2.02-2ubuntu8.6 (LP: #1792575)

grub2-signed (1.93.6) bionic; urgency=medium

  * Rebuild against grub2 2.02-2ubuntu8.5 (LP: #788298)

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 27 Sep 2018 13:07:26 -0400

Changed in grub2-signed (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02-2ubuntu8.6

---------------
grub2 (2.02-2ubuntu8.6) bionic; urgency=medium

  * debian/patches/linuxefi_fix_relocate_coff.patch: fix typo in
    relocate_coff() causing issues with relocation of code in chainload.
    (LP: #1792575)
  * debian/patches/linuxefi_truncate_overlong_reloc_section.patch: The Windows
    7 bootloader has inconsistent headers; truncate to the smaller, correct
    size to fix chainloading Windows 7. (LP: #1792575)

grub2 (2.02-2ubuntu8.5) bionic; urgency=medium

  * debian/patches/grub-reboot-warn.patch: Warn when "for the next
    boot only" promise cannot be kept. (LP: #788298)

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 27 Sep 2018 17:00:43 +0200
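
The "truncate to the smaller, correct size" idea from the second changelog item can be illustrated with a short sketch. This is not the GRUB patch itself, and which two header fields disagree in the Windows 7 bootloader is an assumption here; the function and parameter names are hypothetical. The sketch walks PE base relocation blocks (each block starts with a 32-bit page RVA and a 32-bit block size, followed by 16-bit entries) and shows how an overlong declared size makes the walk run into trailing bytes that are not relocation data.

```python
import struct


def iter_reloc_blocks(data, declared_size, section_size):
    """Yield (page_rva, entries) for each base relocation block.

    The walk is limited to the smaller of the two sizes, which is the
    truncation idea described in the changelog. Both size parameters are
    hypothetical stand-ins for the inconsistent header values.
    """
    size = min(declared_size, section_size, len(data))
    offset = 0
    while offset + 8 <= size:
        page_rva, block_size = struct.unpack_from("<II", data, offset)
        if block_size < 8 or offset + block_size > size:
            raise ValueError("malformed relocation block at offset %d" % offset)
        count = (block_size - 8) // 2
        # Each entry: high 4 bits are the relocation type, low 12 bits the page offset.
        entries = struct.unpack_from("<%dH" % count, data, offset + 8)
        yield page_rva, entries
        offset += block_size


# One valid 12-byte block followed by 8 bytes that are not relocation data.
block = struct.pack("<IIHH", 0x1000, 12, 0xA010, 0x0000)
data = block + b"\x00" * 8

# Truncated to the smaller size: only the valid block is walked.
print(list(iter_reloc_blocks(data, declared_size=20, section_size=12)))

# Trusting the overlong size: the walk hits the trailing bytes and fails.
try:
    list(iter_reloc_blocks(data, declared_size=20, section_size=20))
except ValueError as exc:
    print("untruncated walk fails:", exc)
```

Taking the minimum keeps a loader from interpreting bytes past the real relocation data, which is one plausible way an inconsistent header ends in a relocation error.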

Changed in grub2 (Ubuntu Bionic):
status: Fix Committed → Fix Released
tags: added: id-5b36ccda18d5e26eda679909

Hello Peter, or anyone else affected,

Accepted grub2 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-36ubuntu3.20 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in grub2 (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-xenial
Changed in grub2-signed (Ubuntu Xenial):
status: Confirmed → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted grub2-signed into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.66.20 in a few hours, and then in the -proposed repository.

tags: added: verification-done-xenial
removed: verification-needed verification-needed-xenial

Verification-done with the xenial grub2/grub2-signed versions:

$ dpkg -l grub\* | grep ii
ii grub-common 2.02~beta2-36ubuntu3.20 amd64 GRand Unified Bootloader (common files)
ii grub-efi-amd64 2.02~beta2-36ubuntu3.20 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.02~beta2-36ubuntu3.20 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 binaries)
ii grub-efi-amd64-signed 1.66.20+2.02~beta2-36ubuntu3.20 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)
ii grub2-common 2.02~beta2-36ubuntu3.20 amd64 GRand Unified Bootloader (common files for version 2)

I am unable to accurately verify chainloading grub bootloaders from network to disk (especially as it appeared to be a hardware-dependent issue), however, I can easily verify the other side-effect of this, which would break chainloading Windows from grub. I was able to chainload Windows 10 just fine with the patches applied.

Given that these are the same two patches as in other releases, applied without changes, and that they have also successfully passed validation for both chainloading Windows 10 and chainloading grub in Peter's environment (comment #21), I find this acceptable testing.

Autopkgtest failures for ubuntu-image appear to be unrelated: the tests show the machine has already booted at that point, so this is some other networking issue post-boot.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-36ubuntu3.20

---------------
grub2 (2.02~beta2-36ubuntu3.20) xenial; urgency=medium

  * debian/patches/linuxefi_fix_relocate_coff.patch: fix typo in
    relocate_coff() causing issues with relocation of code in chainload.
    (LP: #1792575)
  * debian/patches/linuxefi_truncate_overlong_reloc_section.patch: The Windows
    7 bootloader has inconsistent headers; truncate to the smaller, correct
    size to fix chainloading Windows 7. (LP: #1792575)

grub2 (2.02~beta2-36ubuntu3.19) xenial; urgency=medium

  * debian/patches/0001-i386-linux-Add-support-for-ext_lfb_base.patch:
    Add support for ext_lfb_base. (LP: #1785033)

 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 02 Nov 2018 13:08:47 -0400

Changed in grub2 (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2-signed - 1.66.20

---------------
grub2-signed (1.66.20) xenial; urgency=medium

  * Rebuild against grub2 2.02~beta2-36ubuntu3.20. (LP: #1792575)

grub2-signed (1.66.19) xenial; urgency=medium

  * Rebuild against grub2 2.02~beta2-36ubuntu3.19. (LP: #1785033)

 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 02 Nov 2018 13:27:18 -0400

Changed in grub2-signed (Ubuntu Xenial):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted shim-signed into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/shim-signed/1.33.1~16.04.3 in a few hours, and then in the -proposed repository.

Changed in shim-signed (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-xenial
removed: verification-done-xenial