GRUB does not bring up networking when loaded over HTTP

Bug #1879012 reported by Lee Trager
This bug affects 4 people
Affects                Status          Importance   Assigned to   Milestone
MAAS                   Incomplete      High         Unassigned
grub2 (Ubuntu)         Fix Committed   Undecided    Unassigned
shim-signed (Ubuntu)   Invalid         Undecided    Unassigned

Bug Description

When using MAAS to HTTP boot on x86_64 UEFI, GRUB drops to the command line. net_ls_addr shows that the system has no address. If I run net_dhcp, I get an address. I can then download the remote grub.cfg file and continue booting.

When reproducing with QEMU you have to manually reconfigure the boot order to try HTTP before TFTP:

# efibootmgr -v
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0002,0003,0001,0000,0004
Boot0000* UiApp FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0001* UEFI QEMU QEMU HARDDISK PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/SCSI(1,1)N.....YM....R,Y.
Boot0002* UEFI PXEv4 (MAC:00163E03BE1A) PciRoot(0x0)/Pci(0x4,0x0)/Pci(0x0,0x0)/MAC(00163e03be1a,1)/IPv4(0.0.0.00.0.0.0,0,0)N.....YM....R,Y.
Boot0003* UEFI HTTPv4 (MAC:00163E03BE1A) PciRoot(0x0)/Pci(0x4,0x0)/Pci(0x0,0x0)/MAC(00163e03be1a,1)/IPv4(0.0.0.00.0.0.0,0,0)/Uri()N.....YM....R,Y.
Boot0004* EFI Internal Shell FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
# efibootmgr -o 0003,0002,0001,0004
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0003,0002,0001,0004
Boot0000* UiApp
Boot0001* UEFI QEMU QEMU HARDDISK
Boot0002* UEFI PXEv4 (MAC:00163E03BE1A)
Boot0003* UEFI HTTPv4 (MAC:00163E03BE1A)
Boot0004* EFI Internal Shell

grub> net_ls_addr
grub>
grub> net_dhcp
efinet0:dhcp 00:16:3e:03:be:1a 10.0.0.75
grub> configfile (http,10.0.0.2:5248)/grub/grub.cfg-default-amd64
Booting under MAAS direction...

The kernel and initrd are downloaded but it hangs there.

I believe the bug is in GRUB. Because MAAS receives its bootloaders from the stream at images.maas.io, generated by lp:maas-images, this affects all versions of MAAS. MAAS currently uses GRUB and the shim from Bionic; however, I have been able to reproduce this bug using GRUB and the shim from Focal as well.

Lee Trager (ltrager)
description: updated
description: updated
Lee Trager (ltrager) wrote :

I modified MAAS to only provide GRUB and skip the Shim. In that case HTTP boot still doesn't work. It looks like GRUB does not bring up networking when it is loaded over HTTP. I can still manually bring up networking with net_dhcp and specify the remote grub.cfg file. GRUB does download the kernel and initrd that the remote grub.cfg file specifies, but it looks like it never actually executes them.

summary: - Shim does not hand off networking during HTTP boot
+ GRUB does not bring up networking when loaded over HTTP
Lee Trager (ltrager)
description: updated
Steve Langasek (vorlon)
affects: grub (Ubuntu) → grub2 (Ubuntu)
Dimitri John Ledkov (xnox) wrote :

"I modified MAAS to only provide GRUB and skip the Shim" => that will not work under Secure Boot without Canonical CA certificates provisioned. If only the MS certificate is provisioned (the default for most x86 hardware), then one must use shim too.

Dimitri John Ledkov (xnox) wrote :
Changed in shim-signed (Ubuntu):
status: New → Incomplete
Changed in grub2 (Ubuntu):
status: New → Incomplete
Lee Trager (ltrager) wrote :

Please provide remote artifacts:
All bootloader files are pulled from the Bionic archive and provided on images.maas.io by lp:maas-images. bootloaders.yaml [1] describes which files are pulled from which packages. You can download the tars at [2].

Please provide local artifacts:
The local artifacts are from the deployed OS. So for Ubuntu it's whatever shim/grub is in the archive for that version of Ubuntu. The same goes for CentOS, VMware, Windows, etc.

Keep in mind that once HTTP boot is set, the remote GRUB drops to the command line before loading the remote grub.cfg. I can only proceed with the manual steps detailed above.

Please provide reproducer steps:
1. Install MAAS
2. Configure a UEFI virtual machine for use with MAAS. I would suggest manually creating a UEFI VM with libvirt and adding it to MAAS as a machine.
3. Commission the VM and enable SSH so you can enable HTTP boot.
4. SSH into the VM and put HTTP boot before TFTP as described above.
5. Shut down the machine.
6. Try to deploy any operating system.

Please provide details how local artifacts were installed:
Local artifacts are installed by Curtin, which gets them from the Ubuntu archive when installing Ubuntu or from the CentOS archive when installing CentOS.

Please provide list of certs trusted by the node's firmware:
Due to LP:1865515, Secure Boot was disabled to reproduce this bug.

[1] https://git.launchpad.net/maas-images/tree/conf/bootloaders.yaml
[2] https://images.maas.io/ephemeral-v3/daily/bootloaders/uefi/amd64/

Changed in grub2 (Ubuntu):
status: Incomplete → New
Changed in shim-signed (Ubuntu):
status: Incomplete → New
Alberto Donato (ack)
Changed in maas:
milestone: 2.8.0rc1 → 2.8.0
Stéphane Graber (stgraber) wrote :

FYI, reproduced this in LXD virtual machines trying to use UEFI HTTPBOOT.
Similar setup, http-only (no https yet) and no secureboot enabled.

Shim and grub are both retrieved properly over http, then dumped into a grub shell without it ever attempting to download grub.cfg over the network.

net_ls_addr is similarly empty here.

Stéphane Graber (stgraber) wrote :

So I managed to make it work by using SUSE's grubx64.efi instead.

Steps are basically:
 - Replace bootx64.efi with the OpenSUSE one
 - Write a grub.cfg shim at the root which does "configfile (http,172.17.16.10:5248)/grub/grub.cfg-default-amd64"

Reboot and everything works as expected.
OpenSUSE Tumbleweed from which the grub build was taken is also on grub 2.04.

Alberto Donato (ack)
Changed in maas:
milestone: 2.8.0rc3 → 2.8.0
Alberto Donato (ack)
Changed in maas:
milestone: 2.8.0 → 2.9.0b1
tags: added: maas-grub
Dimitri John Ledkov (xnox) wrote :

So the SUSE srpm has a lot of interesting patches that implement:
 * configuring networking from EFI
   - bootp bootpv6
   - dhcp dhcpv4
   - static, v4 & v6, with/without port numbers
   - http https file access via EFI protocol

I don't believe any of that is upstream, or only partially, but I should double-check.

I think we want to cherrypick:

# Support HTTP Boot IPv4 and IPv6 (fate#320129)
Patch281: 0002-net-read-bracketed-ipv6-addrs-and-port-numbers.patch
Patch282: 0003-bootp-New-net_bootp6-command.patch
Patch283: 0004-efinet-UEFI-IPv6-PXE-support.patch
Patch284: 0005-grub.texi-Add-net_bootp6-doument.patch
Patch285: 0006-bootp-Add-processing-DHCPACK-packet-from-HTTP-Boot.patch
Patch286: 0007-efinet-Setting-network-from-UEFI-device-path.patch
Patch287: 0008-efinet-Setting-DNS-server-from-UEFI-protocol.patch
# UEFI HTTP and related network protocol support (FATE#320130)
Patch420: 0001-add-support-for-UEFI-network-protocols.patch
Patch421: 0002-AUDIT-0-http-boot-tracker-bug.patch

Not sure if we want a feature to automatically scan/find grub.cfg on remote host by ip/mac/uuid/etc:

# bsc#1166409 - Grub netbooting does not search for grub.cfg files with mac
# address or ip address in filename
Patch700: 0001-normal-Move-common-datetime-functions-out-of-the-nor.patch
Patch701: 0002-kern-Add-X-option-to-printf-functions.patch
Patch702: 0003-normal-main-Search-for-specific-config-files-for-net.patch
Patch703: 0004-datetime-Enable-the-datetime-module-for-the-emu-plat.patch

Also they have interesting fixes to limit screen resolution, to make fonts readable out of the box.

Lee Trager (ltrager) wrote :

> Not sure if we want a feature to automatically scan/find grub.cfg on remote host by ip/mac/uuid/etc:

Currently MAAS only specifies the path to the bootloader, not the bootloader config. It expects the bootloader to automatically request the config from the server it was downloaded from. Right now that is <server>/grub/grub.cfg. That file attempts to chainload <server>/grub/grub.cfg-${net_default_mac} and, if that fails, <server>/grub/grub.cfg-default-amd64. MAAS could also respond to <server>/grub/grub.cfg-$uuid as we have that information as well. Having GRUB do that automatically would remove a request and a level of chainloading.

https://git.launchpad.net/maas/tree/src/provisioningserver/boot/uefi_amd64.py
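
The lookup order described in this comment can be sketched as a small Python function. This is only an illustration: the function name grub_config_candidates is hypothetical, the position of the optional UUID entry in the list is an assumption (the comment only says MAAS could respond to it), and this is not the code in uefi_amd64.py.

```python
from typing import List, Optional


def grub_config_candidates(server: str, mac: str, uuid: Optional[str] = None) -> List[str]:
    """List the config URLs to try, most specific first.

    Hypothetical sketch of the chain described above:
    grub.cfg-<mac>, then (assumed position) grub.cfg-<uuid>,
    then the architecture default.
    """
    base = server.rstrip("/") + "/grub/grub.cfg"
    candidates = [f"{base}-{mac}"]
    if uuid:
        # Assumption: a per-machine UUID config would be tried after the MAC one.
        candidates.append(f"{base}-{uuid}")
    candidates.append(f"{base}-default-amd64")
    return candidates
```

With the server and MAC address from the bug description, grub_config_candidates("http://10.0.0.2:5248", "00:16:3e:03:be:1a") yields the per-MAC URL followed by the grub.cfg-default-amd64 URL.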

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Changed in shim-signed (Ubuntu):
status: New → Confirmed
Lee Trager (ltrager)
Changed in maas:
milestone: 2.9.0b1 → 2.9.0b2
Dimitri John Ledkov (xnox) wrote :

"MAAS could also respond to <server>/grub/grub.cfg-$uuid as we have that information as well"

In that sentence which $uuid do you mean? UUID of the machine? Something grub can query from smbios?

Lee Trager (ltrager) wrote :

I mean the output of /sys/class/dmi/id/product_uuid. However, I'm not sure how reliable that is. Since I made that comment, multiple users have reported (LP:1893690) that some vendors are handing out duplicate UUIDs. MAAS handles this by ignoring any UUID which has been duplicated.
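
The "ignore any duplicated UUID" rule can be illustrated with a short Python sketch. This is not the MAAS implementation: the function name usable_uuids, the hostname-to-UUID mapping, and the case-insensitive comparison are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict


def usable_uuids(reported: Dict[str, str]) -> Dict[str, str]:
    """Keep only machines whose hardware UUID is reported by exactly one machine.

    `reported` maps a machine name to the value it read from
    /sys/class/dmi/id/product_uuid. UUIDs are compared case-insensitively
    (an assumption for this sketch).
    """
    counts = Counter(uuid.lower() for uuid in reported.values())
    return {
        name: uuid
        for name, uuid in reported.items()
        if counts[uuid.lower()] == 1
    }
```

Machines that share a UUID are dropped from the result, so they would have to be identified some other way, for example by MAC address.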

Dimitri John Ledkov (xnox) wrote :

Never mind, the UUID is the one from the DHCP client-id, obviously, as implemented / referenced in my earlier comment.

Lee Trager (ltrager)
Changed in maas:
milestone: 2.9.0b2 → 2.9.0b3
milestone: 2.9.0b3 → 2.9.0b4
Changed in maas:
status: Confirmed → Triaged
tags: added: fr-683
Lee Trager (ltrager)
Changed in maas:
milestone: 2.9.0b4 → 2.9.0b7
Changed in maas:
milestone: 2.9.0b7 → 2.9.x
Changed in maas:
milestone: 2.9.2 → 2.9.x
Dimitri John Ledkov (xnox) wrote :

https://launchpad.net/ubuntu/+source/grub2/2.04-1ubuntu40/+build/21017796/+files/grub-efi-amd64-bin_2.04-1ubuntu40_amd64.deb

from this, you can extract /usr/lib/grub/x86_64-efi/monolithic/grubnetx64.efi and drop it into

/var/snap/maas/common/maas/boot-resources/current/bootloader/uefi/amd64/grubx64.efi

and check that all the fixes are good for your deployment too.

For me, I can do httpboot, have dhcp come up by default, and the grub.cfg is loaded.
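
The manual test procedure above (extract grubnetx64.efi from the .deb and drop it into the MAAS bootloader directory) could be scripted roughly as follows. This is a sketch under assumptions: it relies on dpkg-deb being installed, the function name install_grubnet is hypothetical, and keeping a ".orig" backup of the previous file is an addition that the comment does not mention. Writing to the default target path normally requires root.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Paths taken from the comment above.
GRUBNET_IN_DEB = "usr/lib/grub/x86_64-efi/monolithic/grubnetx64.efi"
MAAS_GRUB = Path(
    "/var/snap/maas/common/maas/boot-resources/current/bootloader/uefi/amd64/grubx64.efi"
)


def install_grubnet(deb: Path, target: Path = MAAS_GRUB) -> Path:
    """Unpack `deb` and copy its grubnetx64.efi over `target`.

    Hypothetical helper; an existing target is first saved as <name>.orig.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # `dpkg-deb -x <archive> <directory>` extracts the package's file tree.
        subprocess.run(["dpkg-deb", "-x", str(deb), tmp], check=True)
        source = Path(tmp) / GRUBNET_IN_DEB
        if not source.is_file():
            raise FileNotFoundError(f"{GRUBNET_IN_DEB} not found in {deb}")
        if target.exists():
            shutil.copy2(target, target.with_name(target.name + ".orig"))
        shutil.copy2(source, target)
    return target
```

Restoring the ".orig" file undoes the change if the test bootloader does not behave as hoped.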

Changed in shim-signed (Ubuntu):
status: Confirmed → Invalid
Changed in grub2 (Ubuntu):
status: Confirmed → Fix Committed
Vladimir Grevtsev (vlgrevtsev) wrote :

Hi folks,

It looks like the issue is still around: I just got hit by this with MAAS 3.1.0 on Focal. However, replacing grubx64.efi with the one from /usr/lib/grub/x86_64-efi-signed/grubx64.efi.signed did the trick, and I was able to boot without any manual intervention.

Can we consider replacing the grub binary with the one which comes with our distro?

Jerzy Husakowski (jhusakowski) wrote :

Test whether the updated bootloader is free of the issue. Check whether the Jammy bootloader is backwards-compatible with i386 and whether it could be used instead of the Focal one.

Changed in maas:
milestone: 2.9.x → 3.3.0
Changed in maas:
milestone: 3.3.0 → 3.4.0
Olivier Gayot (ogayot)
tags: added: foundations-triage-discuss
Changed in maas:
milestone: 3.4.0 → 3.5.0
Jacopo Rota (r00ta) wrote :

Not sure if this bug is still valid as I can't reproduce it anymore on our master branch (using lxd+qemu). HTTP boot over IPv4 seems to work fine - see attachment.

Jerzy Husakowski (jhusakowski) wrote :

Is this issue reproducible on MAAS 3.3+?

Changed in maas:
milestone: 3.5.0 → none
status: Triaged → Incomplete