MAAS incorrectly classifies UEFI machines as iPXE when ipxe.efi is used

Bug #2069095 reported by Alan Baghumian
This bug affects 6 people
Affects  Status  Importance  Assigned to  Milestone
MAAS     New     Undecided   Unassigned

Bug Description

Hello MAAS Team!

One of our large clients is experiencing an issue where UEFI machines that PXE boot using the ipxe.efi binary are incorrectly classified as "iPXE" (i.e. Legacy). As a result, the automatic storage layout is missing the /boot/efi partition, which leaves the machines un-deployable.

Their current "workaround" is to create /boot/efi manually, which is not sustainable: every time the machine is commissioned, the storage layout reverts to one without /boot/efi.

Looking into the MAAS source code (1),(2),(3), it appears that MAAS uses the user class to decide the boot mode: a user_class of "iPXE" is treated as Legacy, while no user class (user_class == None) is treated as UEFI.
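
To illustrate the suspected logic, here is a minimal sketch. It is an assumption based on the behavior described in this report, not the actual MAAS code, and `classify_boot_mode` is a hypothetical name:

```python
from typing import Optional


def classify_boot_mode(user_class: Optional[str]) -> str:
    """Hypothetical helper restating the heuristic described in this report.

    This is NOT the MAAS implementation; it only encodes the rule
    "user_class == 'iPXE' means Legacy, no user class means UEFI".
    """
    if user_class == "iPXE":
        return "legacy"
    return "uefi"


# No user class: under this rule the machine is seen as UEFI.
print(classify_boot_mode(None))
# ipxe.pxe and ipxe.efi both announce the "iPXE" user class, so a UEFI
# machine booted through ipxe.efi ends up in the Legacy branch as well.
print(classify_boot_mode("iPXE"))
```

Under such a rule the boot loader, not the firmware, decides the classification, which would match the behavior observed here.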

I was able to reproduce this behavior 100% of the time by deploying a dnsmasq-based DHCP/PXE server that serves ipxe.pxe for Legacy boot and ipxe.efi for UEFI boot.

After switching the UEFI configuration to serve grubnetx64.efi.signed and chainload the MAAS-generated grub.cfg files, everything worked as expected and the machines were correctly classified as UEFI.

I think it would be great if we can find a way to make ipxe.efi register the same way as grubnetx64.efi.signed does from MAAS' perspective.

If need be I can provide instructions regarding getting a PXE/DHCP server configured to reproduce this issue.

Note also that the MAAS-provided ipxe.cfg does not boot correctly and needs manipulation (4), something that has apparently been addressed in MAAS 3.5.

I would love to hear what your thoughts are on this.

Thanks much,
Alan

(1) https://github.com/canonical/maas/blob/master/src/provisioningserver/dhcp/config.py
(2) https://github.com/canonical/maas/blob/master/src/provisioningserver/boot/grub.py
(3) https://github.com/canonical/maas/blob/master/src/provisioningserver/boot/ipxe.py
(4) https://bugs.launchpad.net/maas/+bug/2002303

Wyatt Rees (wyattrees) wrote :

Hi Alan, could you share the logs during commissioning and deployment (from rackd.log)?

Changed in maas:
status: New → Incomplete
Alan Baghumian (alanbach) wrote :

@Wyatt I'm documenting a complete reproducer here, since you will need it for further debugging anyway; I will also provide the logs:

MAAS + External DHCP + iPXE (LP #2069095 Reproducer)
----------------------------------------------------
A. Deploy an Ubuntu server to act as the DHCP + PXE server (I picked Jammy).
B. Disable MAAS DHCP
C. Configure your router to use this machine as DHCP Relay
D. SSH to this machine and follow this process:

(Disable systemd-resolved)
$ sudo systemctl disable systemd-resolved
$ sudo systemctl stop systemd-resolved

(Set manual DNS - use your own DNS servers)
$ sudo unlink /etc/resolv.conf
$ sudo vim /etc/resolv.conf

nameserver 127.0.0.1
nameserver 10.1.10.10
nameserver 8.8.8.8

(Update / Install iPXE)
$ sudo apt-get update; sudo apt-get dist-upgrade
$ sudo apt-get -y install ipxe

(Create the TFTP dirs and deploy binaries)
$ sudo mkdir -p /pxeboot/firmware; sudo mkdir -p /pxeboot/config
$ sudo cp -v /usr/lib/ipxe/{ipxe.efi,ipxe.pxe} /pxeboot/firmware/

(Create an iPXE Config File - 10.1.9.12 is a Rack Controller)
$ sudo vim /pxeboot/config/boot.ipxe

#!ipxe
chain http://10.1.9.12:5248/ipxe.cfg-${net0/mac} || chain http://10.1.9.12:5248/ipxe.cfg-default-amd64

(Tree View)
$ tree /pxeboot/
/pxeboot/
├── config
│   └── boot.ipxe
└── firmware
    ├── ipxe.efi
    └── ipxe.pxe

2 directories, 3 files

(Install DNSMasq)
$ sudo apt-get -y install dnsmasq
$ sudo mv -v /etc/dnsmasq.conf /etc/dnsmasq.conf.backup

(Configure DNSMasq)
$ sudo vim /etc/dnsmasq.conf

# This is designed to replace MAAS DHCP
interface=bond0
bind-interfaces

# Match MAAS Domain
domain=alannet

dhcp-range=bond0,10.1.13.100,10.1.13.200,255.255.248.0,8h
dhcp-option=option:router,10.1.10.1
dhcp-option=option:dns-server,10.1.10.10 # Main Local DNS Forwarder
dhcp-authoritative

# Enable logging
log-queries
log-dhcp

# Set tag "MAAS" if request comes from iPXE ("iPXE" user class)
dhcp-userclass=set:MAAS,iPXE

# Enable TFTP
enable-tftp

tftp-root=/pxeboot

# Boot config for BIOS systems
dhcp-match=set:bios-x86,option:client-arch,0
dhcp-boot=tag:bios-x86,firmware/ipxe.pxe
dhcp-boot=tag:MAAS,tftp://10.1.8.44/config/boot.ipxe

# Boot config for UEFI systems
dhcp-match=set:efi-x86_64,option:client-arch,7
dhcp-match=set:efi-x86_64,option:client-arch,9
dhcp-boot=tag:efi-x86_64,firmware/ipxe.efi
dhcp-boot=tag:MAAS,http://10.1.8.44/config/boot.php

(Enable / Restart DNSMasq)
$ sudo systemctl restart dnsmasq
$ sudo systemctl status dnsmasq
$ sudo systemctl enable dnsmasq

(To work around LP #2002303 we need an nginx + PHP server)

$ sudo apt-get -y install nginx php8.1-fpm
$ sudo systemctl enable nginx
$ sudo systemctl enable php8.1-fpm

$ sudo mv /etc/nginx/sites-available/default /etc/nginx/sites-available/default.backup
$ sudo vim /etc/nginx/sites-available/default

server {
        listen 80 default_server;
        listen [::]:80 default_server;

        root /pxeboot;

        index index.php;

        server_name _;

        location / {
                try_files $uri $uri/ =404;
        }

        location ~ \.php$ {
                include snippets/fastcgi-php.conf;
                fastcgi_pass unix:/run/php/php8.1-fpm.sock;
        }

        location...


Alan Baghumian (alanbach) wrote :

Rackd logs here.

Nicholas Fries (nicfries) wrote :

Hello. Please consider other means for determining the system's firmware capabilities (legacy vs EFI).
iPXE is not used only for Legacy boot; it is quite common on EFI hosts, depending on the environment.

Background: There are cases where an EFI-capable host is unable to PXE boot successfully using the GRUB EFI binary (due to firmware bugs/limitations), and iPXE's more configurable network stack and smaller binary size can be a workaround. In these cases, the systems work with MAAS just fine, just not via GRUB.

I don't think the Canonical team needs to take on massive scope for this, just consider how to determine system capabilities a bit differently and perhaps adjust the iPXE config template to correctly support booting from disk on EFI systems (also broken).
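
One possible direction, sketched under stated assumptions: classify on the client-architecture value that the dnsmasq reproducer earlier in this report already matches on (0 for BIOS, 7 and 9 for x86_64 UEFI) rather than on the boot loader's user class. The function name is hypothetical, and this assumes the value is visible to whichever component performs the classification, which may not hold when an external DHCP server is in use:

```python
# Client-architecture values taken from the dnsmasq reproducer in this report.
LEGACY_BIOS_ARCHES = {0}
UEFI_X86_64_ARCHES = {7, 9}


def firmware_from_client_arch(client_arch: int) -> str:
    """Hypothetical helper: map a client-architecture value to a firmware type.

    The result depends only on what the firmware reports, not on whether
    GRUB or iPXE happens to be the boot loader making the request.
    """
    if client_arch in UEFI_X86_64_ARCHES:
        return "uefi"
    if client_arch in LEGACY_BIOS_ARCHES:
        return "legacy"
    return "unknown"


print(firmware_from_client_arch(7))   # uefi
print(firmware_from_client_arch(0))   # legacy
print(firmware_from_client_arch(42))  # unknown
```

Returning "unknown" for unlisted values keeps the sketch from guessing about architectures this report does not cover.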

Alan Baghumian (alanbach) wrote :

@Wyatt Just making sure you saw the logs notification. Thanks much!

Changed in maas:
status: Incomplete → New