secure boot enabled on RHEL image fails to boot local on 2nd reboot after deploy

Bug #2022084 reported by Jeff Hillman
This bug affects 5 people
Affects  Status        Importance  Assigned to    Milestone
MAAS     Fix Released  High        Igor Brovtsin
3.2      Fix Released  High        Unassigned
3.4      Fix Released  High        Igor Brovtsin

Bug Description

Ubuntu 22.04.4
Tried on MAAS versions:

3.1.1-10918-g.9cbd96fd2
3.2.7-12037-g.c688dd446
3.3.3-13184-g.3e9972c19

Using packer-maas to generate a RHEL MAAS image from a vanilla RHEL 8.6 x86_64 ISO using the following packer config:

```

{
    "variables": {
        "rhel8_iso_path": "{{env `RHEL8_ISO_PATH`}}"
    },
    "builders": [
        {
            "type": "qemu",
            "communicator": "none",
            "iso_url": "{{user `rhel8_iso_path`}}",
            "iso_checksum_type": "none",
            "boot_command": [
                "<tab> ",
                "inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/rhel8.ks ",
                "<enter>"
            ],
            "boot_wait": "3s",
            "disk_size": "4G",
            "headless": true,
            "memory": 1024,
            "http_directory": "http",
            "shutdown_timeout": "20m"
        }
    ],
    "post-processors": [
        {
            "type": "shell-local",
            "inline_shebang": "/bin/bash -e",
            "inline": [
                "TMP_DIR=$(mktemp -d /tmp/packer-maas-XXXX)",
                "echo 'Mounting image...'",
                "modprobe nbd",
                "qemu-nbd -d /dev/nbd4",
                "qemu-nbd -c /dev/nbd4 -n output-qemu/packer-qemu",
                "partprobe /dev/nbd4",
                "mount /dev/nbd4p1 $TMP_DIR",
                "echo 'Tarring up image...'",
                "tar -czpf rhel8.tar.gz -C $TMP_DIR .",
                "echo 'Unmounting image...'",
                "umount $TMP_DIR",
                "qemu-nbd -d /dev/nbd4",
                "rmdir $TMP_DIR"
            ]
        }
    ]
}

```

This is very generic; it is basically the example that comes with packer-maas.

The image builds fine and is uploaded into MAAS. However, when deploying to a VM or physical machine with secure boot enabled, the machine fails to boot locally when directed to do so via PXE.

A screenshot of the error is attached to this bug report.

If the machine is explicitly directed to boot local instead of PXE'ing, it boots fine. So that tells me that the image is being deployed fine, but there is an issue with how MAAS is pushing the local boot option over PXE.

Steps to recreate:

1) Install and configure MAAS as per usual.
2) Create a RHEL image from the 8.6 DVD ISO using the default packer-maas config, and follow its instructions to upload the image to MAAS.
3) Create a VM in virt-manager and customize the installation so that the firmware is changed from BIOS to UEFI x86_64: /usr/share/OVMF/OVMF_CODE_4M.secboot.fd (this requires the ovmf package to be installed).
4) Enlist the machine into MAAS.
5) Attempt to deploy the RHEL image previously created.

It will do the initial boot to lay down the image, and it will boot on its own once to apply SELinux rules, but the 2nd boot is when the PXE boot fails.

Tags: cpe-onsite

Jeff Hillman (jhillman) wrote :

subscribed field medium

tags: added: bug
tags: added: bug-council
removed: bug
Jerzy Husakowski (jhusakowski) wrote :

Could you check the system behaviour when GRUB debug mode is enabled? It looks like the machine struggles to find the bootloader at the right location. Edit `/etc/maas/rackd.conf` and append `debug: true` to the file. Then restart the rackd service, `sudo systemctl restart maas-rackd` and `tail -f /var/log/maas/rackd.log` to see debug output. GRUB should also output more information during boot, as MAAS passes the debug flag to the grub config file.

Moreover, what does the output look like on screen when the machine boots successfully, when set to boot locally instead of PXEing?

Could you also `ls -la` the contents of the '/boot' partition?

Changed in maas:
importance: Undecided → High
status: New → Incomplete
Jeff Hillman (jhillman) wrote :

First off, this is a snap installation, so neither those paths nor that systemctl command work.

Either way, I did find rackd.conf, set debug: true, and did a `snap stop maas; snap start maas`.

In any event, absolutely nothing was logged to rackd.log. In fact, I deleted it just to provide an entirely fresh file for this deployment operation, but nothing was written to it after adding debug: true.

As for when it fails to PXE: it basically goes to the local GRUB and boots. I attached some screenshots, but it goes by really fast.

Lastly, here ya go:

$ ls -la /boot
total 144048
dr-xr-xr-x. 5 root root 4096 Jun 7 14:08 .
dr-xr-xr-x. 18 root root 4096 Jun 7 14:08 ..
-rw-r--r--. 1 root root 195982 Apr 16 2022 config-4.18.0-372.9.1.el8.x86_64
drwxr-xr-x. 3 root root 4096 Jan 1 1970 efi
drwx------. 4 root root 4096 Apr 27 17:56 grub2
-rw-------. 1 root root 67932965 Apr 27 17:55 initramfs-0-rescue-59e691cd256544c68f4fd7a7a61da2e9.img
-rw-------. 1 root root 28488732 Jun 7 14:07 initramfs-4.18.0-372.9.1.el8.x86_64.img
-rw-------. 1 root root 25571328 Jun 7 14:08 initramfs-4.18.0-372.9.1.el8.x86_64kdump.img
drwxr-xr-x. 3 root root 4096 Apr 27 17:53 loader
lrwxrwxrwx. 1 root root 49 Apr 27 17:53 symvers-4.18.0-372.9.1.el8.x86_64.gz -> /lib/modules/4.18.0-372.9.1.el8.x86_64/symvers.gz
-rw-------. 1 root root 4359450 Apr 16 2022 System.map-4.18.0-372.9.1.el8.x86_64
-rwxr-xr-x. 1 root root 10460528 Apr 27 17:54 vmlinuz-0-rescue-59e691cd256544c68f4fd7a7a61da2e9
-rwxr-xr-x. 1 root root 10460528 Apr 16 2022 vmlinuz-4.18.0-372.9.1.el8.x86_64
-rw-r--r--. 1 root root 170 Apr 16 2022 .vmlinuz-4.18.0-372.9.1.el8.x86_64.hmac

Jerzy Husakowski (jhusakowski) wrote :

Issue LP:1865515 may be related. We'll update the bootloader and shim in a candidate stream and ask you to try to reproduce the problem with the updated shim.

Changed in maas:
assignee: nobody → Adam Collard (adam-collard)
Jeff Hillman (jhillman) wrote :

$ find /boot
/boot
/boot/.vmlinuz-4.18.0-372.9.1.el8.x86_64.hmac
/boot/vmlinuz-0-rescue-59e691cd256544c68f4fd7a7a61da2e9
/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img
/boot/vmlinuz-4.18.0-372.9.1.el8.x86_64
/boot/efi
/boot/efi/EFI
/boot/efi/EFI/BOOT
/boot/efi/EFI/BOOT/fbx64.efi
/boot/efi/EFI/BOOT/BOOTX64.EFI
/boot/efi/EFI/redhat
/boot/efi/EFI/redhat/shimx64.efi
/boot/efi/EFI/redhat/fonts
/boot/efi/EFI/redhat/mmx64.efi
/boot/efi/EFI/redhat/grubx64.efi
/boot/efi/EFI/redhat/shimx64-redhat.efi
/boot/efi/EFI/redhat/grubenv
/boot/efi/EFI/redhat/BOOTX64.CSV
/boot/efi/EFI/redhat/grub.cfg
/boot/initramfs-0-rescue-59e691cd256544c68f4fd7a7a61da2e9.img
/boot/config-4.18.0-372.9.1.el8.x86_64
/boot/loader
/boot/loader/entries
find: ‘/boot/loader/entries’: Permission denied
/boot/grub2
find: ‘/boot/grub2’: Permission denied
/boot/symvers-4.18.0-372.9.1.el8.x86_64.gz
/boot/initramfs-4.18.0-372.9.1.el8.x86_64.img
/boot/System.map-4.18.0-372.9.1.el8.x86_64

Jeff Hillman (jhillman) wrote :

sorry, with sudo

$ sudo find /boot
/boot
/boot/.vmlinuz-4.18.0-372.9.1.el8.x86_64.hmac
/boot/vmlinuz-0-rescue-59e691cd256544c68f4fd7a7a61da2e9
/boot/initramfs-4.18.0-372.9.1.el8.x86_64kdump.img
/boot/vmlinuz-4.18.0-372.9.1.el8.x86_64
/boot/efi
/boot/efi/EFI
/boot/efi/EFI/BOOT
/boot/efi/EFI/BOOT/fbx64.efi
/boot/efi/EFI/BOOT/BOOTX64.EFI
/boot/efi/EFI/redhat
/boot/efi/EFI/redhat/shimx64.efi
/boot/efi/EFI/redhat/fonts
/boot/efi/EFI/redhat/mmx64.efi
/boot/efi/EFI/redhat/grubx64.efi
/boot/efi/EFI/redhat/shimx64-redhat.efi
/boot/efi/EFI/redhat/grubenv
/boot/efi/EFI/redhat/BOOTX64.CSV
/boot/efi/EFI/redhat/grub.cfg
/boot/initramfs-0-rescue-59e691cd256544c68f4fd7a7a61da2e9.img
/boot/config-4.18.0-372.9.1.el8.x86_64
/boot/loader
/boot/loader/entries
/boot/loader/entries/59e691cd256544c68f4fd7a7a61da2e9-0-rescue.conf
/boot/loader/entries/59e691cd256544c68f4fd7a7a61da2e9-4.18.0-372.9.1.el8.x86_64.conf
/boot/grub2
/boot/grub2/grub.cfg
/boot/grub2/device.map
/boot/grub2/grubenv
/boot/grub2/i386-pc
/boot/grub2/i386-pc/gzio.mod
/boot/grub2/i386-pc/part_msdos.mod
/boot/grub2/i386-pc/hfs.mod
/boot/grub2/i386-pc/ufs1.mod
/boot/grub2/i386-pc/test_asn1.mod
/boot/grub2/i386-pc/gcry_camellia.mod
/boot/grub2/i386-pc/cat.mod
/boot/grub2/i386-pc/drivemap.mod
/boot/grub2/i386-pc/xnu_uuid.mod
/boot/grub2/i386-pc/usb.mod
/boot/grub2/i386-pc/cbmemc.mod
/boot/grub2/i386-pc/functional_test.mod
/boot/grub2/i386-pc/pxechain.mod
/boot/grub2/i386-pc/iso9660.mod
/boot/grub2/i386-pc/terminfo.mod
/boot/grub2/i386-pc/ctz_test.mod
/boot/grub2/i386-pc/terminal.lst
/boot/grub2/i386-pc/ehci.mod
/boot/grub2/i386-pc/backtrace.mod
/boot/grub2/i386-pc/time.mod
/boot/grub2/i386-pc/aout.mod
/boot/grub2/i386-pc/biosdisk.mod
/boot/grub2/i386-pc/cryptodisk.mod
/boot/grub2/i386-pc/videoinfo.mod
/boot/grub2/i386-pc/gettext.mod
/boot/grub2/i386-pc/parttool.lst
/boot/grub2/i386-pc/luks.mod
/boot/grub2/i386-pc/gfxmenu.mod
/boot/grub2/i386-pc/hashsum.mod
/boot/grub2/i386-pc/file.mod
/boot/grub2/i386-pc/cbtime.mod
/boot/grub2/i386-pc/regexp.mod
/boot/grub2/i386-pc/search_label.mod
/boot/grub2/i386-pc/lsapm.mod
/boot/grub2/i386-pc/msdospart.mod
/boot/grub2/i386-pc/div.mod
/boot/grub2/i386-pc/gcry_sha256.mod
/boot/grub2/i386-pc/gfxterm_background.mod
/boot/grub2/i386-pc/date.mod
/boot/grub2/i386-pc/cbls.mod
/boot/grub2/i386-pc/xnu_uuid_test.mod
/boot/grub2/i386-pc/cmp.mod
/boot/grub2/i386-pc/usbserial_common.mod
/boot/grub2/i386-pc/gcry_serpent.mod
/boot/grub2/i386-pc/procfs.mod
/boot/grub2/i386-pc/crypto.lst
/boot/grub2/i386-pc/fshelp.mod
/boot/grub2/i386-pc/part_sun.mod
/boot/grub2/i386-pc/cbtable.mod
/boot/grub2/i386-pc/serial.mod
/boot/grub2/i386-pc/offsetio.mod
/boot/grub2/i386-pc/mdraid1x.mod
/boot/grub2/i386-pc/crypto.mod
/boot/grub2/i386-pc/halt.mod
/boot/grub2/i386-pc/true.mod
/boot/grub2/i386-pc/relocator.mod
/boot/grub2/i386-pc/adler32.mod
/boot/grub2/i386-pc/tar.mod
/boot/grub2/i386-pc/bufio.mod
/boot/grub2/i386-pc/freedos.mod
/boot/grub2/i386-pc/morse.mod
/boot/grub2/i386-pc/multiboot.mod
/boot/grub2/i386-pc/disk.mod
/boot/grub2/i386-pc/boottime.mod
/boot/grub2/i386-pc/font.mod
/boot/grub2/i386-pc/gcry_sha512.mod
/boot/grub2/i386-pc/usbserial_pl2303.mod
/boot/grub2/i386-pc/mda_text.mod
/boo...

Adam Collard (adam-collard) wrote :

We have bumped the source of the bootloader from Focal to Jammy in the candidate stream and would like to request testing.

@Jeff and any others affected, please can you try updating your image source to the candidate stream (see https://maas.io/docs/how-to-mirror-images-locally#heading--changing-the-stream ) and seeing if that addresses the issue?

Changed in maas:
assignee: Adam Collard (adam-collard) → nobody
Changed in maas:
status: Incomplete → New
status: New → Incomplete
Jeff Hillman (jhillman) wrote :

Relayed from the customer as they cannot update the bug directly.

"After changing image source to http://images.maas.io/ephemeral-v3/candidate, the same behavior continues. The test was done using RHEL8.6"

Changed in maas:
assignee: nobody → Igor Brovtsin (igor-brovtsin)
Jerzy Husakowski (jhusakowski) wrote :

We are trying to reproduce the issue locally.

Changed in maas:
assignee: Igor Brovtsin (igor-brovtsin) → nobody
status: Incomplete → Triaged
assignee: nobody → Igor Brovtsin (igor-brovtsin)
Changed in maas:
assignee: Igor Brovtsin (igor-brovtsin) → Alberto Donato (ack)
Alberto Donato (ack) wrote :

This is most likely a duplicate of LP:1865515.

I can reproduce this with an LXD VM as well, with secure boot turned on. Deployment fails to reboot, but setting the machine to boot from disk directly makes the boot successful.

Looking at the local boot entry, redhat/shimx64.efi is used:

Boot0007* redhat HD(1,GPT,3dbcb728-fcd0-4015-88c9-7ae7d737a2a1,0x800,0x100000)/File(\EFI\redhat\shimx64.efi)

That file has exactly the same content as EFI/BOOT/BOOTX64.EFI, which is what MAAS tries first:

[root@maas-images efi]# find -type f | xargs md5sum
10156333884e130e115c1089c9b15a02 ./EFI/BOOT/BOOTX64.EFI
2269c8692c313156f89c75190f6485cf ./EFI/BOOT/fbx64.efi
1a027bd46215a0984a1f8ec274a866bf ./EFI/redhat/shimx64-redhat.efi
cf1227bbbe4fe608b21e9379be7f2ebe ./EFI/redhat/BOOTX64.CSV
113fbb2060e124e7ffbcb8589ee1e5a6 ./EFI/redhat/mmx64.efi
10156333884e130e115c1089c9b15a02 ./EFI/redhat/shimx64.efi
f62b51991a3658d87d07b7da0b68b4b3 ./EFI/redhat/grubenv
5db0f8311ad56eb7fc59f26b8b0bb6cb ./EFI/redhat/grubx64.efi
01a6e191f5d8a2bcf990327095054b9f ./EFI/redhat/grub.cfg

When the boot is performed by MAAS, and therefore starts with the GRUB and shim provided by MAAS, the kernel is not accepted due to an invalid signature.
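
To repeat this comparison on another image, a minimal Python sketch along these lines groups the files under the ESP by content hash. It uses SHA-256 rather than the `md5sum` shown above; the function name `group_identical_files` and the mount point in the usage note are illustrative assumptions, not anything MAAS ships.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_files(root):
    """Return lists of relative paths under `root` whose contents are identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Hash the whole file; EFI binaries are small enough to read at once.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.relative_to(root).as_posix())
    # Only keep hashes shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Called as `group_identical_files("/boot/efi")` (as root, on a deployed machine), the listing above suggests it would report a single group containing `EFI/BOOT/BOOTX64.EFI` and `EFI/redhat/shimx64.efi`.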

Jeff Hillman (jhillman) wrote :

Removed field medium and set to field high. The bug this was marked as a duplicate of is said to be invalid for this behavior.

Changed in maas:
assignee: Alberto Donato (ack) → nobody
Jerzy Husakowski (jhusakowski) wrote :

See https://bugs.launchpad.net/maas/+bug/1865515/comments/74 - using chainloading breaks secureboot on certain distros, so the advice is to remove it.
See https://bugs.launchpad.net/maas/+bug/1865515/comments/42 - not using chainloading breaks boot on certain distros, so the advice is to keep it.

We need to investigate possible solution directions.

Jerzy Husakowski (jhusakowski) wrote :

We'll work to reproduce the issue on CentOS and RHEL, together with someone from Foundations, to pinpoint where the issue occurs and whether it is still the issue with passing control in grub(network)->shim(local) or another problem. With that established, we need to implement a proper fix, and propose workarounds if the proper fix takes a long time to materialise.

Changed in maas:
assignee: nobody → Jacopo Rota (r00ta)
milestone: none → 3.5.0
Dimitri John Ledkov (xnox) wrote :

Note that

10156333884e130e115c1089c9b15a02 ./EFI/BOOT/BOOTX64.EFI
10156333884e130e115c1089c9b15a02 ./EFI/redhat/shimx64.efi

behave and boot differently, even though their content is identical.

That is because shim, when booted from /EFI/BOOT/BOOTX64.EFI, tries to look up the default boot binary (usually grubx64.efi), fails to find it, and then triggers /EFI/BOOT/fbx64.efi, which in turn tries to create boot entries and boot them.

Whereas ./EFI/redhat/shimx64.efi (which must be the RHEL shim, not the Ubuntu shim) will load and execute /EFI/redhat/grubx64.efi correctly.

Thus statements like "which maas tries anyway directly" are probably not right either, as MAAS should inspect the boot entries and try to load the correct shim directly.
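
As a rough illustration of what inspecting a boot entry could involve, here is a small Python sketch that pulls the loader path out of an entry line shaped like the `Boot0007* redhat ...` line quoted earlier in this thread. The regular expression and the name `parse_boot_entry` are assumptions made for this example; it only handles `HD(...)/File(...)` entries and is not how MAAS or GRUB actually does it.

```python
import re

# Matches lines such as:
#   Boot0007* redhat HD(1,GPT,<guid>,0x800,0x100000)/File(\EFI\redhat\shimx64.efi)
ENTRY_RE = re.compile(
    r"^Boot(?P<num>[0-9A-Fa-f]{4})(?P<active>\*?)\s+"
    r"(?P<label>.+?)\s+HD\(.*?\)/File\((?P<loader>[^)]+)\)"
)


def parse_boot_entry(line):
    """Return a dict describing one boot entry line, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "num": match.group("num"),
        "active": match.group("active") == "*",
        "label": match.group("label"),
        "loader": match.group("loader"),
    }


entry = parse_boot_entry(
    r"Boot0007* redhat HD(1,GPT,3dbcb728-fcd0-4015-88c9-7ae7d737a2a1,"
    r"0x800,0x100000)/File(\EFI\redhat\shimx64.efi)"
)
print(entry["label"], entry["loader"])
```

With the loader path in hand, a local-boot step could target the distro's own shim rather than assuming /EFI/BOOT/BOOTX64.EFI.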

Dimitri John Ledkov (xnox) wrote :

Local boot should be possible (and previously worked) via:

method (A)
pxe -> shim (ubuntu) -> grub (ubuntu) -> (local boot entry) -> chainloader /EFI/redhat/shimx64.efi -> which loads /EFI/redhat/grubx64.efi -> does local boot

(In even older times, a chainload of redhat/grubx64.efi used to be attempted, which only works when the shim from PXE and the shim of the target OS come from the same distro.)

or

method (B)
pxe -> shim (ubuntu) -> grub (ubuntu) -> (local boot) -> do nothing, execute exit 1
This causes UEFI to skip to the next boot entry (which should be the RHEL entry),
start the boot flow from scratch,
and directly (without PXE) load /EFI/redhat/shimx64.efi, which continues the local boot.

Across various MAAS releases, and various combinations of releases, either method A or method B was used.
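
To make the difference between the two methods concrete, the following Python sketch renders the GRUB commands each one boils down to. It is illustrative only and not copied from MAAS's templates: the function name `local_boot_stanza` is hypothetical, the `search` line is an assumption about how GRUB would locate the ESP, and `exit 1` is taken from the description of method B above.

```python
def local_boot_stanza(method, shim_path="/EFI/redhat/shimx64.efi"):
    """Return the GRUB commands for local boot method "A" or "B" as a string."""
    if method == "A":
        # Method A: find the disk holding the distro shim and chainload it.
        return "\n".join(
            [
                f"search --no-floppy --set=root --file {shim_path}",
                f"chainloader {shim_path}",
                "boot",
            ]
        )
    if method == "B":
        # Method B: leave GRUB with a failure code so the firmware moves on
        # to the next boot entry (expected to be the installed OS).
        return "exit 1"
    raise ValueError(f"unknown method: {method!r}")


print(local_boot_stanza("A"))
print(local_boot_stanza("B"))
```

Method A keeps control inside the network-booted GRUB until the distro shim takes over, while method B hands the decision back to the firmware's boot order.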

Changed in maas:
status: Triaged → Fix Committed
Jacopo Rota (r00ta)
Changed in maas:
assignee: Jacopo Rota (r00ta) → Igor Brovtsin (igor-brovtsin)
tags: removed: bug-council
Alan Baghumian (alanbach) wrote :

I confirm this is happening with MAAS 3.3 and RHEL 8.8 and RHEL 9.2 custom images built using packer-maas.

The (annoying) workaround was to disable PXE booting on the interface, which then loads the locally installed GRUB for a normal boot process.

Jerzy Husakowski (jhusakowski) wrote :

Because this fix can change the default behaviour on deployed systems, we would like to release 3.4 first to gather feedback on the behaviour. Once we have sufficient evidence this works as expected, we will backport to previous releases. Thank you for the MPs!

Andy Wu (qch2012) wrote :

I am still seeing the same issue with MAAS on latest/edge: booting the RHEL 8 image drops to the UEFI shell, and I have to select boot from local disk to get the VM deployed.

Andy Wu (qch2012) wrote :

screenshot attached

Igor Brovtsin (igor-brovtsin) wrote :

Hey Andy. On your screenshot, I see that GRUB thinks secure boot is disabled. Could you please check whether it is enabled when you boot from the local disk?

Also, what kind of VM do you have (which software starts QEMU)?

Jerzy Husakowski (jhusakowski) wrote :

RHEL 8.6 and other older distros may fail to secure boot if the VM or the machine had a newer distro installed on it at any point. That's because of a change in EFI security settings in newer distros (Ubuntu 22.04.2, RHEL 9) that get persisted in NVRAM. Thanks to Ghadi Rahme who pointed out https://access.redhat.com/solutions/7010515 which describes some workarounds. In general, resetting the secure boot keys in the BIOS (or disabling secure boot, running `mokutil --set-sbat-policy delete` and re-enabling secure boot) should restore the VM's/machine's ability to secure boot RHEL 8.6.

Additional context here: https://discourse.ubuntu.com/t/sbat-revocations-boot-process/34996

Changed in maas:
milestone: 3.5.0 → 3.5.0-beta1
status: Fix Committed → Fix Released