crash during subiquity installation with 23.04 arm64 live server

Bug #2017304 reported by Taihsiang Ho
This bug affects 1 person
Affects             Status  Importance  Assigned to  Milestone
curtin (Ubuntu)     New     Undecided   Unassigned
subiquity (Ubuntu)  New     Undecided   Unassigned

Bug Description

When installing Ubuntu 23.04 (Lunar Lobster) on an Ampere Mt. Jade AltraMax server from the latest arm64 live server image, ubuntu-23.04-live-server-arm64.iso (https://cdimage.ubuntu.com/ubuntu/releases/23.04/release/ubuntu-23.04-live-server-arm64.iso), curtin crashes with this traceback:

Running command ['umount', '/target/dev'] with allowed return codes [0] (capture=False)
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks/install-grub: FAIL: installing grub to target devices
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks/configuring-bootloader: FAIL: configuring target system bootloader
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks: FAIL: curtin command curthooks
Traceback (most recent call last):
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/commands/main.py", line 202, in main
    ret = args.func(args)
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/commands/curthooks.py", line 1918, in curthooks
    builtin_curthooks(cfg, target, state)
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/commands/curthooks.py", line 1883, in builtin_curthooks
    setup_grub(cfg, target, osfamily=osfamily,
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/commands/curthooks.py", line 824, in setup_grub
    uefi_reorder_loaders(grubcfg, target, efi_orig_output, variant)
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/commands/curthooks.py", line 566, in uefi_reorder_loaders
    in_chroot.subp(['efibootmgr', '-o', new_boot_order])
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/util.py", line 787, in subp
    return subp(*args, **kwargs)
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/util.py", line 275, in subp
    return _subp(*args, **kwargs)
  File "/snap/subiquity/4678/lib/python3.10/site-packages/curtin/util.py", line 139, in _subp
    raise ProcessExecutionError(stdout=out, stderr=err,
curtin.util.ProcessExecutionError: Unexpected error while running command.
Command: ['unshare', '--fork', '--pid', '--', 'chroot', '/target', 'efibootmgr', '-o', '000A,0000,0007,0006,0008']
Exit code: 8
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: ['unshare', '--fork', '--pid', '--', 'chroot', '/target', 'efibootmgr', '-o', '000A,0000,0007,0006,0008']
Exit code: 8
Reason: -
Stdout: ''
Stderr: ''

Steps to reproduce:
1. Grab an Ampere Mt. Jade AltraMax server
2. Install the image from https://cdimage.ubuntu.com/ubuntu/releases/23.04/release/ubuntu-23.04-live-server-arm64.iso via the virtual CD provided by its BMC
3. Accept every default value offered by subiquity

Expected result:
subiquity finishes installation

Actual result:
curtin crashes at the last stage

Additional information:
Please refer to the attachment var.log.tar.gz for the detailed logs from /var/log. I collected the attachment by logging in to the subiquity shell.

Taihsiang Ho (tai271828) wrote :

I am not entirely sure, but it looks like a firmware bug to me:

1. I logged in to the subiquity shell, and efibootmgr shows a suspicious BootCurrent value, 000A, which has no matching Boot000A entry:

installer@ubuntu-server:~$ efibootmgr -v
BootCurrent: 000A
Timeout: 5 seconds
BootOrder: 0000,0007,0006,0008
Boot0000* ubuntu HD(1,GPT,988a2a4b-c94f-42a7-89a9-b023f494ddd2,0x800,0x219800)/File(\EFI\ubuntu\shimaa64.efi)
Boot0006 UEFI: Built-in EFI Shell VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO
Boot0007* UEFI: PXE IPv4 Mellanox Network Adapter - 98:03:9B:7D:43:52 PcieRoot(0x30000)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(98039b7d4352,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
Boot0008 UEFI: PXE IPv4 Mellanox Network Adapter - 98:03:9B:7D:43:53 PcieRoot(0x30000)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(98039b7d4353,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO

Besides, the version of efibootmgr is:
installer@ubuntu-server:~$ efibootmgr --version
version 17
installer@ubuntu-server:~$ apt-cache policy efibootmgr
efibootmgr:
  Installed: 17-1ubuntu2
  Candidate: 17-1ubuntu2
  Version table:
 *** 17-1ubuntu2 500
        500 cdrom://Ubuntu-Server 23.04 _Lunar Lobster_ - Release arm64 (20230415) lunar/main arm64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports lunar/main arm64 Packages
        100 /var/lib/dpkg/status
installer@ubuntu-server:~$

2. Meanwhile, I could not reproduce the issue on a similar model, an Ampere Mt. Jade Altra server. The two machines have different firmware versions.
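
The check from item 1 can be sketched in Python as below. This is only an illustration, not code from curtin or subiquity: the helper names `parse_efibootmgr` and `boot_current_is_dangling` are made up here, and the parser assumes the text layout that efibootmgr version 17 printed in the output above.

```python
import re

# Output copied from the `efibootmgr -v` run in this report.
SAMPLE = r"""BootCurrent: 000A
Timeout: 5 seconds
BootOrder: 0000,0007,0006,0008
Boot0000* ubuntu HD(1,GPT,988a2a4b-c94f-42a7-89a9-b023f494ddd2,0x800,0x219800)/File(\EFI\ubuntu\shimaa64.efi)
Boot0006 UEFI: Built-in EFI Shell VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO
Boot0007* UEFI: PXE IPv4 Mellanox Network Adapter - 98:03:9B:7D:43:52 PcieRoot(0x30000)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(98039b7d4352,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
Boot0008 UEFI: PXE IPv4 Mellanox Network Adapter - 98:03:9B:7D:43:53 PcieRoot(0x30000)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(98039b7d4353,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
"""


def parse_efibootmgr(output):
    """Return (boot_current, boot_order, entries) parsed from efibootmgr text."""
    boot_current = None
    boot_order = []
    entries = set()
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("BootCurrent:"):
            boot_current = line.split(":", 1)[1].strip()
        elif line.startswith("BootOrder:"):
            value = line.split(":", 1)[1].strip()
            boot_order = [num for num in value.split(",") if num]
        else:
            # Entry lines look like "Boot0000* label ..." or "Boot0006 label ..."
            match = re.match(r"Boot([0-9A-Fa-f]{4})\*?\s", line)
            if match:
                entries.add(match.group(1).upper())
    return boot_current, boot_order, entries


def boot_current_is_dangling(output):
    """True when BootCurrent names an entry that is not listed as BootXXXX."""
    current, _, entries = parse_efibootmgr(output)
    return current is not None and current.upper() not in entries
```

Calling `boot_current_is_dangling(SAMPLE)` on the output above reports the mismatch, since 000A is the current boot entry but only 0000, 0006, 0007 and 0008 are defined.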

Olivier Gayot (ogayot) wrote :

Hello Taihsiang,

We've seen similar reports of this issue over the past few weeks. Every time, the BootCurrent entry has no associated BootXXXX entry. I looked in the UEFI spec and it's still unclear to me whether this is a stretch of the spec or a broken implementation.

That said, based on the number of reports, I believe we should still do something about it. As far as subiquity is concerned, I don't think trying to set the install media as the first boot entry makes sense, so I've disabled the behavior in https://github.com/canonical/subiquity/pull/1671

I'd still argue that curtin should do the right thing on "broken" UEFI implementations if we can.
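
One possible reading of "the right thing" is to never pass efibootmgr a boot number that has no BootXXXX entry. The sketch below illustrates that idea only; `safe_boot_order` is a hypothetical helper, not curtin's `uefi_reorder_loaders`, and whether curtin should handle the case this way is an assumption rather than something decided in this report.

```python
def safe_boot_order(boot_current, boot_order, entries):
    """Build a value for `efibootmgr -o` that only names existing entries.

    boot_current: the BootCurrent number as a string, or None
    boot_order:   list of numbers from the BootOrder line
    entries:      set of numbers that have a BootXXXX entry
    """
    known = {num.upper() for num in entries}
    # Drop anything in the existing order that has no matching entry.
    order = [num.upper() for num in boot_order if num.upper() in known]
    current = boot_current.upper() if boot_current else None
    # Promote the current entry to the front only when it really exists.
    if current in known:
        order = [current] + [num for num in order if num != current]
    return ",".join(order)
```

With the values from this report (BootCurrent 000A, BootOrder 0000,0007,0006,0008, no Boot000A entry), the helper returns 0000,0007,0006,0008 instead of the 000A,0000,0007,0006,0008 string seen in the failing command.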

Thanks,
Olivier
