curtin Xenial install does not uefi boot

Bug #1647827 reported by Robert Brenneman
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
MAAS     Invalid  Undecided   Unassigned

Bug Description

On a Xenial 16.04.1 MAAS region controller with MAAS 2.1.1+bzr5544-0ubuntu1~16.04.1 installed, I am able to discover other nodes, configure their power control settings, commission them, and start a Xenial 16.04 install.

The nodes are UEFI x86_64 servers managed by IPMI 2 management controllers.

The curtin installer completes and the node reboots, but instead of booting from the local disk it PXE boots again from the rack controller. The rack controller PXE boots the node into grub with the following '*Local' boot entry:

>>>
setparams 'Local'

  echo 'Booting local disk...'
  search --set=root --file /efi/ubuntu/shimx64.efi
  chainloader /efi/ubuntu/shimx64.efi
>>>

The node fails to boot using this configuration - grub boot fails with 'error: cannot load image' and falls back to the grub menu. The node sits in this state until MAAS decides the deploy timed out and marks the node as failed.

I can remote into the system and create a UEFI boot entry manually, and it will then start off the local disk, so the installed Xenial is bootable if the UEFI boot entry exists. Installing Xenial from the DVD also correctly creates the UEFI boot entry, so this seems specific to MAAS/curtin.
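
For reference, a manual UEFI boot entry of this kind is usually created with efibootmgr. The following is only a minimal sketch, not necessarily the exact command used on this node: it assumes the EFI system partition is partition 1 of /dev/sda (as in the mount commands in a later comment), the label "ubuntu" is an arbitrary choice, and the helper name build_efibootmgr_create_cmd is hypothetical. It only builds and prints the command; running it requires root on a UEFI-booted system.

```python
import shlex


def build_efibootmgr_create_cmd(disk="/dev/sda", part=1, label="ubuntu",
                                loader=r"\EFI\ubuntu\shimx64.efi"):
    """Return the argv list for creating a UEFI boot entry with efibootmgr.

    Defaults are assumptions: ESP on /dev/sda partition 1, label "ubuntu".
    The loader path is relative to the ESP and uses backslashes.
    """
    return [
        "efibootmgr",
        "--create",            # add a new Boot#### variable
        "--disk", disk,        # disk that holds the EFI system partition
        "--part", str(part),   # partition number of the ESP on that disk
        "--label", label,      # name shown in the firmware boot menu
        "--loader", loader,    # EFI binary to start (the shim here)
    ]


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed first.
    print(" ".join(shlex.quote(arg) for arg in build_efibootmgr_create_cmd()))
```

The new entry can then be confirmed with `efibootmgr -v`.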

Should MAAS/curtin be creating UEFI boot entries on each node as part of the install, or should it be providing a usable PXE grub *Local startup configuration?

testflr@fp2u33:~$ sudo ls -l /var/log/maas/*
lrwxrwxrwx 1 root root 16 Nov 18 11:17 /var/log/maas/apache2 -> /var/log/apache2
-rw-r--r-- 1 syslog syslog 16184 Dec 6 14:14 /var/log/maas/maas.log
-rw-r--r-- 1 syslog syslog 42476 Dec 6 06:14 /var/log/maas/maas.log.1
-rw-r--r-- 1 syslog syslog 1844 Dec 5 06:14 /var/log/maas/maas.log.2.gz
-rw-r--r-- 1 syslog syslog 1739 Dec 4 06:14 /var/log/maas/maas.log.3.gz
-rw-r--r-- 1 syslog syslog 1855 Dec 3 06:14 /var/log/maas/maas.log.4.gz
-rw-r--r-- 1 syslog syslog 1863 Dec 2 06:14 /var/log/maas/maas.log.5.gz
-rw-r--r-- 1 syslog syslog 3182 Dec 1 06:14 /var/log/maas/maas.log.6.gz
-rw-r--r-- 1 syslog syslog 1860 Nov 30 06:14 /var/log/maas/maas.log.7.gz
-rw-r--r-- 1 maas maas 3640028 Dec 6 14:19 /var/log/maas/rackd.log
-rw-r--r-- 1 maas maas 3750947 Dec 6 14:27 /var/log/maas/regiond.log
-rw-r--r-- 1 maas maas 804213 Dec 5 06:24 /var/log/maas/regiond.log.1.gz
-rw-r--r-- 1 maas maas 668053 Nov 29 06:24 /var/log/maas/regiond.log.2.gz
-rw-r--r-- 1 maas maas 450393 Nov 21 06:24 /var/log/maas/regiond.log.3.gz

/var/log/maas/proxy:
total 756
-rw-r----- 1 proxy proxy 384150 Dec 6 13:58 access.log.1
-rw-r----- 1 proxy proxy 20 Nov 17 13:28 access.log.2.gz
-rw-r----- 1 proxy proxy 98063 Dec 6 13:58 cache.log.1
-rw-r----- 1 proxy proxy 3671 Nov 18 11:33 cache.log.2.gz
-rw-r----- 1 proxy proxy 264430 Dec 6 14:17 store.log.1
-rw-r----- 1 proxy proxy 604 Nov 18 10:30 store.log.2.gz

/var/log/maas/rsyslog:
total 8
drwxr-xr-x 6 syslog syslog 4096 Dec 6 10:51 fp2u29
drwxr-xr-x 4 syslog syslog 4096 Nov 22 09:32 fp2u31
testflr@fp2u33:~$

testflr@fp2u33:~$ sudo dpkg -l '*maas*'|cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================-==============================-============-=================================================
ii maas 2.1.1+bzr5544-0ubuntu1~16.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cli 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS client and command-line interface
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS server common files
ii maas-dhcp 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS DHCP server
ii maas-dns 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS DNS server
ii maas-proxy 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.1.1+bzr5544-0ubuntu1~16.04.1 all Rack Controller for MAAS
ii maas-region-api 2.1.1+bzr5544-0ubuntu1~16.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.1.1+bzr5544-0ubuntu1~16.04.1 all Region Controller for MAAS
un maas-region-controller-min <none> <none> (no description available)
un python-django-maas <none> <none> (no description available)
un python-maas-client <none> <none> (no description available)
un python-maas-provisioningserver <none> <none> (no description available)
ii python3-django-maas 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.1.1+bzr5544-0ubuntu1~16.04.1 all MAAS server provisioning libraries (Python 3)

Robert Brenneman (rjbrenn) wrote :

attaching maas sosreport

Blake Rouse (blake-rouse) wrote :

Seems like curtin is not installing the shimx64.efi. Can you mark the node broken and then boot it into rescue mode, to check what was actually placed in the /boot partition on the machine?

Blake Rouse (blake-rouse) wrote :

Also:

Has this system always booted UEFI?
Does it netboot UEFI?
Have you tried recommissioning the machine and then deploying again?

Robert Brenneman (rjbrenn) wrote :

the shim does get installed:

ubuntu@fp2u29:~$ sudo su -
root@fp2u29:~# mount /dev/sda2 /mnt
root@fp2u29:~# mount /dev/sda1 /mnt/boot
root@fp2u29:~# cd /mnt/boot
root@fp2u29:/mnt/boot# find
.
./EFI
./EFI/ubuntu
./EFI/ubuntu/shimx64.efi
./EFI/ubuntu/grubx64.efi
./EFI/ubuntu/MokManager.efi
./EFI/ubuntu/grub.cfg

Also - here's the current boot list on this box:

root@fp2u29:~# mount -o bind /sys /mnt/sys
root@fp2u29:~# mount -o bind /proc /mnt/proc
root@fp2u29:~# mount -o bind /dev /mnt/dev
root@fp2u29:~# chroot /mnt
root@fp2u29:/# efibootmgr -v
Timeout: 10 seconds
BootOrder: 0000,0002
Boot0000* CD/DVD Rom PciRoot(0x0)/Pci(0x1d,0x0)/USB(0,0)/USB(0,0)/USB(2,0)
Boot0002* PXE Network PciRoot(0x0)/Pci(0x1c,0x0)/Pci(0x0,0x0)/MAC(40f2e922a99a,0)/IPv4(0.0.0.0:0<->0.0.0.0:0,0,0)

These machines have always booted via UEFI. They previously had a RHEL 6 install and, as mentioned before, if I install Xenial 16.04.1 off the DVD the installer creates the UEFI boot record and the system boots fine.

The PXE boot statements are also UEFI based net boots - it does not get to legacy PXE boot BIOS screens when installing or commissioning.

I also released, recommissioned, and reinstalled, and it ended up in the same state.
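
The boot list above contains only the CD/DVD and PXE entries, with no entry pointing at the local disk. A minimal sketch of how that check could be automated against `efibootmgr -v` output follows. The function names are hypothetical, and the heuristic (a disk-backed entry shows an `HD(` or `File(` device-path node) is an assumption based on the usual efibootmgr output format.

```python
import re

# Output copied from the efibootmgr -v listing above.
SAMPLE = """\
Timeout: 10 seconds
BootOrder: 0000,0002
Boot0000* CD/DVD Rom PciRoot(0x0)/Pci(0x1d,0x0)/USB(0,0)/USB(0,0)/USB(2,0)
Boot0002* PXE Network PciRoot(0x0)/Pci(0x1c,0x0)/Pci(0x0,0x0)/MAC(40f2e922a99a,0)/IPv4(0.0.0.0:0<->0.0.0.0:0,0,0)
"""

# Matches "Boot0000* description..." but not "BootOrder:" or "BootCurrent:".
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_boot_entries(text):
    """Map each boot entry number (e.g. "0000") to its description string."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries[match.group(1)] = match.group(3)
    return entries


def has_local_disk_entry(text):
    """Heuristic: True if any entry has an HD(...) or File(...) path node."""
    return any("HD(" in desc or "File(" in desc
               for desc in parse_boot_entries(text).values())


if __name__ == "__main__":
    print(sorted(parse_boot_entries(SAMPLE)))  # ['0000', '0002']
    print(has_local_disk_entry(SAMPLE))        # False
```

For the listing in this comment the check returns False, which matches the observed behaviour: the firmware has nothing to fall back to except PXE.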

Robert Brenneman (rjbrenn) wrote :
Andres Rodriguez (andreserl) wrote :

We believe this issue has already been fixed in the latest releases of MAAS. If you believe this issue is still present, please re-open this bug report (set it back to 'New') and provide more detailed information (MAAS version, postgresql logs, maas logs).

Thanks.

Changed in maas:
status: New → Invalid