Deployment fails if server's EFI variable storage is full

Bug #1724989 reported by Rod Smith
Affects  Status   Importance  Assigned to  Milestone
MAAS     Invalid  Undecided   Unassigned
curtin   Expired  Undecided   Unassigned

Bug Description

Deployment can fail if a server's EFI variable storage is full. Unfortunately, I lack most relevant logs, since the error went away while I was investigating it, and a subsequent deployment worked; however, I'm pretty confident of the cause: When calling efibootmgr to add a local-disk boot variable and/or set the boot variable order, efibootmgr returned an error condition, which caused the deployment to fail. I don't recall the exact message, but in a deployment, there was a message to the effect that a call to efibootmgr had failed, which appeared to trigger the deployment failure. In my experiments, I booted an Ubuntu Artful desktop image and tried running "efibootmgr -o {a sensible boot order}", which returned:

could not set BootOrder: No space left on device

This error refers to an out-of-space condition on the system's NVRAM, blocking a change in the BootOrder variable. On a subsequent boot, the system deployed correctly. Perhaps a normal garbage collection by the EFI fixed it, or perhaps a change I made to the firmware settings cleared the problem. In either event, I lost the exact MAAS installation logs.
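There is no portable way to query how much NVRAM is free, but the contents of efivarfs give a rough idea of how much variable storage is in use. The helper below is a diagnostic sketch only: the function name and default path are illustrative, and the true capacity and garbage-collection behaviour are firmware-specific.

```python
import os

def efivar_usage_bytes(path="/sys/firmware/efi/efivars"):
    """Sum the sizes of files in efivarfs as a rough proxy for how much
    EFI variable storage is in use. Returns None on non-EFI systems."""
    try:
        names = os.listdir(path)
    except OSError:
        return None  # efivarfs not mounted (e.g. BIOS boot, container)
    total = 0
    for name in names:
        try:
            total += os.path.getsize(os.path.join(path, name))
        except OSError:
            pass  # a variable can disappear between listdir and stat
    return total
```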

Failing the installation upon a failure of the "efibootmgr -o" command is an unnecessarily strict condition, IMHO, since if the system booted to the MAAS installer, we know that PXE-booting works. Adding a boot entry for the local disk and adjusting the boot order to boot from the network is done so that the system can continue to boot if the MAAS server goes down; but if these operations fail, it seems to me that it's better to reboot and (if the system comes up) call the installation a success -- but ideally to flag the system with a warning that the boot order may be set incorrectly or that the system might fail to boot if the MAAS server goes down, depending on which efibootmgr call failed.
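The warn-and-continue behaviour suggested above could look like the following. This is not what curtin does today, just a minimal sketch; the helper name `set_boot_order` is hypothetical, and the boot-order string is whatever the installer would normally pass to `efibootmgr -o`.

```python
import subprocess

def set_boot_order(order):
    """Run `efibootmgr -o <order>`, downgrading a failure (for example a
    full NVRAM) to a warning instead of aborting the installation.
    Returns True on success, False if the boot order could not be set."""
    try:
        subprocess.run(["efibootmgr", "-o", order],
                       check=True, capture_output=True, text=True)
    except (subprocess.CalledProcessError, OSError) as exc:
        print("WARNING: could not set BootOrder (%s); the machine may not "
              "boot automatically if the MAAS server is unavailable." % exc)
        return False
    return True
```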

I'm attaching the /var/log/maas directory tree from the server. The node that experienced the problem is oil-prunus. Here's the MAAS package version information:

$ dpkg -l '*maas*'|cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================-====================================-============-==================================================
ii maas 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cert-server 0.2.30-0~76~ubuntu16.04.1 all Ubuntu certification support files for MAAS server
ii maas-cli 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS client and command-line interface
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS server common files
ii maas-dhcp 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS DHCP server
ii maas-dns 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS DNS server
ii maas-proxy 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all Rack Controller for MAAS
ii maas-region-api 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all Region Controller for MAAS
un maas-region-controller-min <none> <none> (no description available)
un python-django-maas <none> <none> (no description available)
un python-maas-client <none> <none> (no description available)
un python-maas-provisioningserver <none> <none> (no description available)
ii python3-django-maas 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.2.2-6099-g8751f91-0ubuntu1~16.04.1 all MAAS server provisioning libraries (Python 3)

Revision history for this message
Rod Smith (rodsmith) wrote :
Andres Rodriguez (andreserl) wrote :

Hey Rod, the installation log would be helpful, but without it there's not much we can do, I'm afraid!

Changed in maas:
status: New → Incomplete
milestone: none → 2.3.x
Rod Smith (rodsmith) wrote :

I understand. We're running regression tests, so there's a chance this bug will appear on another server, in which case I'll be sure to grab the installation log.

Ryan Harper (raharper) wrote :

Is there a command that we can use to trigger the "clean-up"? What recourse does curtin have if it attempts to handle the failure?

Changed in curtin:
status: New → Incomplete
Blake Rouse (blake-rouse) wrote :

I think catching the failure and continuing would be bad as well. There is an expectation that the system will boot when MAAS is down; that expectation would go away if the call failed and we silently skipped it.

It seems that if we can determine the actual cause, and it was not enough space in the NVRAM, we might need to remove a boot entry or something to make space for the new entry.
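Removing a stale entry to make space could be sketched as below. The helpers `list_boot_entries` and `delete_boot_entry` are hypothetical; the parsing assumes efibootmgr's usual `BootXXXX* label` output lines, and deletion (`efibootmgr -b NNNN -B`) is destructive, so any real implementation would need to be sure an entry is actually stale first.

```python
import re
import subprocess

# efibootmgr prints entries as lines like "Boot0000* ubuntu".
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})\*?\s+(.*)$", re.MULTILINE)

def list_boot_entries():
    """Return (number, label) pairs parsed from `efibootmgr` output,
    or an empty list if the tool is unavailable or fails."""
    try:
        proc = subprocess.run(["efibootmgr"], capture_output=True, text=True)
    except OSError:
        return []  # efibootmgr not installed, or not an EFI system
    return ENTRY_RE.findall(proc.stdout)

def delete_boot_entry(number):
    """Delete BootNNNN via `efibootmgr -b NNNN -B`. Destructive: verify
    the entry is stale (e.g. with `efibootmgr -v`) before calling this."""
    subprocess.run(["efibootmgr", "-b", number, "-B"], check=True)
```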

Rod Smith (rodsmith) wrote :

AFAIK, there's no way for an OS to reliably trigger garbage collection in the firmware. Even if there were, that action would likely occur only after a reboot of the server. Thus, there's very little that MAAS can do to clear the error; the best would be to reboot the server and HOPE that it performs garbage collection.

I understand that continuing in the case of this error is POTENTIALLY bad in the future, but failing the deployment is DEFINITELY bad in the present. I guess it boils down to what type of failure is worse. For my own purposes, I'd rather have the node boot now, even if it might fail in the future should the MAAS server go down. If the node were mission-critical hardware for a business, though, I might prefer to debug the problem now rather than risk a failure later. (OTOH, having either a MAAS server or whatever the node would be as a single point of failure sounds like a bad design.) Hence my suggestion that MAAS allow the node to fully deploy but present a warning of some type -- but I don't know if MAAS is really set up to handle this type of warning. With a deployed node and a warning, the administrator could log into the node to investigate further.

Launchpad Janitor (janitor) wrote :

[Expired for curtin because there has been no activity for 60 days.]

Changed in curtin:
status: Incomplete → Expired
Andres Rodriguez (andreserl) wrote :

Hi!

**This is an automated message**

We believe this may no longer be an issue in the latest MAAS release. Due to the age of the original bug report, we are marking it as Invalid. If you believe this bug is still valid against the latest release of MAAS, or if you are still interested in it, please re-open this bug report.

Thanks

Changed in maas:
status: Incomplete → Invalid