nova stop/start or reboot --hard resets uefi nvram

Bug #1633447 reported by Derek Higgins
This bug affects 2 people
Affects                   Status       Importance  Assigned to  Milestone
OpenStack Compute (nova)  In Progress  Undecided   Jack Ding

Bug Description

When using nova to boot UEFI instances, the nvram is cleared in certain circumstances.

e.g. on a deployed node my nvram is set to boot from the grub installed on the EFI partition:

[root@t1 boot]# efibootmgr
Timeout: 0 seconds
BootOrder: 0004,0002,0000,0001,0003
Boot0000* EFI Floppy
Boot0001* EFI Floppy 1
Boot0002* EFI Hard Drive
Boot0003* EFI Network
Boot0004* centos

This works; I can run
> nova reboot dbdc6b36-1f17-4722-89e5-117986b10059

But if I run nova reboot --hard, or nova stop followed by nova start, the libvirt domain is redefined and the nvram is reset as part of that process. The boot then stalls at the boot menu and I have to select boot from file.

[root@t1 boot]# efibootmgr
Timeout: 0 seconds
BootOrder: 0002,0000,0001,0003
Boot0000* EFI Floppy
Boot0001* EFI Floppy 1
Boot0002* EFI Hard Drive
Boot0003* EFI Network
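
To make the difference between the two listings explicit, here is a small Python sketch that parses efibootmgr-style output and reports which boot entries disappeared. The function names parse_efibootmgr and lost_entries are made up for this illustration, and the sample text is simply the two listings from this report.

```python
import re


def parse_efibootmgr(output):
    """Return (boot_order, entries) parsed from efibootmgr text output."""
    order = []
    entries = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("BootOrder:"):
            # "BootOrder: 0004,0002" -> ["0004", "0002"]
            order = [n.strip() for n in line.split(":", 1)[1].split(",") if n.strip()]
            continue
        # "Boot0004* centos" -> entries["0004"] = "centos"
        match = re.match(r"^Boot([0-9A-Fa-f]{4})\*?\s+(.*)$", line)
        if match:
            entries[match.group(1)] = match.group(2)
    return order, entries


def lost_entries(before, after):
    """Return the boot entries present in `before` but missing from `after`."""
    _, old = parse_efibootmgr(before)
    _, new = parse_efibootmgr(after)
    return {num: label for num, label in old.items() if num not in new}


BEFORE = """\
Timeout: 0 seconds
BootOrder: 0004,0002,0000,0001,0003
Boot0000* EFI Floppy
Boot0001* EFI Floppy 1
Boot0002* EFI Hard Drive
Boot0003* EFI Network
Boot0004* centos
"""

AFTER = """\
Timeout: 0 seconds
BootOrder: 0002,0000,0001,0003
Boot0000* EFI Floppy
Boot0001* EFI Floppy 1
Boot0002* EFI Hard Drive
Boot0003* EFI Network
"""

if __name__ == "__main__":
    print(lost_entries(BEFORE, AFTER))  # {'0004': 'centos'}
```

For the listings above, the only entry lost is Boot0004 (centos), which is the one the guest OS installer had written into the nvram.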

Tags: libvirt uefi
Sean Dague (sdague) wrote :

Are these baremetal instances? Should this really be an Ironic issue?

Changed in nova:
status: New → Incomplete
Derek Higgins (derekh) wrote :

No, these are nova instances (I just happened to be testing ironic at the time).

Any instance booted with UEFI has its nvram reset when you do a "nova stop" then "nova start" as the libvirt domain gets redefined along with a new nvram.
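
As a rough illustration of the behaviour that would avoid this, the sketch below creates a per-instance variable store from a template only when none exists yet, and otherwise leaves the existing file (and the boot entries the guest wrote into it) alone. This is not nova's or libvirt's actual code; the function name ensure_nvram and the file names are hypothetical, and the whole thing is an assumption about one possible approach.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_nvram(template: Path, nvram: Path) -> bool:
    """Create the per-instance variable store only if it is missing.

    Returns True if a fresh copy was made from the template, and False
    if an existing store was kept untouched.
    """
    if nvram.exists():
        # Keep the guest's boot entries across a redefine.
        return False
    nvram.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(template, nvram)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        template = root / "VARS_template.fd"       # hypothetical template file
        template.write_bytes(b"pristine")
        nvram = root / "nvram" / "instance_VARS.fd"  # hypothetical per-instance store

        print(ensure_nvram(template, nvram))  # first define: copied from template
        nvram.write_bytes(b"guest-boot-entries")  # guest adds its boot entry
        print(ensure_nvram(template, nvram))  # redefine: existing store is kept
        print(nvram.read_bytes())
```

A sketch like this only covers the case where the store is still on the same host's disk.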

Changed in nova:
status: Incomplete → New
Sam Song (samsong8610) wrote :

I have a similar issue. I can boot an instance using UEFI on an aarch64 ARM server successfully, but I can't delete it. There is an error in the nova compute service log:

2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [req-49fb4cc9-485e-4861-8cb3-610fc41ec317 cf6960596f614a6caad26d12c38e4b0f a26a8a0692744742beccb71b2f319d75 - - -] [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] Setting instance vm_state to ERROR
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] Traceback (most recent call last):
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2420, in do_terminate_instance
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] self._delete_instance(context, instance, bdms, quotas)
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/nova/hooks.py", line 154, in inner
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] rv = f(*args, **kwargs)
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2383, in _delete_instance
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] quotas.rollback()
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] self.force_reraise()
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] six.reraise(self.type_, self.value, self.tb)
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2347, in _delete_instance
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] self._shutdown_instance(context, instance, bdms)
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-43c1-aff9-d55af0c00ace] File "/home/venusource/src/openstack/nova/.venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2249, in _shutdown_instance
2016-12-09 12:05:12.705 9467 ERROR nova.compute.manager [instance: 6a1b9579-e9f4-...

Matt Riedemann (mriedem) wrote :

We have 0 CI testing for libvirt + uefi scenarios so while it was accepted as a feature a couple of releases ago, it's not validated in any meaningful way anywhere in our integration test system. So I wouldn't be surprised that there are bugs. Patches are welcome from those interested in making this work.

tags: added: libvirt uefi
summary: - nova stop/start or reboot --hard rests uefi nvram
+ nova stop/start or reboot --hard resets uefi nvram
赵明俊 (falseuser)
Changed in nova:
assignee: nobody → 赵明俊 (falseuser)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/471706

Changed in nova:
status: New → In Progress
Matt Riedemann (mriedem) wrote :

Search for "UEFI variable store file" in the starlingx diff:

https://github.com/starlingx-staging/stx-nova/commit/71acfeae0d1c59fdc77704527d763bd85a276f9a

And you'll see a few things they did which I think address this bug.

赵明俊 (falseuser)
Changed in nova:
assignee: 赵明俊 (falseuser) → nobody
赵明俊 (falseuser) wrote :

It looks like stx-nova does not handle the case where an instance on shared storage is booted from another host.

Adam Spiers (adam.spiers) wrote :

Bug #1785123 sounds related to this.

Changed in nova:
assignee: nobody → Jack Ding (jackding)
Changed in nova:
assignee: Jack Ding (jackding) → Chris Friesen (cbf123)
Changed in nova:
assignee: Chris Friesen (cbf123) → yao (yaozhou)
Changed in nova:
assignee: yao (yaozhou) → Boxiang Zhu (bxzhu-5355)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.opendev.org/471706
Reason: This is old, in merge conflict and has had a -1 for a long time so I'm going to abandon this. There is a newer attempt at fixing this here from the windriver/starlingx team:

https://review.opendev.org/#/c/621646/

Matt Riedemann (mriedem)
Changed in nova:
assignee: Boxiang Zhu (bxzhu-5355) → Jack Ding (jackding)