CentOS7.6: Unable to launch vm with UEFI boot

Bug #1814335 reported by Yang Liu
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Austin Sun

Bug Description

Brief Description
-----------------
Failed to launch vm with UEFI boot.

Severity
--------
Major

Steps to Reproduce
------------------
# Create a UEFI glance image from a UEFI guest image.
glance image-create --property hw_firmware_type=uefi --visibility public --disk-format qcow2 --file /home/wrsroot//images/trusty-server-cloudimg-amd64-uefi1.img --container-format bare --name trusty-server-cloudimg-amd64-uefi1_auto --wait 1

# Create cinder volume off that image
cinder create --display-name vol-tenant2 --image-id 5ff8241e-56ff-4449-8c8a-8aa284b6fe02 5

# Create vm from above volume
nova boot --flavor 97270889-c05b-4832-825c-54771e9be72e --nic net-id=35838068-6b01-484b-a8f7-0f25fcb0e816 --nic net-id=00db947d-3855-498c-85e9-b5553301a7cc tenant2-sec-boot-vm-168 --poll --block-device id=209129b0-6843-4852-a9e1-05b20e0b89f8,bootindex=0,source=volume --block-device id=1c0c3dfb-6c88-46b1-9912-7d11bac982c7,bootindex=1,source=volume

Expected Behavior
------------------
VM launches successfully

Actual Behavior
----------------
[2019-02-01 03:46:49,902] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+-------------------------------------------------+
| Property | Value |
+--------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | WYc8QC7cbD6y |
| config_drive | |
| created | 2019-02-01T03:46:27Z |
| description | - |
| flavor:disk | 5 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | flavor-30 |
| flavor:ram | 1024 |
| flavor:swap | 0 |
| flavor:vcpus | 2 |
| hostId | |
| id | 9be9e52a-70a3-4ee7-9691-ce5a18427a17 |
| image | Attempt to boot from volume - no image supplied |
| key_name | keypair-tenant2 |
| locked | False |
| metadata | {} |
| name | tenant2-sec-boot-vm-168 |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | BUILD |
| tags | [] |
| tenant_id | f7bddf6ed41c48368d15ba993940a52a |
| updated | 2019-02-01T03:46:27Z |
| user_id | 01f8da72e2ea496a8299e0166fb6431f |
| wrs-if:nics | |
| wrs-res:pci_devices | |
| wrs-res:topology | node:-, 1024MB, pgsize:4K, vcpus:2 |
| wrs-res:vcpus | [2, 2, 2] |
| wrs-sg:server_group | |
+--------------------------------------+-------------------------------------------------+

Server building... 0% complete
Server building... 0% complete
Server building... 0% complete
Server building... 0% complete
Error building server
ERROR (ResourceInErrorState): `Server` resource is in the error state due to 'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 9be9e52a-70a3-4ee7-9691-ce5a18427a17. Last exception: UEFI is not supported'.
controller-1:~$

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any

Branch/Pull Time/Commit
-----------------------
f/centos76 as of 2019/01/30

Timestamp/Logs
--------------
Easy to reproduce. Test image attached.

Revision history for this message
Yang Liu (yliu12) wrote :
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Cindy Xie (xxie1)
Austin Sun (sunausti)
Changed in starlingx:
assignee: Cindy Xie (xxie1) → Austin Sun (sunausti)
Revision history for this message
Austin Sun (sunausti) wrote :

After analyzing nova/nova/virt/libvirt/driver.py, I found that the default UEFI loader paths are defined as:
DEFAULT_UEFI_LOADER_PATH = {
    "x86_64": "/usr/share/OVMF/OVMF_CODE.fd",
    "aarch64": "/usr/share/AAVMF/AAVMF_CODE.fd"
}

However, CentOS 7.6 upgrades OVMF to OVMF-20180508-3.gitee3198e672e2.el7.noarch, which changes the file layout.
/usr/share/OVMF/OVMF_CODE.fd is no longer shipped in the package, so nova's UEFI check fails.
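
The check can be illustrated outside nova with a small sketch. It assumes that UEFI support is decided by testing whether the default loader file for the host architecture exists; `has_uefi_loader` is a hypothetical helper written for this report, not a nova function.

```python
import os

# Paths copied from the nova snippet quoted in this comment.
DEFAULT_UEFI_LOADER_PATH = {
    "x86_64": "/usr/share/OVMF/OVMF_CODE.fd",
    "aarch64": "/usr/share/AAVMF/AAVMF_CODE.fd",
}


def has_uefi_loader(arch, exists=os.path.exists):
    """Hypothetical helper: True if the default loader file for arch exists.

    The `exists` parameter is injectable so the check can be exercised
    without touching the real filesystem.
    """
    path = DEFAULT_UEFI_LOADER_PATH.get(arch)
    return path is not None and exists(path)


if __name__ == "__main__":
    # On a host where the OVMF package no longer ships OVMF_CODE.fd,
    # the x86_64 entry is expected to report False.
    for arch in sorted(DEFAULT_UEFI_LOADER_PATH):
        print(arch, has_uefi_loader(arch))
```

Running it on a compute host (or, in a containerized deployment, inside the nova-compute container) shows whether the file is visible at the expected path.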

Options:
1) Downgrade OVMF to OVMF-20150414-2.gitc9e5618.el7.noarch.rpm (the CentOS 7.5 package).
2) Modify or upgrade the nova patch.

Changed in starlingx:
status: New → Confirmed
Revision history for this message
Austin Sun (sunausti) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-tools (f/centos76)

Reviewed: https://review.openstack.org/634591
Committed: https://git.openstack.org/cgit/openstack/stx-tools/commit/?id=553a053915c9ddbd18bed462465cad1079db844a
Submitter: Zuul
Branch: f/centos76

commit 553a053915c9ddbd18bed462465cad1079db844a
Author: Martin, Chen <email address hidden>
Date: Mon Feb 4 05:30:26 2019 +0800

    Revert OVMF-20150414-2.gitc9e5618.el7.noarch.rpm for check UEFI failed

    For openstack pike release, nove request to acces
    /usr/share/OVMF/OVMF_CODE.fd in nova/nova/virt/libvirt/driver.py,
    which is removed in upgraded OVMF-20180508-3.gitee3198e672e2.el7.noarch.rpm
    Rollback to previous OVMF package

    Closes-Bug: 1814335

    Change-Id: I2376bc7e0bbc21c61be3ef8964c527ddc7fcf250
    Signed-off-by: Martin, Chen <email address hidden>

tags: added: in-f-centos76
Ghada Khalil (gkhalil)
tags: added: stx.distro.other
tags: added: stx.2019.05
Changed in starlingx:
importance: Undecided → Medium
Revision history for this message
Austin Sun (sunausti) wrote :

Review 634591 was merged into branch f/centos76; it still needs to be rebased onto master.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as Fix Committed given that the fix was merged in the centos76 feature branch. This should be marked as Fix Released once the feature branch is merged to master.

Changed in starlingx:
status: Confirmed → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-tools (master)

Fix proposed to branch: master
Review: https://review.openstack.org/642485

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-tools (master)

Reviewed: https://review.openstack.org/642485
Committed: https://git.openstack.org/cgit/openstack/stx-tools/commit/?id=6ae17f4a76a9e76237a06cdc9290100d5aee203a
Submitter: Zuul
Branch: master

commit df50d0d5a0bd07472feb4ed698f05a94471a346f
Author: Luis Botello <email address hidden>
Date: Mon Feb 18 08:51:25 2019 -0600

    Adding GPG key for Storage SIG

    Adding RPM-GPG-KEY-CentOS-SIG-Storage GPG key which is needed for
    checking integrity of some CentOS packages.

    Closes-bug:1816814
    Change-Id: I36e00c1c38bf3556a6fea40d74db4164c8c256cb
    Signed-off-by: Luis Botello <email address hidden>

commit 553a053915c9ddbd18bed462465cad1079db844a
Author: Martin, Chen <email address hidden>
Date: Mon Feb 4 05:30:26 2019 +0800

    Revert OVMF-20150414-2.gitc9e5618.el7.noarch.rpm for check UEFI failed

    For openstack pike release, nove request to acces
    /usr/share/OVMF/OVMF_CODE.fd in nova/nova/virt/libvirt/driver.py,
    which is removed in upgraded OVMF-20180508-3.gitee3198e672e2.el7.noarch.rpm
    Rollback to previous OVMF package

    Closes-Bug: 1814335

    Change-Id: I2376bc7e0bbc21c61be3ef8964c527ddc7fcf250
    Signed-off-by: Martin, Chen <email address hidden>

commit a6d4b8391918e7576c9faab61f65e0568e6ddf09
Author: Shuicheng Lin <email address hidden>
Date: Wed Jan 30 23:48:29 2019 +0800

    update mellanox driver name with the new version

    Mellanox driver has been upgraded to 4.5.1 version in tarball_dl.lst
    Need update the name in populate_downloads.sh accordingly.

    Story: 2004521
    Task: 29193

    Change-Id: If4cb097f9ee2027f2e7eb6436057cdd607890f81
    Signed-off-by: Shuicheng Lin <email address hidden>

commit 3faee2cdb0a88dfad68f15d655ba560c09f8ebdd
Author: Shuicheng Lin <email address hidden>
Date: Mon Jan 28 21:32:23 2019 +0800

    Fix openvswitch script crash issue

    When run openvswitch script "dpdk-pmdinfo.py", it will crash with Error:
    "ImportError: cannot import name UBInt8". And this failure will cause
    dpdk always return not supported for ethernet adapter.
    It is due to new python-construct rpm doesn't have UBInt8.
    Roll back python-construct to fix it.

    Story: 2004522
    Task: 29124

    Change-Id: Iefa9d21bb81390b5bcf1d0605c0408b5869616f5
    Signed-off-by: Shuicheng Lin <email address hidden>

commit 494dd3f78e2a8233ea3df507b7a82891e27facd1
Author: Shuicheng Lin <email address hidden>
Date: Fri Jan 25 19:49:27 2019 +0800

    Roll back device-mapper-multipath version to fix build-iso failure

    new lvm2 rpm causes AIO duplex deploy failure, so lvm2 is kept with
    old version currently. device-mapper-multipath should be kept with
    old version also to avoid dependency failure.

    Move device-mapper packages to rpms_centos.lst since all packages
    could be found in centos repo.

    Story: 2004522
    Task: 29099

    Change-Id: I5cd4d434a629201934a48a551d4fb354f8d57318
    Signed-off-by: Shuicheng Lin <email address hidden>

commit 6f1506bc52401378d270fba66b9ab39887079665
Author: Shuicheng Lin <email address hidden>
Date: Fri Jan 11 03:0...

Changed in starlingx:
status: Fix Committed → Fix Released
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Revision history for this message
Wendy Mitchell (wmitchellwr) wrote :

This is still failing. The nova show output reports the failure reason:

'nova --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne show 3d194fbc-1fba-4d90-9dd0-8132766b0253'

| OS-EXT-SRV-ATTR:hostname | tenant2-ge-edge-shared-migrate-23 |
| OS-EXT-SRV-ATTR:instance_name | instance-000002cc |
| OS-EXT-SRV-ATTR:reservation_id | r-q3mbt23d |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | I2Nsb3VkLWNvbmZpZwpydW5jbWQ6Cg==
| OS-EXT-STS:power_state | 0
.
| OS-EXT-STS:vm_state | error
| created | 2019-04-07T14:26:21Z
| fault | {"message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 3d194fbc-1fba-4d90-9dd0-8132766b0253. Last exception: UEFI is not supported", "code": 500, "details": " File \"/var/lib/openstack/lib/python2.7/site-packages/nova/conductor/manager.py\", line 613, in build_instances
| | filter_properties, instances[0].uuid) |
| | File \"/var/lib/openstack/lib/python2.7/site-packages/nova/scheduler/utils.py\", line 742, in populate_retry |
| | raise exception.MaxRetriesExceeded(reason=msg) |
| | ", "created": "2019-04-07T14:26:36Z"}
| flavor:disk | 5 ...


Revision history for this message
Wendy Mitchell (wmitchellwr) wrote :

Load: 20190404T013000Z
(Lab: WP_3_7, Node Config: 2+3, Software Version: 19.01)

Revision history for this message
Wendy Mitchell (wmitchellwr) wrote :

The resource claim succeeds, but the instance fails to spawn.

nova-compute-compute-1-eae26dba-cvd7q_openstack_nova-compute-faed2a4bf4159ad89956d292e36ff4ce4abca9d17c0303b562b5a8e2c1f45131.log

{"log":"2019-04-07 14:26:32,427.427 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] Attempting claim on node compute-1: memory 1024 MB, disk 5 GB, vcpus 1 CPU\n","stream":"stdout","time":"2019-04-07T14:26:32.428263017Z"}

{"log":"2019-04-07 14:26:32,428.428 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] Total memory: 195288 MB, used: 0.00 MB\n","stream":"stdout","time":"2019-04-07T14:26:32.428470413Z"}

{"log":"2019-04-07 14:26:32,428.428 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] memory limit not specified, defaulting to unlimited\n","stream":"stdout","time":"2019-04-07T14:26:32.428653207Z"}

{"log":"2019-04-07 14:26:32,428.428 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] Total disk: 439 GB, used: 0.00 GB\n","stream":"stdout","time":"2019-04-07T14:26:32.428851158Z"}

{"log":"2019-04-07 14:26:32,428.428 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] disk limit: 439.00 GB, free: 439.00 GB\n","stream":"stdout","time":"2019-04-07T14:26:32.429030639Z"}

{"log":"2019-04-07 14:26:32,429.429 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] Total vcpu: 90 VCPU, used: 0.00 VCPU\n","stream":"stdout","time":"2019-04-07T14:26:32.429219334Z"}

{"log":"2019-04-07 14:26:32,429.429 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] vcpu limit not specified, defaulting to unlimited\n","stream":"stdout","time":"2019-04-07T14:26:32.429405442Z"}

{"log":"2019-04-07 14:26:32,430.430 109152 INFO nova.compute.claims [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d194fbc-1fba-4d90-9dd0-8132766b0253] Claim successful on node compute-1\n","stream":"stdout","time":"2019-04-07T14:26:32.430802875Z"}

{"log":"2019-04-07 14:26:32,754.754 109152 INFO nova.virt.libvirt.driver [req-ec64db06-94f3-4d41-b29a-ac0231fddaca ce7eff78764048ff946a8b776be67f82 ccccdcaffece490ebb9962e94b7bfef9 - default default] [instance: 3d19...
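
The container log lines above are JSON objects with `log`, `stream`, and `time` keys. A minimal sketch for pulling the readable messages out of such a file follows; `extract_messages` is a hypothetical helper, and the sample lines are abbreviated stand-ins rather than the full log entries.

```python
import json


def extract_messages(lines):
    """Yield (time, message) pairs from JSON-formatted container log lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            # Truncated or non-JSON lines are skipped rather than raising.
            continue
        if not isinstance(record, dict):
            continue
        yield record.get("time"), record.get("log", "").rstrip("\n")


# Abbreviated sample: one complete record and one truncated line.
sample = [
    '{"log":"Claim successful on node compute-1\\n","stream":"stdout",'
    '"time":"2019-04-07T14:26:32.430802875Z"}',
    '{"log":"truncated...',
]

for when, message in extract_messages(sample):
    print(when, message)
```

Truncated lines, such as the last one in this comment, are skipped by the sketch instead of aborting the scan.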


Yang Liu (yliu12)
Changed in starlingx:
status: Fix Released → Confirmed
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Revision history for this message
Austin Sun (sunausti) wrote :

This has a different root cause: nova-compute now runs in a container, and the container does not mount this file.

Revision history for this message
Austin Sun (sunausti) wrote :

I will check how to fix this issue.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/651949

Changed in starlingx:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/651949
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=c4766c0fde49392df27c3b1ef8d8e57b70a88471
Submitter: Zuul
Branch: master

commit c4766c0fde49392df27c3b1ef8d8e57b70a88471
Author: Sun Austin <email address hidden>
Date: Thu Apr 11 17:10:38 2019 +0800

    Mount OVMF path for nova-compute and libvirt to support uefi

    Override helm chart to mount OVMF path for nova-compute
    and libvirt pods to support uefi boot VM creating

    Closes-Bug: 1814335

    Change-Id: Ib876971ff096a68fd3a65ed37a8e295a475641d8
    Signed-off-by: Sun Austin <email address hidden>
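
The idea behind this fix can be sketched in generic Kubernetes terms: a host directory is exposed to a pod through a `hostPath` volume plus a matching `volumeMounts` entry. The sketch below only builds that generic pod-spec shape. The actual helm override keys used by the commit are not reproduced here, and the volume name `ovmf`, the directory `/usr/share/OVMF`, and the read-only flag are assumptions made for illustration.

```python
import json

# Assumed directory, taken from the loader path discussed in this report.
OVMF_DIR = "/usr/share/OVMF"


def host_path_mount(name, path, read_only=True):
    """Return (volume, volume_mount) dicts in Kubernetes pod-spec shape.

    Hypothetical helper: it mounts the host directory at the same path
    inside the container so that path-based checks see the same files.
    """
    volume = {"name": name, "hostPath": {"path": path}}
    volume_mount = {"name": name, "mountPath": path, "readOnly": read_only}
    return volume, volume_mount


volume, mount = host_path_mount("ovmf", OVMF_DIR)
print(json.dumps({"volumes": [volume], "volumeMounts": [mount]}, indent=2))
```

Mounting at the same path inside the container keeps the hard-coded loader path in nova valid without changing nova itself.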

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
Yang Liu (yliu12) wrote :

Verified: passed with master load "20190626T013000Z".

tags: removed: stx.retestneeded