Nova/Placement creating x86 trait for ARM Compute node

Bug #2062425 reported by Sam Schmitt
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
===========
I have a 2023.2-based deployment with both x86 and aarch64 compute nodes. For the ARM node, placement shows it as having an x86 HW trait, causing scheduling of ARM-architecture images onto it to fail. It also causes the scheduler to try to place x86 images onto it, which will fail.

Steps to reproduce
==================
1. Deploy a new 2023.2 cloud with Kolla-ansible.
2. Add hw_architecture=aarch64 to a valid Glance image (example CLI commands after the log excerpt below).
3. Ensure that image_metadata_prefilter = True is set in nova.conf on all nova services.
4. Try to deploy an instance with that image; it will fail with no valid host found.
5. Observe the following in the placement-api logs:

placement-api.log:41054:2024-04-18 20:39:04.271 21 DEBUG placement.requestlog [req-0114c318-5dfd-4588-807b-e591a82ce098 req-bd588ea0-5700-4b8e-a43f-0eb15a7275e8 - - - - - -] Starting request: 10.27.10.33 "GET /allocation_candidates?limit=1000&member_of=in%3Aceceb7fb-e0ed-4304-a69f-b327da7ca63f&resources=DISK_GB%3A60%2CMEMORY_MB%3A8192%2CVCPU%3A4&root_required=HW_ARCH_AARCH64%2C%21COMPUTE_STATUS_DISABLED" __call__ /var/lib/kolla/venv/lib/python3.10/site-packages/placement/requestlog.py:55

placement-api.log:41055:2024-04-18 20:39:04.317 21 DEBUG placement.objects.research_context [req-0114c318-5dfd-4588-807b-e591a82ce098 req-bd588ea0-5700-4b8e-a43f-0eb15a7275e8 8ce24731fb34492c9354f05050216395 c48da85ca48f4296b59bacb7b3c2fdfd - - default default] found no providers satisfying required traits: {'HW_ARCH_AARCH64'} and forbidden traits: {'COMPUTE_STATUS_DISABLED'} _process_anchor_traits /var/lib/kolla/venv/lib/python3.10/site-packages/placement/objects/research_context.py:243
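For reference, steps 2-4 can be reproduced with the standard OpenStack CLI; a rough sketch, where the image, flavor and network names are placeholders:

openstack image set --property hw_architecture=aarch64 my-aarch64-image
# enable the prefilter ([scheduler] image_metadata_prefilter = True in nova.conf
# on the nova services) and restart them, then try to boot from the tagged image:
openstack server create --image my-aarch64-image --flavor m1.medium --network my-net arm-test-vm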

Resource providers:
openstack resource provider list
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+
| uuid | name | generation | root_provider_uuid | parent_provider_uuid |
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+
| a6aa43fb-c819-4dae-b172-b5ed76901591 | infra-prod-compute-04 | 7 | a6aa43fb-c819-4dae-b172-b5ed76901591 | None |
| 2a019b35-25ac-4085-a13d-07802bda6828 | infra-prod-compute-03 | 10 | 2a019b35-25ac-4085-a13d-07802bda6828 | None |
| a008c58b-d16c-4b80-8f58-ca96d1fce2a3 | infra-prod-compute-05 | 7 | a008c58b-d16c-4b80-8f58-ca96d1fce2a3 | None |
| e97340aa-5848-4939-a409-701e5ad52396 | infra-prod-compute-02 | 31 | e97340aa-5848-4939-a409-701e5ad52396 | None |
| 9345e4d0-fc49-4e51-9f38-faeabec1b053 | infra-prod-compute-01 | 18 | 9345e4d0-fc49-4e51-9f38-faeabec1b053 | None |
| 41611dae-3006-4449-9c8b-3369d9b0feb8 | infra-prod-compile-01 | 5 | 41611dae-3006-4449-9c8b-3369d9b0feb8 | None |
| 7fecff4c-9e2d-4d89-a345-91ab4d8c1857 | infra-prod-compile-02 | 5 | 7fecff4c-9e2d-4d89-a345-91ab4d8c1857 | None |
| fbd4030a-1cc9-455a-bca2-2b606fcb3c4d | infra-prod-compile-03 | 5 | fbd4030a-1cc9-455a-bca2-2b606fcb3c4d | None |
| 4d3b29fd-0048-4768-93fa-b7a98f81c125 | infra-prod-compute-06 | 9 | 4d3b29fd-0048-4768-93fa-b7a98f81c125 | None |
| f888bda6-8fb7-4f84-8b87-c9af3b36a6ae | infra-prod-compute-07 | 7 | f888bda6-8fb7-4f84-8b87-c9af3b36a6ae | None |
| 4f53c8d0-bf1d-44d3-89d5-b8f5436ee66a | infra-prod-compile-04 | 5 | 4f53c8d0-bf1d-44d3-89d5-b8f5436ee66a | None |
| 7b6a42c8-b9b4-44a6-9111-2f732c7074e1 | infra-prod-compile-05 | 5 | 7b6a42c8-b9b4-44a6-9111-2f732c7074e1 | None |
| 8312a824-8d88-4646-9eb5-c4937329dab9 | infra-prod-compute-08 | 4 | 8312a824-8d88-4646-9eb5-c4937329dab9 | None |
| 9e60caa5-28ed-4719-aaf5-690b111f17fd | infra-prod-compute-09 | 4 | 9e60caa5-28ed-4719-aaf5-690b111f17fd | None |
| cbfef7fd-b910-4d77-b448-70cdb9638967 | infra-prod-compute-10 | 4 | cbfef7fd-b910-4d77-b448-70cdb9638967 | None |
| d7efda90-b91c-419f-b0be-0f339f37653a | infra-prod-compute-11 | 4 | d7efda90-b91c-419f-b0be-0f339f37653a | None |
| 067f20f4-f513-465e-9e32-e505a97ab165 | infra-prod-compute-12 | 4 | 067f20f4-f513-465e-9e32-e505a97ab165 | None |
| 57a098bf-31d4-4e4f-9a28-72a925d2384c | infra-prod-arm-compute-01 | 12 | 57a098bf-31d4-4e4f-9a28-72a925d2384c | None |
| 632c23d6-63df-4143-9d4c-deb2bdc94c80 | infra-prod-compute-13 | 4 | 632c23d6-63df-4143-9d4c-deb2bdc94c80 | None |
| 0fe3d535-8aec-4307-943e-2c46b01bc019 | infra-prod-compute-14 | 4 | 0fe3d535-8aec-4307-943e-2c46b01bc019 | None |
| 8f60a0e9-2510-48ce-b305-6937314bac4a | infra-prod-compute-15 | 4 | 8f60a0e9-2510-48ce-b305-6937314bac4a | None |
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+

Traits shown for the ARM node (note there is no HW_ARCH_AARCH64):
openstack resource provider trait list 57a098bf-31d4-4e4f-9a28-72a925d2384c
+---------------------------------------+
| name |
+---------------------------------------+
| COMPUTE_IMAGE_TYPE_QCOW2 |
| COMPUTE_ADDRESS_SPACE_EMULATED |
| COMPUTE_NET_VIF_MODEL_VMXNET3 |
| COMPUTE_GRAPHICS_MODEL_NONE |
| COMPUTE_IMAGE_TYPE_ISO |
| COMPUTE_DEVICE_TAGGING |
| COMPUTE_NET_VIF_MODEL_NE2K_PCI |
| COMPUTE_GRAPHICS_MODEL_VIRTIO |
| COMPUTE_RESCUE_BFV |
| COMPUTE_STORAGE_BUS_VIRTIO |
| COMPUTE_STORAGE_BUS_SCSI |
| COMPUTE_GRAPHICS_MODEL_VGA |
| COMPUTE_IMAGE_TYPE_AMI |
| COMPUTE_NET_VIF_MODEL_E1000 |
| COMPUTE_STORAGE_BUS_SATA |
| COMPUTE_NET_VIF_MODEL_PCNET |
| COMPUTE_NET_ATTACH_INTERFACE |
| HW_CPU_X86_AESNI |
| COMPUTE_STORAGE_BUS_USB |
| COMPUTE_ADDRESS_SPACE_PASSTHROUGH |
| COMPUTE_NET_VIF_MODEL_RTL8139 |
| COMPUTE_NET_ATTACH_INTERFACE_WITH_TAG |
| COMPUTE_VOLUME_ATTACH_WITH_TAG |
| COMPUTE_TRUSTED_CERTS |
| COMPUTE_IMAGE_TYPE_AKI |
| COMPUTE_VIOMMU_MODEL_SMMUV3 |
| COMPUTE_STORAGE_BUS_FDC |
| COMPUTE_VIOMMU_MODEL_AUTO |
| COMPUTE_VOLUME_EXTEND |
| COMPUTE_SOCKET_PCI_NUMA_AFFINITY |
| COMPUTE_NET_VIF_MODEL_E1000E |
| COMPUTE_NODE |
| COMPUTE_ACCELERATORS |
| COMPUTE_IMAGE_TYPE_RAW |
| COMPUTE_VOLUME_MULTI_ATTACH |
| COMPUTE_IMAGE_TYPE_ARI |
| COMPUTE_GRAPHICS_MODEL_BOCHS |
| COMPUTE_NET_VIF_MODEL_SPAPR_VLAN |
| COMPUTE_GRAPHICS_MODEL_CIRRUS |
| COMPUTE_GRAPHICS_MODEL_VMVGA |
| COMPUTE_NET_VIF_MODEL_VIRTIO |
| COMPUTE_VIOMMU_MODEL_VIRTIO |
+---------------------------------------+

Confirmation that it is an ARM-based system:
root@infra-prod-arm-compute-01:/etc/kolla/nova-libvirt# uname -a
Linux infra-prod-arm-compute-01 5.15.0-102-generic #112-Ubuntu SMP Tue Mar 5 16:49:56 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

On startup of the nova-compute service on this node, the libvirt output in the log confirms this:
2024-04-18 21:47:43.978 7 INFO nova.service [-] Starting compute node (version 28.0.2)
2024-04-18 21:47:44.000 7 INFO nova.virt.node [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Determined node identity 57a098bf-31d4-4e4f-9a28-72a925d2384c from /var/lib/nova/compute_id
2024-04-18 21:47:44.021 7 INFO nova.virt.libvirt.driver [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Connection event '1' reason 'None'
2024-04-18 21:47:44.460 7 INFO nova.virt.libvirt.host [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Libvirt host capabilities <capabilities>

  <host>
    <uuid>38393550-3736-4753-4833-3334564b5842</uuid>
    <cpu>
      <arch>aarch64</arch>
      <model>Neoverse-N1</model>
      <vendor>ARM</vendor>
      <topology sockets='1' dies='1' cores='128' threads='1'/>

nova.conf for the nova-compute service on that node:
[DEFAULT]
debug = False
log_dir = /var/log/kolla/nova
state_path = /var/lib/nova
allow_resize_to_same_host = true
compute_driver = libvirt.LibvirtDriver
my_ip = <ip>
transport_url = rabbit://<url>
default_schedule_zone = nova

[conductor]
workers = 5

[vnc]
novncproxy_host = <ip>
novncproxy_port = 6080
server_listen = <ip>
server_proxyclient_address = <ip>
novncproxy_base_url = https://example.com:6080/vnc_lite.html

[serial_console]
enabled = true
base_url = wss://example.com:6083/
serialproxy_host = <ip>
serialproxy_port = 6083
proxyclient_address = <ip>

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[glance]
debug = False
api_servers = http://<ip>:9292
cafile =
num_retries = 3

[cinder]
catalog_info = volumev3:cinderv3:internalURL
os_region_name = RegionOne
auth_url = http://<ip>:5000
auth_type = password
project_domain_name = Default
user_domain_id = default
project_name = service
username = cinder
password = <pw>
cafile =

[neutron]
metadata_proxy_shared_secret = <secret>
service_metadata_proxy = true
auth_url = http://<ip>:5000
auth_type = password
cafile =
project_domain_name = Default
user_domain_id = default
project_name = service
username = neutron
password = <pw>
region_name = Westford
valid_interfaces = RegionOne

[libvirt]
connection_uri = qemu+tcp://<ip>/system
live_migration_inbound_addr = <ip>
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
disk_cachemodes = network=writeback
hw_disk_discard = unmap
rbd_secret_uuid = 48d56060-bcf0-4f94-bee8-83ab18eaabbd
virt_type = kvm
cpu_mode = host-passthrough
num_pcie_ports = 16

[workarounds]
skip_cpu_compare_on_dest = True

[upgrade_levels]
compute = auto

[oslo_messaging_notifications]
transport_url = rabbit://<url>
driver = messagingv2
topics = notifications_designate

[oslo_messaging_rabbit]
heartbeat_in_pthread = false
amqp_durable_queues = true

[privsep_entrypoint]
helper_command = sudo nova-rootwrap /etc/nova/rootwrap.conf privsep-helper --config-file /etc/nova/nova.conf

[guestfs]
debug = False

[placement]
auth_type = password
auth_url = http://<ip>:5000
username = placement
password = <pw>
user_domain_name = Default
project_name = service
project_domain_name = Default
region_name = RegionOne
cafile =
valid_interfaces = internal

[notifications]
notify_on_state_change = vm_and_task_state

[barbican]
auth_endpoint = http://<ip>:5000
barbican_endpoint_type = internal
verify_ssl_path =

[service_user]
send_service_user_token = true
auth_url = http://<ip>:5000
auth_type = password
project_domain_id = default
user_domain_id = default
project_name = service
username = nova
password = <pw>
cafile =
region_name = RegionOne
valid_interfaces = internal

[scheduler]
image_metadata_prefilter = True

I have tried running openstack resource provider trait delete 57a098bf-31d4-4e4f-9a28-72a925d2384c to delete all traits and then restarting nova_compute on this compute node, but the same traits come back.

Sam Schmitt (samcat116)
summary: - Placement creating x86 trait for ARM Compute node
+ Nova/Placement creating x86 trait for ARM Compute node
Revision history for this message
Takashi Kajinami (kajinamit) wrote :

I read the series of changes which implemented the architecture selection feature, but the changes do not include one to make nova-compute report the available HW_ARCH trait. So IIUC you have to add the trait manually to tag compute nodes which support a specific CPU arch.

https://review.opendev.org/q/topic:%22bp/pick-guest-arch-based-on-host-arch-in-libvirt-driver%22

The HW_CPU_X86_AESNI trait is added because the libvirt driver detects the aes cpu feature flag in the result of the domain capabilities API. Ideally, we should probably make the architecture in the trait match the host's supported cpu architecture.
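As an illustrative aside (not from the original report): the aes feature flag is genuinely present on the host CPU here, which can be confirmed directly on the aarch64 node, e.g.:

grep -m1 Features /proc/cpuinfo   # the Features list on a Neoverse-N1 host with crypto extensions includes "aes"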

Revision history for this message
Sam Schmitt (samcat116) wrote :

I manually added the HW_ARCH_AARCH64 trait to this node and scheduling is now working. However, I will open a separate RFE to set this trait automatically, as an operator should not need to specify it manually.
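For others hitting this, a rough sketch of the manual workaround with the osc-placement CLI; note that "resource provider trait set" replaces the provider's entire trait list, so the existing traits must be passed again together with the new one:

RP=57a098bf-31d4-4e4f-9a28-72a925d2384c
EXISTING=$(openstack resource provider trait list -f value $RP | sed 's/^/--trait /')
openstack resource provider trait set $EXISTING --trait HW_ARCH_AARCH64 $RP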

Revision history for this message
sean mooney (sean-k-mooney) wrote :

so setting hw_architecture=aarch64 or hw_architecture=x86_64 by itself does not actually have any effect on scheduling of vms.

you need to enable the image_metadata_prefilter, but there was a bug in the original series where it did not update the libvirt driver to actually report the architecture traits.

so the issue is not with HW_CPU_X86_AESNI; it's the fact that the architecture trait is not reported today, and that breaks the scheduling support for hw_architecture=aarch64.

a workaround for now is to use the provider.yaml feature to advertise the trait.
https://docs.openstack.org/nova/latest/admin/managing-resource-providers.html
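A rough sketch of what that could look like, assuming the default provider_config_location of /etc/nova/provider_config/ on the ARM compute node; verify the exact schema, and whether a standard (non-CUSTOM) trait is accepted, against the linked docs for your release:

mkdir -p /etc/nova/provider_config
cat > /etc/nova/provider_config/arm_traits.yaml <<'EOF'
meta:
  schema_version: '1.0'
providers:
  - identification:
      uuid: '$COMPUTE_NODE'     # special value meaning this compute node's own provider
    traits:
      additional:
        - 'HW_ARCH_AARCH64'
EOF

nova-compute on that node then needs a restart to read the file and report the trait.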

the real fix is to update the libvirt driver to report both the hardware architecture with a HW_ prefix and the emulated architectures with a COMPUTE_ prefix, as was originally intended.
