instance always taking numa 0 first from host, even if flavor is configured to take memory from numa 1

Bug #2051479 reported by Subhajit Chatterjee
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

OpenStack version: Wallaby

Controller Details:
OS:

Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal

flavor metadata details:

| properties | hw:cpu_policy='dedicated', hw:mem_page_size='1GB', hw:numa_cpus.1='0,1,2,3,4,5,6,7,8,9', hw:numa_mem.1='102400', hw:numa_nodes='2' |
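
For reference, a flavor with these extra specs would typically be built along the following lines; the flavor name test-numa1 and the vCPU/RAM/disk sizes are illustrative, chosen to match the properties above:

# illustrative flavor; vCPU count and RAM size mirror the extra specs above
openstack flavor create test-numa1 --vcpus 10 --ram 102400 --disk 10
# dedicated pinning, 1 GiB hugepages and the guest NUMA layout from the report
openstack flavor set test-numa1 \
  --property hw:cpu_policy=dedicated \
  --property hw:mem_page_size=1GB \
  --property hw:numa_nodes=2 \
  --property hw:numa_cpus.1='0,1,2,3,4,5,6,7,8,9' \
  --property hw:numa_mem.1=102400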

Compute Details:

Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal

test@computedp:~$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46
node 0 size: 515544 MB
node 0 free: 307020 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47
node 1 size: 516060 MB
node 1 free: 306605 MB
node distances:
node 0 1
  0: 10 20
  1: 20 10

test@computedp:~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.4.0-125-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro intel_iommu=on iommu=pt isolcpus=2-47 nohz_full=2-47 default_hugepagesz=1G hugepagesz=1G hugepages=400
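
For reference, the per-NUMA-node availability of the 1 GiB hugepages configured above can be checked through sysfs; a minimal sketch, assuming the standard sysfs layout:

# total and free 1G hugepages on each host NUMA node
cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages
cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/free_hugepages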

ISSUE:

I am trying to launch an instance that takes all of its memory (100 GB in my example) and all of its CPUs (8) from NUMA node 1 (the odd NUMA node) only. However, with the above flavor properties, the instance automatically takes half of its memory from NUMA 0 and half from NUMA 1, and the same happens for the CPUs.

I have also tried hw:numa_nodes='1', but the result is the same.

If I assign 1 GB and 1 CPU from NUMA 0 and the rest from NUMA 1, as below, it works fine.

| properties | hw:cpu_policy='dedicated', hw:mem_page_size='1GB', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2,3,4,5,6,7,8,9', hw:numa_mem.0='1024', hw:numa_mem.1='101376', hw:numa_nodes='2' |
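
For what it's worth, the actual host placement an instance ends up with can be verified on the compute node with libvirt; a minimal sketch (the domain name instance-00000085 is just an example):

# per-vCPU host CPU affinity of the running domain
sudo virsh vcpupin instance-00000085
# host NUMA node set the guest memory is bound to
sudo virsh numatune instance-00000085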

Note:
The system has sufficient hugepages and CPUs on both NUMA nodes, and no other instance is running on the system.
There is no special configuration in nova-compute on the compute node or in nova-scheduler on the controller.

Can you please suggest a resolution to this problem?

Thanks
Subhajit

Revision history for this message
Jie Song (songjie-cmss) wrote :

Hey, if the flavor property hw:numa_cpus.1='0,1,2,3,4,5,6,7,8,9' is changed to hw:numa_cpus.1='0,2,4,6,8,10,12,14,16,18', does that satisfy your requirement?

Revision history for this message
Subhajit Chatterjee (subhajitcdot) wrote :

No, it always takes half, or at least a few, of the resources from NUMA 0. If I assign only 1 CPU to NUMA 0 and all the rest to NUMA 1, then it works fine; the behaviour is the same for memory, i.e. I need to assign some memory to NUMA 0 and the rest to NUMA 1 to make it work. But the requirement is to get everything from the odd NUMA node, which I am not able to achieve.

Thanks

Revision history for this message
Jie Song (songjie-cmss) wrote :

Sorry, I made a mistake.

Can you provide the XML of the instance (sudo virsh dumpxml instance-xxx) and the nova-compute log for the launch operation (including the generated XML, i.e. the "End _get_guest_xml xml=..." line)? Thank you.
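
A minimal sketch of how these could be collected on the compute node; the domain name is an example and the log path assumes the default Ubuntu packaging layout:

# find the libvirt domain name of the instance, then dump its XML
sudo virsh list --all
sudo virsh dumpxml instance-00000085 > instance-00000085.xml
# pull the generated guest XML out of the nova-compute log
sudo grep "End _get_guest_xml" /var/log/nova/nova-compute.log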

Revision history for this message
Jie Song (songjie-cmss) wrote (last edit):

Hi, Subhajit. I read the explanation of "Customizing instance NUMA placement policies" in the OpenStack documentation.

> For `hw:numa_cpus.{num}`, it says the {num} parameter is an index of guest NUMA nodes and may not correspond to host NUMA nodes. For example, on a platform with two NUMA nodes, the scheduler may opt to place guest NUMA node 0, as referenced in hw:numa_mem.0, on host NUMA node 1, and vice versa. Similarly, the CPUs bitmask specified in the value for hw:numa_cpus.{num} refers to guest vCPUs and may not correspond to host CPUs. As such, this feature cannot be used to constrain instances to specific host CPUs or NUMA nodes [1].

> For `hw:numa_nodes`, it is the number of virtual NUMA nodes to allocate to configure the guest with. Each virtual NUMA node will be mapped to a unique host NUMA node [2].

If you want to launch an instance that takes all of its memory and CPUs from a single NUMA node, you can try setting hw:numa_nodes=1 on the flavor (see the sketch after the references). However, guest NUMA node {num} may still not correspond to a specific host NUMA node.

[1] https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#customizing-instance-numa-placement-policies
[2] https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:numa_nodes
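
A minimal sketch of the suggested change, using $FLAVOR as a placeholder for the flavor from the report; the unset of the old per-node specs is an assumption about how the flavor would be cleaned up:

# drop the per-guest-node mappings and collapse to a single guest NUMA node
openstack flavor unset $FLAVOR --property hw:numa_cpus.1 --property hw:numa_mem.1
openstack flavor set $FLAVOR \
  --property hw:cpu_policy=dedicated \
  --property hw:mem_page_size=1GB \
  --property hw:numa_nodes=1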

Revision history for this message
Subhajit Chatterjee (subhajitcdot) wrote :

Thanks @Jie Song for the clarification. So, per the documentation, OpenStack's NUMA-aware CPU configuration applies only to the guest topology. However, my requirement seems practical, as it may be necessary to pin guest CPUs to a particular NUMA node of the host. Is it possible to do that in OpenStack, given that Nova has access to and awareness of the host resources? I know virsh vcpupin is possible at runtime, but changing the guest VM XML through the instance flavor may also be feasible.
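
For illustration, the runtime approach mentioned above would look roughly like this on the compute node; the domain name and CPU numbers are examples only, and changes made directly through libvirt are not tracked by Nova:

# pin guest vCPU 0 of the running domain to host CPU 1 (a NUMA-1 CPU on this host)
sudo virsh vcpupin instance-00000085 0 1 --live
# repeat for the remaining vCPUs, or check the current mapping with:
sudo virsh vcpupin instance-00000085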

Revision history for this message
Subhajit Chatterjee (subhajitcdot) wrote :

Another interesting observation: when I try to use dedicated CPUs, the instance takes CPUs from the shared CPU list only...

1. I have 4 shared and 44 dedicated CPUs on the compute node (see the config sketch after this list).

root@computedp:/home/cdot# cat /etc/nova/nova.conf
cpu_shared_set = 0-3
cpu_dedicated_set = 4-47

2. I am trying to assign CPUs from the dedicated list to my instance, using the flavor below:
root@controller# openstack flavor show testnew
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| description | None |
| disk | 10 |
| id | 37f816c6-56d7-4e16-8253-75de7d64690e |
| name | testnew |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated' |
| ram | 8192 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+--------------------------------------+

3. The nova-scheduler log shows:

2024-02-20 11:14:08.447 3247 INFO nova.virt.hardware [req-a5c01121-1224-4346-876d-d43c23b3c4b0 0adba34df2344b218b0ecd329a5464b8 f665d61d52ff4f2d963a4321b12a5842 - default default] Computed NUMA topology CPU pinning: usable pCPUs: [[2], [8], [14], [20], [26], [32], [38], [44], [4], [10], [16], [22], [28], [34], [40], [46], [0], [6], [12], [18], [24], [30], [36], [42]], vCPUs mapping: [(0, 2), (1, 8), (2, 14), (3, 20)]

4. The instance is launched but its vCPUs are pinned to the shared CPU list only:
root@computedp:# virsh vcpupin 37
 VCPU CPU Affinity
----------------------
 0 0,2
 1 0,2
 2 0,2
 3 0,2

5. The instance log also doesn't contain any information about dedicated CPUs:
root@computedp:# cat /var/log/libvirt/qemu/instance-00000085.log
2024-02-20 05:44:10.539+0000: starting up libvirt version: 6.0.0, package: 0ubuntu8.16 (Marc Deslauriers <email address hidden> Wed, 20 Apr 2022 11:31:12 -0400), qemu version: 4.2.1Debian 1:4.2-3ubuntu6.28, kernel: 5.4.0-125-generic, hostname: computedp
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin \
HOME=/var/lib/libvirt/qemu/domain-37-instance-00000085 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-37-instance-00000085/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-37-instance-00000085/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-37-instance-00000085/.config \
QEMU_AUDIO_DRV=none \
/usr/bin/qemu-system-x86_64 \
-name guest=instance-00000085,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-37-ins...
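
Regarding the nova.conf excerpt in step 1: a minimal sketch of how cpu_shared_set / cpu_dedicated_set are grouped in the Nova documentation, namely under the [compute] section. The values are copied from the excerpt above; this is only a sketch of the documented layout, not a diagnosis:

# /etc/nova/nova.conf on the compute node
[compute]
# host CPUs usable by instances with unpinned (shared) vCPUs
cpu_shared_set = 0-3
# host CPUs usable by instances with pinned (dedicated) vCPUs
cpu_dedicated_set = 4-47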
