Root Cause
By default, nova fills up NUMA node 0 first, as long as it still has free pCPUs. This issue occurs when the requested pCPUs still fit into NUMA node 0, but the free hugepages on NUMA node 0 are not sufficient to hold the instance memory. Unfortunately, at the time of this writing, one cannot tell nova to spawn an instance on a specific NUMA node.
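The behaviour described above can be illustrated with a deliberately simplified toy model. This is not nova's actual placement code: the `Node` class and the `place` function are hypothetical names, and the model assumes that a node is chosen purely on free pCPUs, with hugepages checked only afterwards.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified view of one host NUMA node."""
    free_pcpus: int
    free_hugepages: int  # number of free 2 MB pages


def place(nodes, vcpus, pages):
    """Toy model: choose the first node with enough free pCPUs,
    then fail if that same node lacks free hugepages."""
    for index, node in enumerate(nodes):
        if node.free_pcpus >= vcpus:
            if node.free_hugepages < pages:
                return None  # boot fails; later nodes are not tried
            node.free_pcpus -= vcpus
            node.free_hugepages -= pages
            return index
    return None


# Values from the diagnostic steps: 4 pCPUs and 512 free pages per node,
# instances with 1 vCPU and 512 MB of memory (256 pages of 2 MB).
nodes = [Node(4, 512), Node(4, 512)]
results = [place(nodes, vcpus=1, pages=256) for _ in range(3)]
print(results)  # [0, 0, None]
```

In this sketch the first two instances land on node 0 and exhaust its hugepages. The third is rejected although node 1 is untouched.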
Diagnostic Steps
On a hypervisor with 2MB hugepages and 512 free hugepages per NUMA node:
[root@overcloud-compute-1 ~]# cat /sys/devices/system/node/node*/meminfo | grep -i huge
Node 0 AnonHugePages: 2048 kB
Node 0 HugePages_Total: 1024
Node 0 HugePages_Free: 512
Node 0 HugePages_Surp: 0
Node 1 AnonHugePages: 2048 kB
Node 1 HugePages_Total: 1024
Node 1 HugePages_Free: 512
Node 1 HugePages_Surp: 0
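When this check has to be repeated on several hypervisors, the per-node counters can be extracted programmatically. The following is a minimal sketch that parses the output shown above. The function name `free_hugepages` is hypothetical, and the 2 MB page size is an assumption taken from this hypervisor.

```python
import re

SAMPLE = """\
Node 0 AnonHugePages: 2048 kB
Node 0 HugePages_Total: 1024
Node 0 HugePages_Free: 512
Node 0 HugePages_Surp: 0
Node 1 AnonHugePages: 2048 kB
Node 1 HugePages_Total: 1024
Node 1 HugePages_Free: 512
Node 1 HugePages_Surp: 0
"""

# Matches only the HugePages_Free line of each node.
LINE = re.compile(r"^Node (\d+) HugePages_Free:\s+(\d+)", re.MULTILINE)


def free_hugepages(text):
    """Return {node_id: free hugepage count} from per-node meminfo text."""
    return {int(node): int(count) for node, count in LINE.findall(text)}


PAGE_SIZE_MB = 2  # assumption: 2 MB hugepages, as on this hypervisor
free_mb = {n: c * PAGE_SIZE_MB for n, c in free_hugepages(SAMPLE).items()}
print(free_mb)  # {0: 1024, 1: 1024}
```

On a real compute node, the same function could be fed the concatenated contents of the /sys/devices/system/node/node*/meminfo files instead of the sample text.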
And with the following NUMA architecture:
[root@overcloud-compute-1 nova]# lscpu | grep -i NUMA
NUMA node(s): 2
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s): 4-7
Spawn 3 instances with the following flavor (1 vCPU and 512 MB of memory):
[stack@undercloud-4 ~]$ nova flavor-show m1.tiny
+----------------------------+-------------------------------------------------------------+
| Property                   | Value                                                       |
+----------------------------+-------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                       |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                           |
| disk                       | 8                                                           |
| extra_specs                | {"hw:cpu_policy": "dedicated", "hw:mem_page_size": "large"} |
| id                         | 49debbdb-c12e-4435-97ef-f575990b352f                        |
| name                       | m1.tiny                                                     |
| os-flavor-access:is_public | True                                                        |
| ram                        | 512                                                         |
| rxtx_factor                | 1.0                                                         |
| swap                       |                                                             |
| vcpus                      | 1                                                           |
+----------------------------+-------------------------------------------------------------+
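As a quick sanity check on the numbers, the hugepage demand of this flavor can be worked out as follows. This is a small sketch, and `pages_needed` is a hypothetical helper name. It assumes that all guest memory is backed by 2 MB pages.

```python
def pages_needed(ram_mb, page_size_kb=2048):
    """Number of hugepages required to back ram_mb of guest memory."""
    ram_kb = ram_mb * 1024
    return -(-ram_kb // page_size_kb)  # ceiling division


per_instance = pages_needed(512)          # m1.tiny: ram = 512
instances_per_node = 512 // per_instance  # 512 free pages per NUMA node
print(per_instance, instances_per_node)   # 256 2
```

Each instance therefore consumes 256 pages, so one NUMA node with 512 free pages can hold the memory of at most two such instances.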
The new instance will boot and will use memory from NUMA 1:
[stack@undercloud-4 ~]$ nova list | grep d98772d1-119e-48fa-b1d9-8a68411cba0b
| d98772d1-119e-48fa-b1d9-8a68411cba0b | cirros-test0 | ACTIVE | - | Running | provider1=2000:10::f816:3eff:fe8d:a6ef, 10.0.0.102 |
[root@overcloud-compute-1 nova]# cat /sys/devices/system/node/node*/meminfo | grep -i huge
Node 0 AnonHugePages: 2048 kB
Node 0 HugePages_Total: 1024
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
Node 1 AnonHugePages: 2048 kB
Node 1 HugePages_Total: 1024
Node 1 HugePages_Free: 256
Node 1 HugePages_Surp: 0
nova boot --nic net-id=$NETID --image cirros --flavor m1.tiny --key-name id_rsa cirros-test0
The 3rd instance fails to boot:
[stack@undercloud-4 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                                           |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------+
| 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc | cirros-test0 | ERROR  | -          | NOSTATE     |                                                    |
| a44c43ca-49ad-43c5-b8a1-543ed8ab80ad | cirros-test0 | ACTIVE | -          | Running     | provider1=2000:10::f816:3eff:fe0f:565b, 10.0.0.105 |
| e21ba401-6161-45e6-8a04-6c45cef4aa3e | cirros-test0 | ACTIVE | -          | Running     | provider1=2000:10::f816:3eff:fe69:18bd, 10.0.0.111 |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------+
From the compute node, we can see that free hugepages on NUMA Node 0 are exhausted, whereas in theory there's still enough space on NUMA node 1:
[root@overcloud-compute-1 qemu]# cat /sys/devices/system/node/node*/meminfo | grep -i huge
Node 0 AnonHugePages: 2048 kB
Node 0 HugePages_Total: 1024
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
Node 1 AnonHugePages: 2048 kB
Node 1 HugePages_Total: 1024
Node 1 HugePages_Free: 512
Node 1 HugePages_Surp: 0
/var/log/nova/nova-compute.log reveals that the instance CPU shall be pinned to NUMA node 0:
<name>instance-00000006</name>
<uuid>1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc</uuid>
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
    <nova:package version="14.0.8-5.el7ost"/>
    <nova:name>cirros-test0</nova:name>
    <nova:creationTime>2017-11-23 19:53:00</nova:creationTime>
    <nova:flavor name="m1.tiny">
      <nova:memory>512</nova:memory>
      <nova:disk>8</nova:disk>
      <nova:swap>0</nova:swap>
      <nova:ephemeral>0</nova:ephemeral>
      <nova:vcpus>1</nova:vcpus>
    </nova:flavor>
    <nova:owner>
      <nova:user uuid="5d1785ee87294a6fad5e2bdddd91cc20">admin</nova:user>
      <nova:project uuid="8c307c08d2234b339c504bfdd896c13e">admin</nova:project>
    </nova:owner>
    <nova:root type="image" uuid="6350211f-5a11-4e02-a21a-cb1c0d543214"/>
  </nova:instance>
</metadata>
<memory unit='KiB'>524288</memory>
<currentMemory unit='KiB'>524288</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
<vcpu placement='static'>1</vcpu>
<cputune>
  <shares>1024</shares>
  <vcpupin vcpu='0' cpuset='2'/>
  <emulatorpin cpuset='2'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>
In the above, also look at the nodeset='0' in the numatune section, which indicates that memory shall be claimed from NUMA 0.
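The same check can be scripted. The following is a minimal sketch that uses a trimmed excerpt of the domain XML above, wrapped in a `<domain>` root element so that it parses on its own. The `HOST_NODES` mapping is taken from the lscpu output shown earlier, and `node_of_cpu` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the domain XML, wrapped in a <domain> root element.
DOMAIN_XML = """
<domain>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <emulatorpin cpuset='2'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
</domain>
"""

# Host topology from the lscpu output: node0 = CPUs 0-3, node1 = CPUs 4-7.
HOST_NODES = {0: range(0, 4), 1: range(4, 8)}


def node_of_cpu(cpu):
    """Return the host NUMA node that owns the given pCPU."""
    for node, cpus in HOST_NODES.items():
        if cpu in cpus:
            return node
    raise ValueError(f"pCPU {cpu} not found in host topology")


root = ET.fromstring(DOMAIN_XML)
# Assumes each cpuset is a single CPU number, as in this excerpt.
pinned_nodes = {node_of_cpu(int(p.get("cpuset"))) for p in root.iter("vcpupin")}
memory_nodeset = root.find("numatune/memory").get("nodeset")
print(pinned_nodes, memory_nodeset)  # {0} 0
```

Both the vCPU pin (pCPU 2, which belongs to node 0) and the strict memory nodeset point at NUMA node 0, the node whose hugepages are already exhausted.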