numa_fit_instance_to_host() algorithm is highly inefficient with a large number of NUMA nodes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Balazs Gibizer |
Bug Description
Description
===========
Nova scheduler, when numa_fit_
This makes scheduling a 48-core flavor extremely slow.
Output of reproducer:
```
InstanceNUMATop
_______
Executed in 269.13 secs fish external
usr time 268.60 secs 0.00 micros 268.60 secs
sys time 0.07 secs 595.00 micros 0.07 secs
```
Steps to reproduce
==================
1. Add a host with 16 NUMA nodes (3 cores × 2 threads each) to the OpenStack deployment.
2. Create a 48-vCPU flavor that takes exactly half of the host:
openstack flavor create sh4a-c48r488e20 \
--ram $((488*1024)) \
--vcpus 48 \
--ephemeral 20 \
--disk 20 \
--swap 0 \
--property 'hw:mem_
--property 'hw:cpu_
--property 'hw:cpu_
--property 'hw:cpu_
--property 'hw:cpu_sockets=8' \
--property 'hw:numa_
--property 'hw:numa_nodes=8' \
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:numa_
--property 'hw:cpu_threads=2' \
--property 'hw:cpu_
3. Create an instance with this flavor so that it lands on that host. The command is omitted because it differs between installations.
4. Wait for the first instance to spawn (this part is fast, as it takes the first 8 NUMA nodes).
5. Create a second instance with the same flavor.
…
Wait 5+ minutes until nova-scheduler is done with its work.
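The scenario in the steps above can be imitated with a small toy model. This is only a sketch, not the Nova code: it assumes the fit routine tries ordered selections of host cells one by one until every guest cell fits (an assumption, since the description above is truncated). `first_fit` and the variable names are hypothetical, and the host is scaled down to 8 cells with a 4-cell guest so that it runs instantly.

```python
import itertools

# Per-cell capacity taken from the steps above: 3 cores x 2 threads.
THREADS_PER_CELL = 3 * 2


def first_fit(host_free, guest_need):
    """Try ordered selections of host cells until every guest cell fits.

    Hypothetical helper for illustration only.
    Returns (chosen host cell indices or None, number of selections tried).
    """
    attempts = 0
    candidates = itertools.permutations(range(len(host_free)), len(guest_need))
    for perm in candidates:
        attempts += 1
        if all(host_free[h] >= need for h, need in zip(perm, guest_need)):
            return perm, attempts
    return None, attempts


# Scaled-down guest: 4 NUMA cells, each needing a full host cell.
guest = [THREADS_PER_CELL] * 4

# First instance: every host cell is free, so the first selection fits.
empty_host = [THREADS_PER_CELL] * 8
first_choice, first_attempts = first_fit(empty_host, guest)

# Second instance: the first half of the host is already occupied, so every
# selection that touches an occupied cell is tried and rejected first.
half_full_host = [0] * 4 + [THREADS_PER_CELL] * 4
second_choice, second_attempts = first_fit(half_full_host, guest)

print("first instance :", first_choice, "after", first_attempts, "attempt(s)")
print("second instance:", second_choice, "after", second_attempts, "attempt(s)")
```

Even at this reduced size, the second placement needs several hundred attempts while the first needs one, which matches the fast first spawn and slow second spawn described in steps 4 and 5.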
Expected result
===============
NUMA nodes selected within 10-15 seconds.
Actual result
=============
The algorithm is so slow that it takes about 5 minutes to schedule the instance.
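A rough count shows why minutes rather than seconds are plausible at this size. The sketch below assumes the search walks ordered selections of host cells in lexicographic order with the occupied cells listed first (an assumption, not something confirmed by this report). The sizes come from the reproduction steps; `attempts_until_first_fit` is a hypothetical helper.

```python
import math

HOST_CELLS = 16   # NUMA nodes on the host (step 1)
GUEST_CELLS = 8   # hw:numa_nodes=8 in the flavor (step 2)
OCCUPIED = 8      # cells already taken by the first instance (step 4)

# Number of ordered ways to pick 8 of the 16 host cells.
total_orderings = math.perm(HOST_CELLS, GUEST_CELLS)


def attempts_until_first_fit(host_cells, guest_cells, occupied):
    """Count selections tried before the first one using only free cells.

    Assumes lexicographic order with the occupied cells first. At each
    position k, any of the `occupied` cells may appear where a free cell
    is required, and each such prefix is followed by every possible tail.
    """
    rejected = sum(
        occupied * math.perm(host_cells - k - 1, guest_cells - k - 1)
        for k in range(guest_cells)
    )
    return rejected + 1  # plus the selection that finally fits


needed = attempts_until_first_fit(HOST_CELLS, GUEST_CELLS, OCCUPIED)
print(f"ordered selections in total: {total_orderings:,}")
print(f"tried before the first fit : {needed:,}")
```

Under these assumptions the second instance has to reject hundreds of millions of selections before it reaches one that uses only free cells.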
Environment
===========
1. OpenStack Nova 23.2.0-1.el8. NOTE: I can also reproduce this on the master branch with a 20-line reproducer.
commit 4939318649650b6
Merge: c6e0f4f551 4c339c10e3
Author: Zuul <email address hidden>
Date: Tue May 17 00:01:41 2022 +0000
Merge "Drop lower-constrain
2. Libvirt + KVM (although it is not relevant here)
libvirt-
qemu-kvm-
3. LVM storage (not relevant either)
lvm2-2.
4. Neutron with L2 (not relevant)
Logs & Configs
==============
Check the reproducer and try it with the DEBUG lines uncommented (I will attach it here too).
I will not attach the DEBUG log from the reproducer, as it contains gigabytes of identical lines.