Need NUMA aware RAM reservation to avoid OOM killing host processes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Invalid | Undecided | Unassigned |
Bug Description
Description:
===========
CPU pinning is widely used in VNFs. When VM CPUs are pinned, there is currently no way to reserve memory on NUMA node 0 for host processes:
> ram_allocation_
> reserved_
This leads to many VMs being scheduled on NUMA 0 (CPU-pinned to NUMA 0) while their memory needs are only satisfied "globally".
When the system starts to take load, the VMs' memory starts to get allocated on NUMA 0 (because they are pinned to NUMA 0), to the extent that a memory shortage occurs on NUMA 0 and the OOM killer kicks in and kills host processes.
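The accounting gap described above can be illustrated with a small, purely hypothetical model (this is not nova code; all names and numbers below are assumptions for illustration): a host-wide reservation check passes even though every pinned VM will fault its memory on node 0.

```python
# Hypothetical model (not nova code): why a host-wide RAM reservation
# fails to protect NUMA node 0 when all VMs are CPU-pinned to it.

NODE_MB = {0: 65536, 1: 65536}     # assumed per-NUMA-node RAM: 64 GiB each
RESERVED_HOST_MB = 8192            # reservation accounted host-wide only
HOST_PROCS_NODE0_MB = 8192         # host daemons actually live on node 0

def global_check(vms):
    """Host-wide check: total VM RAM vs. host RAM minus the reservation."""
    usable = sum(NODE_MB.values()) - RESERVED_HOST_MB
    return sum(mb for _node, mb in vms) <= usable

def node0_shortfall_mb(vms):
    """RAM demanded from node 0 (pinned VMs plus host processes) minus
    node 0 capacity; a positive value means node 0 is oversubscribed."""
    demand = HOST_PROCS_NODE0_MB + sum(mb for node, mb in vms if node == 0)
    return demand - NODE_MB[0]

# Four 16 GiB VMs, all CPU-pinned to node 0.
vms = [(0, 16384)] * 4

print(global_check(vms))       # True -> the host-wide check is satisfied
print(node0_shortfall_mb(vms)) # 8192 -> node 0 is 8 GiB short; OOM follows
```

Under load the guests touch their memory on the node they are pinned to, so the per-node shortfall, invisible to the global check, is what triggers the OOM killer.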
Many mitigations have been "invented", but they all come with some form of technical or operational difficulty. One mitigation, for example, is to enable huge pages and back the VMs with huge pages.
The right solution is for nova to support NUMA-aware RAM reservation, as it already does for huge pages, i.e.
reserved_
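For comparison, huge-page reservations can already be expressed per NUMA node in nova.conf through the `reserved_huge_pages` option; a sketch with illustrative values:

```ini
[DEFAULT]
# Reserve 64 x 2 MiB huge pages on NUMA node 0 for host processes
# (node, page size and count here are example values).
reserved_huge_pages = node:0,size:2048,count:64
```

The request in this bug is an analogous per-node knob for normal (small-page) RAM.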
Steps to reproduce
==================
Create CPU-pinned VMs. The VMs crowd onto NUMA 0 until no more CPU cores are available there, after which they are scheduled on NUMA 1. Then stress the system.
Expected result
===============
The system stays operational.
Actual result
=============
The OOM killer kills host processes due to lack of memory on NUMA 0, while there is plenty of free memory on NUMA 1.
Asking stephenfin and sean-k-mooney in IRC about this, they said it is a long-standing known issue that is hard to fix, and agreed that the workaround is to set hw:mem_page_size=small in the flavors that use CPU pinning. There might be duplicate bugs for this. Either way, we should document the known limitation alongside the hw:cpu_policy flavor extra spec here:
https://docs.openstack.org/nova/latest/user/flavors.html
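The workaround can be applied with the standard flavor extra spec command (the flavor name below is illustrative):

```shell
# hw:cpu_policy=dedicated enables CPU pinning; hw:mem_page_size=small
# makes the guest's memory allocation NUMA-aware, confining each VM's
# RAM to the node(s) its CPUs are pinned to.
openstack flavor set pinned.large \
  --property hw:cpu_policy=dedicated \
  --property hw:mem_page_size=small
```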