Small-pages memory is not taken into account when not explicitly requested

Bug #1439247 reported by Sahid Orentino
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Guests using small pages (the default) on a compute node are not taken into account when calculating the available small-pages memory [1]. As a consequence, when booting an instance that explicitly requests small pages, the computation of available resources is corrupted.

Two solutions are possible to fix this issue.

1/
Associate a NUMA topology with every guest and set the default page_size to MEMPAGES_SMALL when nothing has been requested by the user. ** This also implies that when using libvirt the default virt-type should be KVM **

A couple of small changes are needed in hardware.py:
- make the method 'numa_get_constraints' return a NUMATopology in all cases.
- make the method '_numa_get_pagesize_constraints' return MEMPAGES_SMALL instead of None when nothing is requested.
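The second change above could look roughly like the following. This is a standalone, hypothetical sketch rather than nova's actual hardware.py code; the function name, the lookup keys, and the sentinel value mirror nova's conventions (nova defines MEMPAGES_SMALL as a negative sentinel) but are reproduced here from memory:

```python
# Hypothetical sketch of defaulting the page-size constraint to small
# pages. Not nova's real implementation; names are illustrative.

MEMPAGES_SMALL = -1  # sentinel meaning "small (4K) pages"

def numa_get_pagesize_constraints(flavor_extra_specs, image_props):
    """Return the requested page size, defaulting to MEMPAGES_SMALL.

    Today nova returns None when neither the flavor nor the image
    requests a page size; the proposal is to return MEMPAGES_SMALL
    instead, so every guest is accounted against the small-page pool.
    """
    requested = (flavor_extra_specs.get('hw:mem_page_size')
                 or image_props.get('hw_mem_page_size'))
    if requested is None or requested == 'small':
        return MEMPAGES_SMALL
    return int(requested)
```

With this default, an instance booted with no `hw:mem_page_size` at all would be tracked exactly like one that explicitly requested small pages, which is the point of option 1.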

2/
Disallow requesting a small memory page size, i.e. remove all of the code that handles that case, since the information reported to the host is not correctly updated, and let the default behavior handle it.

[1] http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/hardware.py#n1087

Tags: numa
Sean Dague (sdague)
tags: added: numa
Changed in nova:
status: New → Confirmed
Revision history for this message
Nikola Đipanov (ndipanov) wrote :

So some comments regarding 1/

* I am not sure if we want to make NUMA topology (even if single cell) exposed to every instance without a way to turn it off. There may be guest OS concerns around that for the libvirt case.
* The NUMA information makes no sense at this point for other drivers, some of which may never implement it.

Based on that, it seems to me that a cleaner approach is to keep the NUMA qualities of instances optional.

2/ seems like a better approach to me. As for how to keep the smallest page size information in sync, two things come to mind: either don't report the smallest size at all, or make sure that the resource tracker considers and tracks it even for non-NUMA instances. I slightly prefer the first option.
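The first of those two options, dropping the smallest page size from what the host reports, could be sketched as below. This is a hypothetical illustration, not nova's real inventory-reporting code; the function and its input format are invented for the example:

```python
# Hypothetical sketch of "don't report the smallest size at all":
# filter the host's page-size inventory so the 4K pool, which the
# tracker cannot account for accurately, is never advertised.

def filter_reported_pagesizes(pagesizes_kb):
    """Drop the smallest page size from the host's reported sizes.

    pagesizes_kb: list of page sizes in KiB, e.g. [4, 2048, 1048576].
    Returns the list with the smallest size removed; an empty list if
    the host only supports one size.
    """
    if len(pagesizes_kb) <= 1:
        return []
    return sorted(pagesizes_kb)[1:]
```

A host supporting 4K, 2M, and 1G pages would then only report the 2M and 1G pools, so no instance could request the pool whose accounting is broken.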

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/172079

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/172079
Reason: This patch is very old and appears to not be active any more. I am therefore abandoning it to keep the nova review queue sane. Feel free to restore the change when you're actively working on it again.

Revision history for this message
Sujitha (sujitha-neti) wrote :

This has not been updated in a long time. Moving it to Unassigned.

Please assign it to yourself and set to in progress if you plan on working on it.

Changed in nova:
assignee: sahid (sahid-ferdjaoui) → nobody
aishwarya (bkaishwarya)
Changed in nova:
assignee: nobody → aishwarya (bkaishwarya)
Revision history for this message
aishwarya (bkaishwarya) wrote :

Can you please specify the reproduction steps for this bug? Thanks in advance.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by sahid (<email address hidden>) on branch: master
Review: https://review.openstack.org/172079

Sean Dague (sdague)
Changed in nova:
assignee: aishwarya (bkaishwarya) → nobody
Revision history for this message
Chris Friesen (cbf123) wrote :

There is some useful discussion under bug 1792985 which has been marked as a dupe of this bug.

Currently it's still not safe to schedule numa-topology and non-numa-topology instances on the same compute node because instances with no NUMA topology can "float" over the whole compute node, which means we have no way of knowing where the memory comes from and therefore can't possibly accurately track memory consumption per NUMA node.

I think it's a cop-out to say "don't schedule numa-topology and non-numa-topology instances on the same compute node". I mean, the way the code is written currently it's not safe, but I think we *should* try to make it safe.

Specifically for edge scenarios, we may only have a small number of compute nodes (sometimes just one or two) and so any host-aggregate-based solution doesn't really work. We need to be able to have these things co-exist on a single compute node.

Specifically for 4K memory, this means either disabling "strict" NUMA affinity, or else restricting floating instances to a single NUMA node.

Revision history for this message
Jing Zhang (jing.zhang.nokia) wrote :

Bug 1844721, Need NUMA aware RAM reservation to avoid OOM killing host processes, is marked as a duplicate of this bug.

For bug 1844721, there is no mixing of VMs using small pages and not-using small pages on the same compute, but all VMs have their CPUs pinned.

The root cause of bug 1844721 is that nova lacks the capability to reserve memory on NUMA node 0 for host processes.

Option 1 proposed in this bug report, plus adding a constraint of "when the VM has its CPUs pinned", is a sensible solution.
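That combined proposal, option 1's small-pages default applied only to CPU-pinned instances, could be sketched as follows. This is a hypothetical illustration; the function and the `'dedicated'` policy string follow nova's `hw:cpu_policy` convention but are not taken from any actual patch:

```python
# Hypothetical sketch: default to small pages only when the instance
# has its CPUs pinned (hw:cpu_policy=dedicated), leaving floating
# instances with today's behavior. Not nova's real code.

MEMPAGES_SMALL = -1  # sentinel meaning "small (4K) pages"

def default_page_size(requested_page_size, cpu_policy):
    """Apply option 1's default only for CPU-pinned instances."""
    if requested_page_size is not None:
        return requested_page_size
    if cpu_policy == 'dedicated':
        # Pinned instances are confined to known NUMA nodes, so their
        # small-page consumption can be tracked per node.
        return MEMPAGES_SMALL
    # Floating instance: keep the current (untracked) behavior.
    return None
```

This sidesteps the floating-instance problem Chris raises, since the new accounting only applies to instances whose NUMA placement is known.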

Changed in nova:
assignee: nobody → Jing Zhang (jing.zhang.nokia)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/686079

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
Chris Friesen (cbf123) wrote :

The proposal doesn't address the issue raised in bug 1792985, which deals with non-pinned instances consuming memory on unknown NUMA nodes.

Changed in nova:
assignee: Jing Zhang (jing.zhang.nokia) → nobody
status: In Progress → Confirmed
