VM creation failure due to Nova hugepage assumptions

Bug #1594529 reported by Paul Michali
This bug affects 3 people
Affects                    Status       Importance  Assigned to     Milestone
OpenStack Compute (nova)   In Progress  Medium      Sahid Orentino
Queens                     In Progress  Medium      Sahid Orentino

Bug Description

Description:

In Liberty and Mitaka, Nova assumes that it has exclusive access to the huge pages on the compute node. It keeps track of the total pages per NUMA node on the compute node, and of the number of pages used (by Nova VMs) on each NUMA node. This is done for each of the three supported huge page sizes.

However, if other third-party processes consume huge pages, there will be a discrepancy between the pages actually available and what Nova thinks is available. As a result, it is possible (depending on the number of pages and the VM size) for Nova to think it has enough pages when it does not. The create will then fail, for example with QEMU reporting insufficient memory available.
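The discrepancy can be checked on a compute node by comparing Nova's bookkeeping with what the kernel reports. The following is a minimal sketch, assuming a Linux host that exposes per-node counters under /sys/devices/system/node; the function names and the way Nova's view is passed in (a plain dict) are hypothetical and not part of Nova.

```python
from pathlib import Path


def free_hugepages_per_node(page_kb=2048, sysfs_root="/sys/devices/system/node"):
    """Return {node_id: free_pages} as reported by the kernel for one page size."""
    free = {}
    for node_dir in sorted(Path(sysfs_root).glob("node[0-9]*")):
        counter = node_dir / "hugepages" / f"hugepages-{page_kb}kB" / "free_hugepages"
        if counter.is_file():
            node_id = int(node_dir.name[len("node"):])
            free[node_id] = int(counter.read_text().strip())
    return free


def overcommitted_nodes(nova_free, host_free):
    """Return {node_id: (nova_view, host_view)} where Nova claims more free
    pages than the host really has. nova_free is a hypothetical dict of Nova's
    per-node free page counts (total minus used, as Nova tracks them)."""
    return {
        node: (pages, host_free.get(node, 0))
        for node, pages in nova_free.items()
        if pages > host_free.get(node, 0)
    }
```

Any node returned by overcommitted_nodes is one where a VM that Nova considers schedulable could still fail at QEMU start-up.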

Steps to reproduce:

1. Use a compute node with 32768 2MB huge pages available, giving 16384 per NUMA node with two nodes.
2. Run a third-party process that consumes 256 pages per NUMA node.
3. Create 15 small flavor (2GB = 1024 pages) VMs.
4. Create another small flavor VM.

Expected Result:

The 16th VM is created without an error, using huge pages on the second NUMA node (and more VMs can be created after it as well).

Actual Result:

After step 3, Nova thinks there are 1024 pages available on NUMA node 0, but the compute host shows only 768 pages available. Because the scheduler thinks there is space for one more VM, the host passes the filter. The creation commences, as Nova thinks there is enough space on NUMA node 0. QEMU then fails, indicating that there is not enough memory.

In addition, there are 16128 pages available on NUMA node 1, but Nova will not attempt using them, as it thinks there is still memory available on NUMA node 0.
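The numbers above follow directly from the reproduction steps. A short sketch of the arithmetic, using only the figures given in this report (the constant and function names are made up for illustration):

```python
PAGES_PER_NODE = 16384      # 32768 2MB pages split across two NUMA nodes
THIRD_PARTY_PAGES = 256     # consumed per node outside of Nova's control
PAGES_PER_VM = 2 * 1024 * 1024 // 2048   # 2GB flavor in KB / 2048KB page = 1024


def node_views(vms_on_node):
    """Return (pages Nova believes are free, pages actually free on the host)."""
    nova_used = vms_on_node * PAGES_PER_VM
    nova_free = PAGES_PER_NODE - nova_used
    host_free = PAGES_PER_NODE - THIRD_PARTY_PAGES - nova_used
    return nova_free, host_free


node0 = node_views(15)   # (1024, 768)
node1 = node_views(0)    # (16384, 16128)

# Nova's check passes for NUMA node 0, but the host cannot satisfy it.
nova_says_fits = node0[0] >= PAGES_PER_VM    # True
host_can_fit = node0[1] >= PAGES_PER_VM      # False
print(node0, node1, nova_says_fits, host_can_fit)
```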

In my case, I had multiple compute hosts and ended up with a "No hosts available" error, as the create failed on each host when trying NUMA node 0. If, at step 4, one creates a medium flavor VM instead, it will succeed, as Nova will not see enough pages on NUMA node 0 and will try NUMA node 1, which has ample space.

Commentary: Nova checks total huge pages, but not available huge pages.

Note: A feature was added to master (for Newton), under bug 1543149, that provides a config-based mechanism to reserve huge pages for third-party applications. However, the Nova team indicated that this change cannot be backported to Liberty.
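For releases that do have the reservation mechanism, the third-party usage from the reproduction steps would be declared per NUMA node in nova.conf. The sketch below only formats the option lines; it assumes the reserved_huge_pages option takes one node:N,size:KB,count:C value per line, which should be checked against the configuration reference for the release in use. The helper name is hypothetical.

```python
def reserved_huge_pages_lines(reserved_per_node, page_kb=2048):
    """Format one reserved_huge_pages line per NUMA node.

    reserved_per_node maps a NUMA node id to the number of pages consumed
    by processes outside of Nova. The option syntax is an assumption.
    """
    return [
        f"reserved_huge_pages = node:{node},size:{page_kb},count:{count}"
        for node, count in sorted(reserved_per_node.items())
    ]


# 256 pages of 2MB consumed by a third-party process on each of two nodes.
for line in reserved_huge_pages_lines({0: 256, 1: 256}):
    print(line)
```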

Environment:

Liberty release (12.0.3), with LB, neutron networking, libvirt 1.2.17, API QEMU 1.2.17, QEMU 2.3.0.

Config:

nova flavor-key m1.small set hw:numa_nodes=1
nova flavor-key m1.small set hw:mem_page_size=2048

network, subnet, and standard VM create commands.

Paul Michali (pcm) wrote :

I was also wondering whether the solution targeted at Newton should reduce the total number of pages passed in when creating a NUMAPagesTopology object in the libvirt driver, rather than alter the object's signature by adding a reserved parameter. With the former approach, the versioned object would not need an API change, which may make it a more backward-compatible solution.
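The suggestion in this comment can be sketched as follows. This is not Nova code: PagesTopology is a hypothetical stand-in for NUMAPagesTopology with assumed field names, and build_topology is an invented helper showing where the subtraction would happen.

```python
from dataclasses import dataclass


@dataclass
class PagesTopology:
    """Hypothetical stand-in for NUMAPagesTopology; field names are assumptions."""
    size_kb: int
    total: int
    used: int = 0

    @property
    def free(self):
        return self.total - self.used


def build_topology(size_kb, host_total, reserved):
    """Subtract reserved pages before constructing the object, so the
    object itself needs no extra 'reserved' field."""
    return PagesTopology(size_kb=size_kb, total=max(host_total - reserved, 0))


cell0 = build_topology(2048, host_total=16384, reserved=256)
cell0.used = 15 * 1024
print(cell0.total, cell0.free)
```

With the reservation folded into the total, the free count after 15 small VMs matches what the host reports, and a 16th VM would no longer be placed on that node.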

Matt Riedemann (mriedem)
tags: added: hugepages numa
liuxiuli (liu-lixiu) wrote :

I think this has already been done in the Mitaka version. It uses reserved_huge_pages for system processes. Please see the function numa_get_reserved_huge_pages and the related code in master.

Stephen Finucane (stephenfinucane) wrote :

I'm not sure what the expected resolution is here. We've added support for reserving some amount of hugepages but, as you say, this is classified as a feature rather than a bugfix. I don't really think there is anything that can be done in older versions of nova to handle this, other than an out-of-tree patch...

Changed in nova:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Changed in nova:
assignee: nobody → sahid (sahid-ferdjaoui)
status: Expired → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/580657
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b1c45fedc3b80769f21ffe5ff748593a4b017a7a
Submitter: Zuul
Branch: master

commit b1c45fedc3b80769f21ffe5ff748593a4b017a7a
Author: Sahid Orentino Ferdjaoui <email address hidden>
Date: Fri Jul 6 08:43:31 2018 -0400

    hardware: fix hugepages memory usage per intances

    In the algo to compute memory, when several instances are their memory
    backed on a same host NUMA node, we always compute the memory used
    based on the initial 'used' value of hostcell.

    Partial-Bug: #1594529
    Change-Id: I5b8684b10688b91ca36a509ea0cb8bee397263d6
    Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
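The class of error described in the commit message above can be illustrated with a small sketch. It is not the Nova code that was changed; the function names and the flat list of per-instance page counts are assumptions made for the example.

```python
def used_pages_buggy(initial_used, instance_pages):
    """Each iteration restarts from the initial 'used' value, so only the
    last instance is counted."""
    used = initial_used
    for pages in instance_pages:
        used = initial_used + pages
    return used


def used_pages_fixed(initial_used, instance_pages):
    """Usage accumulates across all instances placed on the same host cell."""
    used = initial_used
    for pages in instance_pages:
        used += pages
    return used


instances = [1024, 1024, 1024]   # three small-flavor VMs on one NUMA node
print(used_pages_buggy(0, instances))   # 1024
print(used_pages_fixed(0, instances))   # 3072
```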

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/581736

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/581736
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d7864fbb9c2c558c409559e1d5989f84c7403832
Submitter: Zuul
Branch: stable/queens

commit d7864fbb9c2c558c409559e1d5989f84c7403832
Author: Sahid Orentino Ferdjaoui <email address hidden>
Date: Fri Jul 6 08:43:31 2018 -0400

    hardware: fix hugepages memory usage per intances

    In the algo to compute memory, when several instances are their memory
    backed on a same host NUMA node, we always compute the memory used
    based on the initial 'used' value of hostcell.

    Partial-Bug: #1594529
    Change-Id: I5b8684b10688b91ca36a509ea0cb8bee397263d6
    Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
    (cherry picked from commit b1c45fedc3b80769f21ffe5ff748593a4b017a7a)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by sahid (<email address hidden>) on branch: master
Review: https://review.openstack.org/581365
