Oversubscription broken for instances with NUMA topologies

Bug #1810977 reported by Stephen Finucane
This bug affects 1 person
Affects                   Status         Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released   Medium      Stephen Finucane
Rocky                     Fix Committed  Medium      Stephen Finucane

Bug Description

As described in [1], the fix to [2] appears to have inadvertently broken oversubscription of memory for instances with a NUMA topology but no hugepages.

Steps to reproduce:

1. Create a flavor that will consume more than 50% of the available memory on your host(s) and specify an explicit NUMA topology. For example, on my all-in-one deployment where the host has 32GB RAM, we will request a 20GB instance:

   $ openstack flavor create --vcpus 2 --disk 0 --ram 20480 test.numa
   $ openstack flavor set test.numa --property hw:numa_nodes=2

2. Boot an instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test

3. Boot another instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

# Expected result:

The second instance should boot.

# Actual result:

The second instance fails to boot. We see the following error message in the logs.

  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}
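
The figures in the log line above can be reproduced with a little arithmetic. The following is a minimal sketch, not nova code: the variable names are made up for illustration, and it assumes the flavor's memory is split evenly across the two guest NUMA nodes and that the first instance is the only consumer of memory on the host cell.

```python
# Figures taken from the flavor and the scheduler log above (MB as logged).
FLAVOR_RAM_MB = 20480   # --ram 20480
GUEST_NUMA_NODES = 2    # hw:numa_nodes=2
CELL_TOTAL_MB = 15916   # "total" reported for one host NUMA cell

# Assumption: memory is divided evenly between the guest NUMA nodes.
required_per_cell = FLAVOR_RAM_MB // GUEST_NUMA_NODES

# Assumption: the first instance is the only thing using this host cell.
available_after_first = CELL_TOTAL_MB - required_per_cell

print(required_per_cell)      # matches "Required" in the log
print(available_after_first)  # matches "available" in the log

# Comparing against unconsumed memory rejects the second instance,
# while comparing against total memory would accept it.
fits_by_available = required_per_cell <= available_after_first
fits_by_total = required_per_cell <= CELL_TOTAL_MB
print(fits_by_available, fits_by_total)
```

With these inputs the per-cell request is 10240 and the remaining memory is 5676, so a check against unconsumed memory can never admit a second instance of this flavor, regardless of the configured allocation ratio.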

If we revert the patch that addressed that bug [3], the correct behaviour returns and the instance boots. However, reverting obviously loses whatever benefits that change gave us.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168

Revision history for this message
sean mooney (sean-k-mooney) wrote :

Triaged as medium: while this will affect all deployments with ram_allocation_ratio > 1.0
that use NUMA-affined guests without hugepages, the proportion of clouds that it affects is
expected to be low.

For those that are affected, there is no workaround beyond disabling all NUMA-related features if they want to achieve
memory oversubscription.

Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
tags: added: libvirt numa scheduler

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/629281

Changed in nova:
assignee: nobody → Stephen Finucane (stephenfinucane)
status: Confirmed → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/629281
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b24ad3780bc872d1a17907909cd6bcbea7e804b3
Submitter: Zuul
Branch: master

commit b24ad3780bc872d1a17907909cd6bcbea7e804b3
Author: Stephen Finucane <email address hidden>
Date: Tue Jan 8 17:01:41 2019 +0000

    Fix overcommit for NUMA-based instances

    Change I5f5c621f2f0fa1bc18ee9a97d17085107a5dee53 modified how we
    evaluated available memory for instances with a NUMA topology.
    Previously, we used a non-pagesize aware check unless the user had
    explicitly requested a specific pagesize. This means that for instances
    without pagesize requests, nova considers hugepages as available memory
    when deciding if a host has enough available memory for the instance.

    The aforementioned change modified this so that all NUMA-based
    instances, whether they had hugepages or not, would use the
    pagesize-aware check. Unfortunately the functionality it was reusing to
    do this was functionality previously only used for hugepages. Hugepages
    cannot be oversubscribed so we did not take oversubscription into
    account, comparing against available memory on the host (i.e. memory not
    consumed by other instances) rather than total memory. This is OK when
    using hugepages but not small pages, where overcommit is OK.

    Given that overcommit is already handled elsewhere in the code, we
    simply modify the non-hugepage code path to check for available memory
    of the lowest pagesize vs. total memory.

    Change-Id: I890b2c81cd49c1c601e9baee6a249709d0f6810e
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1810977
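
The behaviour described in the commit message can be sketched as follows. This is an illustration only, not the code from nova/virt/hardware.py: `HostCell`, `fits_pagesize`, and `within_overcommit` are hypothetical names, and the overcommit limit is assumed to be total memory multiplied by `ram_allocation_ratio`.

```python
from dataclasses import dataclass


@dataclass
class HostCell:
    """Hypothetical stand-in for one host NUMA cell; memory in MB."""
    total_mb: int
    used_mb: int

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_mb


def fits_pagesize(cell: HostCell, required_mb: int, hugepages: bool) -> bool:
    """Pagesize-aware fit check, following the commit message."""
    if hugepages:
        # Hugepages cannot be oversubscribed: compare against free memory.
        return required_mb <= cell.free_mb
    # Small pages: compare against total memory and leave overcommit
    # limits to a separate check.
    return required_mb <= cell.total_mb


def within_overcommit(cell: HostCell, required_mb: int, ratio: float) -> bool:
    """Separate overcommit limit (assumed to be total * ratio)."""
    return cell.used_mb + required_mb <= cell.total_mb * ratio


# Host cell from the bug description after the first instance has booted.
cell = HostCell(total_mb=15916, used_mb=10240)
print(fits_pagesize(cell, 10240, hugepages=False))  # small pages
print(fits_pagesize(cell, 10240, hugepages=True))   # hugepages
print(within_overcommit(cell, 10240, ratio=1.5))
print(within_overcommit(cell, 10240, ratio=1.0))
```

Keeping the two checks separate mirrors the commit message's point that overcommit is handled elsewhere: the pagesize check only has to rule out requests larger than the cell itself, while the allocation ratio decides how far small pages may be oversubscribed.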

Changed in nova:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/633197

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/633197
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=780ccfcbdea919b196c18372d1c66bc88b4fa48c
Submitter: Zuul
Branch: stable/rocky

commit 780ccfcbdea919b196c18372d1c66bc88b4fa48c
Author: Stephen Finucane <email address hidden>
Date: Tue Jan 8 17:01:41 2019 +0000

    Fix overcommit for NUMA-based instances

    Change I5f5c621f2f0fa1bc18ee9a97d17085107a5dee53 modified how we
    evaluated available memory for instances with a NUMA topology.
    Previously, we used a non-pagesize aware check unless the user had
    explicitly requested a specific pagesize. This means that for instances
    without pagesize requests, nova considers hugepages as available memory
    when deciding if a host has enough available memory for the instance.

    The aforementioned change modified this so that all NUMA-based
    instances, whether they had hugepages or not, would use the
    pagesize-aware check. Unfortunately the functionality it was reusing to
    do this was functionality previously only used for hugepages. Hugepages
    cannot be oversubscribed so we did not take oversubscription into
    account, comparing against available memory on the host (i.e. memory not
    consumed by other instances) rather than total memory. This is OK when
    using hugepages but not small pages, where overcommit is OK.

    Given that overcommit is already handled elsewhere in the code, we
    simply modify the non-hugepage code path to check for available memory
    of the lowest pagesize vs. total memory.

    Change-Id: I890b2c81cd49c1c601e9baee6a249709d0f6810e
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1810977
    (cherry picked from commit fd19aeafbce0fa11821b2a064bd694b078613c2f)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.2.0

This issue was fixed in the openstack/nova 18.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/726868

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/queens)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/queens
Review: https://review.opendev.org/c/openstack/nova/+/726868
Reason: This branch transitioned to End of Life for this project, open patches needs to be closed to be able to delete the branch.
