NUMA Topology cell memory sent to xml in MiB, but qemu uses KiB

Bug #1373159 reported by Michael Turek
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Nikola Đipanov

Bug Description

Currently, when NUMA cell memory is specified via flavor extra_specs or image properties, the value is passed through in MiB. According to the libvirt domain XML format documentation (http://libvirt.org/formatdomain.html), cell memory should be specified in KiB.

In this example, we use the following extra_specs:
"hw:numa_policy": "strict", "hw:numa_mem.1": "2048", "hw:numa_mem.0": "6144", "hw:numa_nodes": "2", "hw:numa_cpus.0": "0,1,2", "hw:numa_cpus.1": "3"

The flavor has 8192 MB of ram and 4 vcpus.
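
As a sanity check on the example above, here is a minimal sketch that pulls the per-cell memory out of the extra_specs and confirms it adds up to the flavor RAM. The helper name numa_cell_memory_mib is made up for illustration and is not part of nova; the values are treated as MiB, which is how they are specified.

```python
import re

def numa_cell_memory_mib(extra_specs):
    """Return {cell_id: memory_mib} parsed from hw:numa_mem.N keys.

    Hypothetical helper for illustration only, not nova code.
    """
    cells = {}
    for key, value in extra_specs.items():
        match = re.fullmatch(r"hw:numa_mem\.(\d+)", key)
        if match:
            cells[int(match.group(1))] = int(value)
    return cells

extra_specs = {
    "hw:numa_policy": "strict",
    "hw:numa_mem.1": "2048",
    "hw:numa_mem.0": "6144",
    "hw:numa_nodes": "2",
    "hw:numa_cpus.0": "0,1,2",
    "hw:numa_cpus.1": "3",
}
flavor_ram_mib = 8192

cells = numa_cell_memory_mib(extra_specs)
total_mib = sum(cells.values())
# The per-cell values are consistent with the flavor as long as both are in MiB.
print(cells, total_mib == flavor_ram_mib)
```

So the spec itself is consistent; the mismatch only appears once the numbers reach the XML without a unit conversion.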

When using qemu 2.1.0, the following will be seen in the n-cpu logs when booting a machine with NUMA specs.

"libvirtError: internal error: process exited while connecting to monitor: qemu-system-x86_64: total memory for NUMA nodes (8388608) should equal RAM size (200000000)"

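Please note that 200000000 is the RAM size in bytes, printed in hex: 0x200000000 bytes is 8388608 KiB, i.e. 8192 MiB (printing it in hex is simply an issue with the qemu error message). The NUMA total of 8388608, on the other hand, is a byte count equal to 8192 KiB. The error therefore shows that 8192 KiB is being requested rather than 8192 MiB. Because the total NUMA node memory does not equal the RAM size, the machine fails to boot.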
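
The arithmetic behind the two numbers in the error message can be reproduced with a few lines of Python. This is just a worked check of the units, assuming the flavor's 8192 is read once as MiB (RAM size) and once as KiB (NUMA total):

```python
KIB = 1024          # bytes per KiB
MIB = 1024 * KIB    # bytes per MiB

flavor_ram_mib = 8192

# RAM size as qemu sees it: 8192 MiB expressed in bytes.
ram_bytes = flavor_ram_mib * MIB

# NUMA total as qemu sees it: the same number, but interpreted as KiB.
numa_total_bytes = flavor_ram_mib * KIB

print(numa_total_bytes)           # decimal, as in the error message
print(format(ram_bytes, "x"))     # hex without the 0x prefix, as in the error message
print(ram_bytes // numa_total_bytes)  # the two differ by a factor of 1024
```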

When using versions of qemu lower than 2.1.0 the issue is not obvious: machines with NUMA specs boot, but only because of a bug in qemu that has since been resolved. Versions lower than 2.1.0 do not check that the RAM size equals the total NUMA node memory.

In short, we should be using KiB units for NUMA cell memory, or at least be converting from MiB to KiB before creating the xml. Otherwise, NUMA placement will not behave as intended.

To be fair, I haven't had the chance to look at the memory placement in a guest booted using qemu 2.0.0 or lower, though I suspect the memory placement would be incorrect. If anyone has the chance to look, it would be greatly appreciated.

I am currently investigating the appropriate fix for this alongside Tiago Mello. We made a quick fix in /nova/virt/libvirt/config.py on line 495:

                cell.set("memory", str(self.memory * 1024))

Multiplying by 1024 allowed the machine to boot properly, but it is probably a bit too quick and dirty. Just thought it would be worth mentioning.
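
For anyone who wants to see the effect of the conversion in isolation, here is a small standalone sketch. It uses the standard library's xml.etree.ElementTree rather than nova's own config classes, and the function name numa_cell_element is invented for this example. The attribute names (id, cpus, memory) follow the libvirt domain XML format for <numa><cell/> elements.

```python
import xml.etree.ElementTree as ET

KIB_PER_MIB = 1024

def numa_cell_element(cell_id, cpus, memory_mib):
    """Build a <cell/> element, converting memory from MiB to KiB.

    Illustrative stand-in, not the actual nova implementation.
    """
    cell = ET.Element("cell")
    cell.set("id", str(cell_id))
    cell.set("cpus", cpus)
    # libvirt reads this attribute as KiB, so convert before serializing.
    cell.set("memory", str(memory_mib * KIB_PER_MIB))
    return cell

numa = ET.Element("numa")
numa.append(numa_cell_element(0, "0-2", 6144))
numa.append(numa_cell_element(1, "3", 2048))

xml_text = ET.tostring(numa, encoding="unicode")
print(xml_text)
```

With the conversion in place the two cells add up to 8388608 KiB, which matches the 8192 MiB of the flavor.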

Sys-info:
x86_64 machine

Virt-info:
qemu version 2.1.0
libvirt version 1.2.2

Kernel-info:
3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

OS-info:
Distributor ID: Ubuntu
Description: Ubuntu 14.04.1 LTS
Release: 14.04
Codename: trusty

Michael Turek (mjturek)
Changed in nova:
assignee: nobody → Michael Turek (mjturek)
Sean Dague (sdague)
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Michael Turek (mjturek)
description: updated
Michael Turek (mjturek)
summary: - NUMA Topology cell memory in MiB units rather than KiB units
+ NUMA Topology cell memory sent to libvirt in MiB when qemu expects KiB
description: updated
summary: - NUMA Topology cell memory sent to libvirt in MiB when qemu expects KiB
+ NUMA Topology cell memory sent to xml in MiB when qemu expects KiB
summary: - NUMA Topology cell memory sent to xml in MiB when qemu expects KiB
+ NUMA Topology cell memory sent to xml in MiB, but qemu uses KiB
Michael Turek (mjturek) wrote :

So after a bit more investigating, I have a better understanding of the consequences of specifying cell memory in MiB rather than the expected KiB.

When using qemu-2.1.0:
The feature simply does not work. Machines with NUMA specs that should boot instead fail at the libvirt/qemu level and go to error. This happens regardless of whether cell memory is specified explicitly or left at the default, which distributes the memory equally across the cells.

When using qemu-2.0.0 (or lower):
Machines boot, but with the wrong NUMA topology. For example, with either of the following extra_specs:
{"hw:numa_policy": "strict", "hw:numa_mem.1": "2048", "hw:numa_mem.0": "6144", "hw:numa_nodes": "2", "hw:numa_cpus.0": "0,1,2", "hw:numa_cpus.1": "3"}
{"hw:numa_policy": "strict", "hw:numa_nodes": "2"}

The following topology is found on the guest:
node 0 cpus: 0 1 2 3
node 0 size: 7986 MB
node 0 free: 7568 MB
node distances:
node 0
  0: 10

The quick fix that Tiago and I tried produces the following topology, which is the intended behavior:

When extra_specs are {"hw:numa_policy": "strict", "hw:numa_nodes": "2"}

node 0 cpus: 0 1
node 0 size: 3955 MB
node 0 free: 3728 MB
node 1 cpus: 2 3
node 1 size: 4031 MB
node 1 free: 3846 MB
node distances:
node 0 1
  0: 10 20
  1: 20 10

When extra_specs are {"hw:numa_policy": "strict", "hw:numa_mem.1": "2048", "hw:numa_mem.0": "6144", "hw:numa_nodes": "2", "hw:numa_cpus.0": "0,1,2", "hw:numa_cpus.1": "3"}

available: 2 nodes (0-1)
node 0 cpus: 0 1 2
node 0 size: 5971 MB
node 0 free: 5587 MB
node 1 cpus: 3
node 1 size: 2015 MB
node 1 free: 1983 MB
node distances:
node 0 1
  0: 10 20
  1: 20 10

So in short, the feature is not working as intended and once qemu-2.1.0 becomes more common, it will be broken. I'll be proposing a fix later today for this.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/124187

Changed in nova:
status: Confirmed → In Progress
John Garbutt (johngarbutt) wrote :

Moving to High: when NUMA is configured, this can stop VMs from booting due to bad memory config.

tags: added: juno-rc-potential
Changed in nova:
importance: Low → High
Changed in nova:
assignee: Michael Turek (mjturek) → Nikola Đipanov (ndipanov)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/124187
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6a374f21495c12568e4754800574e6703a0e626f
Submitter: Jenkins
Branch: master

commit 6a374f21495c12568e4754800574e6703a0e626f
Author: Nikola Dipanov <email address hidden>
Date: Tue Sep 30 12:39:22 2014 +0200

    libvirt: Make sure NUMA cell memory is in Kb in XML

    This patch makes libvirt report memory in Mb (as
    hardware.VirtNUMATopologyUsage class expects Mb) and that when
    generating XML we use Kb as this is what libvirt expects by default.

    Change-Id: Ic4518acf6bcee463009437829646de8c83aff6bf
    Closes-bug: #1373159
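
The commit message describes two directions: MiB inside nova, KiB in the XML. As a rough sketch of the reporting direction only, the snippet below reads per-cell memory from a libvirt capabilities document (where <memory unit='KiB'> appears under each host <cell>) and converts it to MiB. The sample XML values and the function name host_cell_memory_mib are made up for illustration; this is not the code from the commit.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of libvirt's host capabilities XML.
SAMPLE_CAPS = """
<capabilities>
  <host>
    <topology>
      <cells num='2'>
        <cell id='0'><memory unit='KiB'>4194304</memory></cell>
        <cell id='1'><memory unit='KiB'>4194304</memory></cell>
      </cells>
    </topology>
  </host>
</capabilities>
"""

def host_cell_memory_mib(caps_xml):
    """Return {cell_id: memory_mib} from a capabilities XML string.

    Hypothetical helper; only handles the KiB unit.
    """
    root = ET.fromstring(caps_xml.strip())
    result = {}
    for cell in root.findall("./host/topology/cells/cell"):
        mem = cell.find("memory")
        if mem.get("unit", "KiB") != "KiB":
            raise ValueError("unexpected unit: %s" % mem.get("unit"))
        # KiB -> MiB, so the rest of the code can keep working in MiB.
        result[int(cell.get("id"))] = int(mem.text) // 1024
    return result

host_cells = host_cell_memory_mib(SAMPLE_CAPS)
print(host_cells)
```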

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2