nova libvirt pinning not reflected in VirtCPUTopology

Bug #1466780 reported by Stephen Finucane on 2015-06-19
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: Stephen Finucane

Bug Description

Using a CPU policy of dedicated ('hw:cpu_policy=dedicated') results in vCPUs being pinned to pCPUs, per the original blueprint:

    http://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/virt-driver-cpu-pinning.html

When scheduling an instance with this extra spec, the 'VirtCPUTopology' object used by 'InstanceNumaCell' objects (which are in turn used by an 'InstanceNumaTopology' object) would be expected to reflect the actual configuration. For example, a VM booted with four vCPUs and the 'dedicated' CPU policy should have a CPU topology similar to one of the below:

    VirtCPUTopology(cores=4,sockets=1,threads=1)
    VirtCPUTopology(cores=2,sockets=1,threads=2)
    VirtCPUTopology(cores=1,sockets=2,threads=2)
    ...

In summary, cores * sockets * threads = vCPUs. However, this does not appear to happen.
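
The constraint above can be sketched in a few lines of Python. This is only an illustration: `Topology` is a hypothetical stand-in for nova's 'VirtCPUTopology' (it is not the real class), and `possible_topologies` is a name invented here.

```python
from collections import namedtuple

# Hypothetical stand-in for nova's VirtCPUTopology; only the three fields matter here.
Topology = namedtuple("Topology", ["cores", "sockets", "threads"])


def possible_topologies(vcpus):
    """Return every topology where cores * sockets * threads == vcpus."""
    result = []
    for sockets in range(1, vcpus + 1):
        if vcpus % sockets:
            continue  # sockets must divide the vCPU count evenly
        per_socket = vcpus // sockets
        for threads in range(1, per_socket + 1):
            if per_socket % threads:
                continue  # threads must divide the per-socket count evenly
            result.append(
                Topology(cores=per_socket // threads, sockets=sockets, threads=threads)
            )
    return result


four_vcpu_topologies = possible_topologies(4)
```

For four vCPUs this yields six candidates, including the three listed above.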

---

# Testing Configuration

Testing was conducted on a single-node, Fedora 21-based (3.17.8-300.fc21.x86_64) OpenStack deployment (built with devstack). The system is a dual-socket, 10-core, HT-enabled system (2 sockets * 10 cores * 2 threads = 40 "pCPUs"; 0-9,20-29 = node0, 10-19,30-39 = node1). Two flavors were used:

    openstack flavor create --ram 4096 --disk 20 --vcpus 10 demo.no-pinning

    openstack flavor create --ram 4096 --disk 20 --vcpus 10 demo.pinning
    nova flavor-key demo.pinning set hw:cpu_policy=dedicated hw:cpu_threads_policy=separate
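
For reference, the host layout described above can be modelled as follows. This is a rough sketch: the node-to-pCPU mapping is taken from the description, but the thread sibling numbering (pCPU N paired with N + 20) is an assumption based on the usual Linux enumeration and was not verified on the test system. The helper names are invented for illustration.

```python
SOCKETS = 2
CORES_PER_SOCKET = 10
THREADS_PER_CORE = 2
TOTAL_PCPUS = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE

# NUMA node -> pCPU ids, as given in the testing configuration.
NODE_CPUS = {
    0: set(range(0, 10)) | set(range(20, 30)),
    1: set(range(10, 20)) | set(range(30, 40)),
}


def node_of(pcpu):
    """Return the NUMA node that owns the given pCPU id."""
    for node, cpus in NODE_CPUS.items():
        if pcpu in cpus:
            return node
    raise ValueError("unknown pCPU: %d" % pcpu)


def sibling_of(pcpu):
    """Return the hyperthread sibling of a pCPU.

    ASSUMPTION: siblings are numbered N and N + 20, which is common on
    Linux but not confirmed for this host.
    """
    half = TOTAL_PCPUS // 2
    return pcpu + half if pcpu < half else pcpu - half
```

With this mapping, `node_of(25)` returns 0, since pCPU 25 falls in the 20-29 range owned by node0.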

# Results

Results vary. However, we have seen seemingly random assignments like so:

For a three vCPU instance:

    (Pdb) p instance.numa_topology.cells[0].cpu_topology
    VirtCPUTopology(cores=10,sockets=1,threads=1)

For a four vCPU instance:

    VirtCPUTopology(cores=2,sockets=1,threads=2)

For a ten vCPU instance:

    VirtCPUTopology(cores=7,sockets=1,threads=2)

The actual underlying libvirt XML is correct, however. For example, for a three vCPU instance:

    <cputune>
        <shares>3072</shares>
        <vcpupin vcpu='0' cpuset='1'/>
        <vcpupin vcpu='1' cpuset='0'/>
        <vcpupin vcpu='2' cpuset='25'/>
    </cputune>
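
A quick way to confirm that the XML pins one pCPU per vCPU is to parse it. The sketch below uses only the Python standard library and assumes libvirt's `vcpu` attribute name on `vcpupin`; `parse_pinning` is a helper invented for this illustration, not part of nova.

```python
import xml.etree.ElementTree as ET

# The <cputune> fragment from the three vCPU instance.
CPUTUNE_XML = """
<cputune>
    <shares>3072</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='0'/>
    <vcpupin vcpu='2' cpuset='25'/>
</cputune>
"""


def parse_pinning(xml_text):
    """Map each vCPU id to the cpuset string it is pinned to."""
    root = ET.fromstring(xml_text.strip())
    return {
        int(pin.get("vcpu")): pin.get("cpuset")
        for pin in root.findall("vcpupin")
    }


pinning = parse_pinning(CPUTUNE_XML)
```

Here `pinning` has three entries, one per vCPU, which matches the flavor rather than the stored 'VirtCPUTopology'.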

UPDATE(23/06/15): The assignments aren't actually random (thankfully). They correspond to the number of free cores in the system, and they change because that number changes as pinned CPUs deplete resources. However, I still don't think this is correct/logical.
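
To make the mismatch explicit, the stored topologies reported above can be checked against the requested vCPU counts. A minimal sketch; the values are copied from the results in this report, and `topology_matches` is a hypothetical helper.

```python
# (vCPUs requested, (cores, sockets, threads) stored), as reported above.
OBSERVED = [
    (3, (10, 1, 1)),
    (4, (2, 1, 2)),
    (10, (7, 1, 2)),
]


def topology_matches(vcpus, cores, sockets, threads):
    """True if the topology accounts for exactly the requested vCPUs."""
    return cores * sockets * threads == vcpus


# One (requested, implied by topology, matches?) tuple per observation.
report = [
    (vcpus, c * s * t, topology_matches(vcpus, c, s, t))
    for vcpus, (c, s, t) in OBSERVED
]
```

Only the four vCPU case is consistent; the three and ten vCPU cases imply 10 and 14 vCPUs respectively.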

Changed in nova:
assignee: nobody → Stephen Finucane (sfinucan)
description: updated

@Stephen Finucane (sfinucan):

Since you are set as assignee, I am switching the status to "In Progress".

Changed in nova:
status: New → In Progress

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/197129
Reason: Duplicate

Reviewed: https://review.openstack.org/197125
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8358936a24cd223046580ddfa3bfb37a943abc91
Submitter: Jenkins
Branch: master

commit 8358936a24cd223046580ddfa3bfb37a943abc91
Author: Stephen Finucane <email address hidden>
Date: Fri Jun 19 13:43:42 2015 +0100

    Store correct VirtCPUTopology

    When booting a NUMA-enabled instance, the scheduler generates and
    stores a meaningless VirtCPUTopology as part of the InstanceNumaCell
    objects. A correct value is later generated and used in libvirt XML
    generation but this is not stored. Fix this by skipping the initial
    generation and storing of the VirtCPUTopology in favour of storing
    the correctly generated version.

    Change-Id: Ief4fcdec0e107a233225ebd68207ac6172b3751e
    Closes-bug: #1466780

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-09-03
Changed in nova:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Nikola Đipanov (ndipanov) wrote :

This is not a bug at all but is actually by design. The NUMA cell CPU topology was meant to carry information about threads that we want to expose to the single cell based on how it was fitted with regard to threading on the host, so that we can expose this information to the guest OS if possible for optimal perf.

The final topology exposed to the guest takes this information into account, as well as any request for a particular topology passed in by the user, and decides on the final solution. There is no reason to store this as it will be different for different hosts.

The code that does instance fitting will use this in case of a migration, which is a problem described in https://bugs.launchpad.net/nova/+bug/1501358.

The bottom line is - there is really no need to save the topology calculated for the guest - we should revert this patch and mark the bug as invalid.

Thierry Carrez (ttx) on 2015-10-15
Changed in nova:
milestone: liberty-3 → 12.0.0

Changed to invalid as the released patch was later reverted, per Nikola's comments.

Changed in nova:
status: Fix Released → Invalid