nova libvirt driver assumes libvirt support for CPU pinning

Bug #1438226 reported by Stephen Finucane on 2015-03-30
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: High
Assigned to: Stephen Finucane

Bug Description

CPU pinning support was implemented as part of this blueprint:

    http://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-cpu-pinning.html

However, CPU pinning support is broken in some libvirt versions (summarized below), resulting in exceptions when attempting to schedule instances with the 'hw:cpu_policy' flavor key.

We should add a libvirt version test against known broken versions and use that to determine whether or not to support the flavor keys.
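
A minimal sketch of the kind of version test proposed above, assuming the broken versions are the two listed in the results below. The names `BROKEN_CPU_PINNING_VERSIONS`, `parse_version`, `normalize` and `supports_cpu_pinning` are hypothetical and are not taken from nova; how the running libvirt version is obtained is deliberately left out.

```python
# Hypothetical sketch: none of these names come from nova itself.
# Versions found to be broken for CPU pinning in the testing described
# in this report.
BROKEN_CPU_PINNING_VERSIONS = frozenset({(1, 2, 9, 2), (1, 2, 10)})


def parse_version(text):
    """Turn a dotted version string such as 'v1.2.9.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def normalize(version):
    """Drop trailing zero components so (1, 2, 10, 0) equals (1, 2, 10)."""
    parts = list(version)
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def supports_cpu_pinning(version_text):
    """Return False only for versions on the known-broken list."""
    return normalize(parse_version(version_text)) not in BROKEN_CPU_PINNING_VERSIONS
```

An exact-match list is used here rather than a minimum-version comparison because the breakage is not monotonic: 1.2.9.1 works, 1.2.9.2 and 1.2.10 fail, and 1.2.11 works again.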

This is somewhat related to #1422775 ("nova libvirt driver assumes qemu support for NUMA pinning").

---

# Testing Configuration

Testing was conducted in a container which provided a single-node, Fedora 21-based (3.17.8-300.fc21.x86_64) OpenStack instance (built with devstack). The yum-provided libvirt and its dependencies were removed and libvirt and libvirt-python were built and installed from source.

# Results

The results are as follows:

    version   status
    -------   ------
    1.2.9     ok
    1.2.9.1   ok
    1.2.9.2   fail
    1.2.10    fail
    1.2.11    ok
    1.2.12    ok

v1.2.9.2 is broken by this (backported) patch:

    https://www.redhat.com/archives/libvir-list/2014-November/msg00275.html

This can be seen as commit

    e226772 (qemu: fix domain startup failing with 'strict' mode in numatune)

v1.2.10 is broken at checkout but can be fixed by applying these three patches (yes, one of these broke v1.2.9.2; the irony is not lost on me):

    [0/3] https://www.redhat.com/archives/libvir-list/2014-November/msg00274.html
      - [1/3] https://www.redhat.com/archives/libvir-list/2014-November/msg00273.html
      - [2/3] https://www.redhat.com/archives/libvir-list/2014-November/msg00276.html
      - [3/3] https://www.redhat.com/archives/libvir-list/2014-November/msg00275.html

# Error logs

v1.2.9.2 produces the following exception:

    Traceback (most recent call last):
      File "/opt/stack/nova/nova/compute/manager.py", line 2301, in _build_resources
        yield resources
      File "/opt/stack/nova/nova/compute/manager.py", line 2171, in _build_and_run_instance
        flavor=flavor)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2357, in spawn
        block_device_info=block_device_info)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4376, in _create_domain_and_network
        power_on=power_on)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4307, in _create_domain
        LOG.error(err)
      File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
        six.reraise(self.type_, self.value, self.tb)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4297, in _create_domain
        domain.createWithFlags(launch_flags)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
        result = proxy_call(self._autowrap, f, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
        rv = execute(f, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
        six.reraise(c, e, tb)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
        rv = meth(*args, **kwargs)
      File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1029, in createWithFlags
        if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
    libvirtError: Failed to create controller cpu for group: No such file or directory

v1.2.10 produces the following exception:

    Traceback (most recent call last):
      File "/opt/stack/nova/nova/compute/manager.py", line 2342, in _build_resources
        yield resources
      File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance
        block_device_info=block_device_info)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2356, in spawn
        block_device_info=block_device_info)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4375, in _create_domain_and_network
        power_on=power_on)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4306, in _create_domain
        LOG.error(err)
      File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
        six.reraise(self.type_, self.value, self.tb)
      File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4296, in _create_domain
        domain.createWithFlags(launch_flags)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
        result = proxy_call(self._autowrap, f, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
        rv = execute(f, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
        six.reraise(c, e, tb)
      File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
        rv = meth(*args, **kwargs)
      File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1037, in createWithFlags
        if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
    libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/system.slice/docker.service/machine.slice/machine-qemu\x2dinstance\x2d0000000a.scope/cpuset.mems': Device or resource busy

Sean Dague (sdague) wrote :

Because this should *hopefully* be a small conditional fix, I think it should end up on the RC list.

Changed in nova:
status: New → Confirmed
importance: Undecided → High
milestone: none → kilo-rc1
tags: added: numa
tags: removed: numa
Changed in nova:
assignee: nobody → Radoslaw Smigielski (radoslaw-smigielski)
John Garbutt (johngarbutt) wrote :

Seems related to this patch: https://review.openstack.org/#/c/159106/

Nikola Đipanov (ndipanov) wrote :

Not marking this as a duplicate, because this bug raises the problem with respect to CPU pinning, which the other bugs do not mention directly (although it is addressed by the patch https://review.openstack.org/#/c/159106/). Instead, I will make sure the patch references this bug as well.

Nikola Đipanov (ndipanov) wrote :

Actually, looking into this further, it seems that the proposed patch does not cover this. This bug refers to specific versions of libvirt that have been found to be broken.

tags: added: kilo-rc-potential

Fix proposed to branch: master
Review: https://review.openstack.org/170190

Changed in nova:
assignee: Radoslaw Smigielski (radoslaw-smigielski) → Stephen Finucane (stephen-finucane)
status: Confirmed → In Progress

Apologies @radoslaw-smigielski for not assigning myself sooner.

Reviewed: https://review.openstack.org/170190
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c380987aa8844a46b2d50e55350e89e0791d76b6
Submitter: Jenkins
Branch: master

commit c380987aa8844a46b2d50e55350e89e0791d76b6
Author: Stephen Finucane <email address hidden>
Date: Tue Mar 31 11:03:48 2015 +0100

    libvirt: Add version check when pinning guest CPUs

    Ensure versions of libvirt with broken CPU pinning support are not used
    for said feature. This requires the addition of a new Exception,
    specific version check functionality and unit tests for same.

    Change-Id: I03b462c4985517ff8a4d94f0e1acae4fabdc5d39
    Closes-Bug: #1438226
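
The commit message above describes a new exception plus a version check. A rough illustration of that shape follows; it is not the committed code. `CpuPinningNotSupported`, `BROKEN_VERSIONS` and `check_cpu_pinning` are hypothetical names, and treating the value `dedicated` as the request for pinning is an assumption of this sketch.

```python
# Hypothetical sketch: not the code from the commit above.
BROKEN_VERSIONS = frozenset({(1, 2, 9, 2), (1, 2, 10)})


class CpuPinningNotSupported(Exception):
    """Raised when pinning is requested on a known-broken libvirt version."""


def check_cpu_pinning(extra_specs, libvirt_version):
    """Reject a pinning request early instead of failing at domain creation.

    extra_specs is a dict of flavor keys; libvirt_version is a tuple of ints.
    """
    # Assumption: 'dedicated' is the policy value that asks for pinning.
    if extra_specs.get("hw:cpu_policy") != "dedicated":
        return
    if tuple(libvirt_version) in BROKEN_VERSIONS:
        dotted = ".".join(str(part) for part in libvirt_version)
        raise CpuPinningNotSupported(
            "CPU pinning is broken in libvirt %s" % dotted)
```

Failing with a dedicated exception at this point gives the operator a clear reason, rather than the cgroup errors shown in the error logs of this report.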

Changed in nova:
status: In Progress → Fix Committed
tags: removed: kilo-rc-potential
Thierry Carrez (ttx) on 2015-04-10
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-04-30
Changed in nova:
milestone: kilo-rc1 → 2015.1.0

Reviewed: https://review.openstack.org/178188
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7ca56106def7950aceecacf40b2ae8de7c846cb2
Submitter: Jenkins
Branch: master

commit 7ca56106def7950aceecacf40b2ae8de7c846cb2
Author: Stephen Finucane <email address hidden>
Date: Thu Apr 23 14:01:10 2015 +0100

    libvirt: Disable NUMA for broken libvirt

    Ensure versions of libvirt with broken NUMA tuning support are not used
    for said feature.

    Change-Id: I6de388f1ca98c1ae16f2968f59881e3b0dba5f8d
    Closes-Bug: #1449028
    Related-Bug: #1438226
