Libvirt CPU affinity error
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| libvirt (Ubuntu) | | High | Unassigned | |
| Vivid | | High | Unassigned | |
| nova (Ubuntu) | | Undecided | Unassigned | |
| Vivid | | Undecided | Unassigned | |
Bug Description
=======
SRU Justification
1. Impact: VMs fail to launch with TCG (non-kvm-
2. Stable fix: cherry-pick a patch from upstream.
3. Regression potential: this only slightly relaxes the check for multiple CPUs and is a cherry-pick from upstream, so it should not introduce any regressions.
4. Test case: See below - or simply attempt to launch a VM with multiple cpus on non-accelerated qemu.
=======
I'm testing the Kilo packages from the cloud archive staging PPA on 14.04 and cannot launch a VM due to a Libvirt CPU affinity error. I'm using QEMU because my environment resides on cloud servers.
Package versions:
ii nova-common 1:2015.
ii nova-compute 1:2015.
ii nova-compute-kvm 1:2015.
ii nova-compute-
ii python-nova 1:2015.
ii python-novaclient 1:2.22.
ii libvirt-bin 1.2.12-
ii libvirt0 1.2.12-
Content of the nova compute logs while attempting to launch a CirrOS/m1.tiny instance:
2015-03-31 23:00:07.106 31118 INFO nova.compute.
2015-03-31 23:00:07.180 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.181 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.181 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.182 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.182 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.206 31118 INFO nova.compute.claims [-] [instance: 0551a285-
2015-03-31 23:00:07.295 31118 INFO nova.scheduler.
2015-03-31 23:00:07.407 31118 INFO nova.scheduler.
2015-03-31 23:00:07.574 31118 INFO nova.virt.
2015-03-31 23:00:07.819 31118 INFO nova.scheduler.
2015-03-31 23:00:07.907 31118 INFO nova.virt.
2015-03-31 23:00:10.902 31118 ERROR nova.virt.
<name>
<uuid>
<metadata>
<nova:instance xmlns:nova="http://
<nova:package version=
<
<
<nova:flavor name="m1.tiny">
<
<nova:owner>
<nova:user uuid="f214e083a
</nova:owner>
<nova:root type="image" uuid="38047887-
</nova:
</metadata>
<memory unit='KiB'
<currentMemory unit='KiB'
<vcpu placement='static' cpuset=
<cputune>
<shares>
</cputune>
<sysinfo type='smbios'>
<system>
<entry name='manufactu
<entry name='product'
<entry name='version'
<entry name='serial'
<entry name='uuid'
</system>
</sysinfo>
<os>
<type arch='x86_64' machine=
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='host-model'>
<model fallback='allow'/>
<topology sockets='1' cores='1' threads='1'/>
</cpu>
<clock offset='utc'/>
<on_poweroff>
<on_reboot>
<on_crash>
<devices>
<emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<controller type='usb' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<interface type='bridge'>
<mac address=
<source bridge=
<target dev='tap4d42f32
<model type='virtio'/>
<driver name='qemu'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='file'>
<source path='/
<target port='0'/>
</serial>
<serial type='pty'>
<target port='1'/>
</serial>
<console type='file'>
<source path='/
<target type='serial' port='0'/>
</console>
<input type='tablet' bus='usb'/>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0' keymap='en-us'>
<listen type='address' address='0.0.0.0'/>
</graphics>
<video>
<model type='cirrus' vram='16384' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
<stats period='10'/>
</memballoon>
</devices>
</domain>
2015-03-31 23:00:10.903 31118 ERROR nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.903 31118 TRACE nova.compute.
2015-03-31 23:00:10.907 31118 INFO nova.compute.
2015-03-31 23:00:10.913 31118 INFO nova.virt.
2015-03-31 23:00:10.936 31118 INFO nova.virt.
2015-03-31 23:00:10.937 31118 INFO nova.virt.
2015-03-31 23:00:11.053 31118 INFO nova.scheduler.
| Stephen Gordon (sgordon) wrote : | #1 |
| Launchpad Janitor (janitor) wrote : | #2 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in nova (Ubuntu): | |
| status: | New → Confirmed |
| Kashyap Chamarthy (kashyapc) wrote : | #3 |
Yes, to test CPU pinning/NUMA with libvirt you ought to use Nested KVM.
Please report results after testing with that.
That said, some notes below.
Quoting Dan Berrange from a different review with a complete response on
*why*:
It is fundamentally impossible to test CPU pinning with TCG (aka plain
QEMU) because TCG only has a single thread for all virtual CPUs. As such
there is no mechanism to pin vCPU threads with TCG. Nested KVM is thus
the only possible option for testing any of the NUMA / CPU pinning
stuff. Instructions for nested KVM setup on a KVM host are documented
here
From my testing, a Nova guest booted with a NUMA flavor will have the
following XML snippets for vCPU placement:
. . .
<vcpu placement=
<cputune>
<vcpupin vcpu='0' cpuset='0-3'/>
<vcpupin vcpu='1' cpuset='0-3'/>
<vcpupin vcpu='2' cpuset='0-3'/>
<vcpupin vcpu='3' cpuset='0-3'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='0'/>
<memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>
. . .
<cpu>
<topology sockets='4' cores='1' threads='1'/>
<numa>
<cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
</numa>
</cpu>
. . .
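To make the placement in the snippet above concrete, here is a minimal Python sketch that reads the `<vcpupin>` entries from a `<cputune>` fragment like the one quoted. The fragment is retyped from the snippet; the helper name `vcpu_pinning` is hypothetical and is not part of nova or libvirt.

```python
import xml.etree.ElementTree as ET

# Fragment retyped from the <cputune> snippet quoted above.
CPUTUNE_XML = """
<cputune>
  <vcpupin vcpu='0' cpuset='0-3'/>
  <vcpupin vcpu='1' cpuset='0-3'/>
  <vcpupin vcpu='2' cpuset='0-3'/>
  <vcpupin vcpu='3' cpuset='0-3'/>
</cputune>
"""


def vcpu_pinning(cputune_xml):
    """Map each guest vCPU index to the host cpuset string it is pinned to.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(cputune_xml)
    return {
        int(pin.get("vcpu")): pin.get("cpuset")
        for pin in root.findall("vcpupin")
    }


if __name__ == "__main__":
    for vcpu, cpuset in sorted(vcpu_pinning(CPUTUNE_XML).items()):
        print("vCPU %d -> host CPUs %s" % (vcpu, cpuset))
```

Each of the four vCPUs is allowed to float across host CPUs 0-3, which only has an effect when every vCPU has its own thread, as it does under KVM.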
Here are the working example XMLs from my testing.
Libvirt XML for the guest hypervisor (also called L1), which runs DevStack
and hosts the Nova instances as nested guests:
https:/
And, Nova guest XML, booted with a NUMA flavor:
https:/
| Changed in nova (Ubuntu): | |
| status: | Confirmed → Incomplete |
| Matt Kassawara (ionosphere80) wrote : | #4 |
I'm not testing NUMA. I am launching a basic CirrOS image using the m1.tiny flavor, neither of which should trigger NUMA bits. In fact, adding the "hw:cpu_
| Kashyap Chamarthy (kashyapc) wrote : | #5 |
Matt, you're right; allow me to correct myself below. In short: I still cannot reproduce it.
I just tested this in a single-node DevStack with today's Nova
git, with the Nova instance QEMU-emulated, but I cannot reproduce
the failure described in this bug.
Test environment
----------------
$ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86
4.0.
libvirt-
qemu-
I'm at these commits in my All-In-One DevStack environment:
cinder:
commit c7ca4b95b56539d
devstack:
commit 72bdc8c27102db3
glance:
commit f84e49db5a455b3
keystone:
commit af568dd1afdcdc9
neutron:
commit 483de6313fab591
nova:
commit 74ca660ab688e15
requirements:
commit 56ab196ad1fb0e3
Test
----
$ nova flavor-show 1
+--
| Property | Value |
+--
| OS-FLV-
| OS-FLV-
| disk | 1 |
| extra_specs | {} |
| id | 1 |
| name | m1.tiny |
| os-flavor-
| ram | 512 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 1 |
+--
Boot the instance:
$ nova boot --config-drive false --flavor 1 \
--key_name oskey1 --image cirros-
Nova instance's guest XML attached.
That's the QEMU invocation
-------
$ ps -ef | grep qemu-system-x86_64
qemu 2889 1 14 11:12 ? 00:00:11 /usr/bin/
| Matt Kassawara (ionosphere80) wrote : | #6 |
I'm fairly certain this is specific to the Ubuntu packages, not upstream nova or libvirt.
| Changed in nova (Ubuntu): | |
| status: | Incomplete → New |
| Kashyap Chamarthy (kashyapc) wrote : | #7 |
@Matt: Since you're fairly certain that this is specific to Ubuntu, I hope Ubuntu's Nova package maintainers will take a look. . .
| Mark Vanderwiel (vanderwl) wrote : | #8 |
Any ideas on how one could work around this?
How can affinity be disabled in the nova, libvirt, or qemu configs?
Is there some easy place in the code to hack?
| Matt Kassawara (ionosphere80) wrote : | #9 |
Still an issue in the 2015.1~
| Launchpad Janitor (janitor) wrote : | #10 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in nova (Ubuntu): | |
| status: | New → Confirmed |
| James Page (james-page) wrote : | #11 |
Nothing jumps out at me from the list of patches we have in the nova package - most are working around testing challenges due to offline build environments.
| James Page (james-page) wrote : | #12 |
Raising a libvirt task to get the libvirt maintainers attention - I'll poke on irc as well.
| Martin Mailand (todin) wrote : | #13 |
@Matt V: I hacked an easy place in /usr/lib/
change
if CONF.libvirt.
to
if CONF.libvirt.
The commit that changed the NUMA behavior is 945ab28.
I am not sure: does qemu without kvm have NUMA support?
@Matt K: Are you certain that it applies only to ubuntu packages? The changes are made upstream.
I tested in virtualbox.
| Matt Kassawara (ionosphere80) wrote : | #14 |
My installation with packages uses the cloud archive repo that includes libvirt 1.2.12. My installation from source uses the generic 14.04 LTS repo that includes libvirt 1.2.2. Both installations use the same nova code (RC1), but the older version of libvirt doesn't exhibit this issue. Also, I'm fairly certain that QEMU doesn't support any form of NUMA.
| Martin Mailand (todin) wrote : | #15 |
@Matt:
Line 359 of driver.py defines the minimum libvirt version for which the NUMA code is activated.
MIN_LIBVIRT_
Therefore you did not trigger the behavior with the source code installation.
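The version gate described here can be sketched as a tuple comparison. This is a minimal illustration, not nova's code: `parse_version` and `numa_code_active` are hypothetical names, and the `(1, 2, 7)` threshold is an assumption taken from the driver comment quoted elsewhere in this bug ("not until 1.2.7").

```python
# Assumption: threshold taken from the driver comment quoted in this bug.
MIN_NUMA_VERSION = (1, 2, 7)


def parse_version(text):
    """Turn a dotted version such as '1.2.12' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def numa_code_active(libvirt_version):
    """Return True when the given libvirt version meets the NUMA minimum.

    Hypothetical helper for illustration only.
    """
    return parse_version(libvirt_version) >= MIN_NUMA_VERSION


if __name__ == "__main__":
    # 1.2.2 ships with stock 14.04; 1.2.12 comes from the cloud archive.
    for version in ("1.2.2", "1.2.12"):
        print(version, numa_code_active(version))
```

Comparing integer tuples rather than strings matters here: as plain strings, "1.2.12" would sort before "1.2.7".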
| Chuck Short (zulcss) wrote : | #16 |
I wasn't able to reproduce this on vivid.
| Matt Kassawara (ionosphere80) wrote : | #17 |
Chuck,
What version of libvirt?
| Liusheng (liusheng) wrote : | #18 |
I have met the same issue:
root@openstack:~# virsh version
Compiled against library: libvirt 1.2.12
Using library: libvirt 1.2.12
Using API: QEMU 1.2.12
Running hypervisor: QEMU 2.2.0
-------
root@openstack:~# dpkg -l |grep nova
ii nova-api 1:2015.
ii nova-cert 1:2015.
ii nova-common 1:2015.
ii nova-compute 1:2015.
ii nova-compute-kvm 1:2015.
ii nova-compute-
ii nova-conductor 1:2015.
ii nova-consoleauth 1:2015.
ii nova-novncproxy 1:2015.
ii nova-scheduler 1:2015.
ii python-nova 1:2015.
ii python-novaclient 1:2.22.
| Mark Vanderwiel (vanderwl) wrote : | #19 |
I also see this is still an issue with http://
VERSION="14.04.1 LTS, Trusty Tahr"
# virsh version
Compiled against library: libvirt 1.2.12
Using library: libvirt 1.2.12
Using API: QEMU 1.2.12
Running hypervisor: QEMU 2.2.0
nova-compute 1:2015.
| Launchpad Janitor (janitor) wrote : | #20 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in libvirt (Ubuntu): | |
| status: | New → Confirmed |
| Rui Chen (kiwik-chenrui) wrote : | #21 |
Looks like a nova bug. I guess it occurs with virt_type=qemu, libvirt>=1.2.7, and a compute host that supports NUMA.
| Serge Hallyn (serge-hallyn) wrote : | #22 |
I'm a bit confused - what exactly is the bug? It is not that cpusets inside the guests do not work, right? Is it that the qemu guest's vcpus are not pinned to the specified cpus on the host? Is the host a true hardware host? Is the guest running accelerated KVM? How is it being verified that it is misbehaving - is the kvm process just not in a proper cpuset?
| Martin Mailand (todin) wrote : | #23 |
The host system is VirtualBox, and the guest system is qemu without kvm, because VirtualBox doesn't support hardware acceleration.
The problem is that nova-compute generates invalid XML for this combination.
The offending part is "<vcpu placement=
This part is not accepted by libvirt, and an error is logged.
As a result, I am unable to test OpenStack in a Vagrant VirtualBox environment.
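The combination described above can be checked mechanically. The following is a minimal sketch, assuming a trimmed-down, hypothetical domain XML: only the `<vcpu>` line mirrors the one quoted in this bug, and the helper name `has_cpuset_under_tcg` is made up for the example. In libvirt domain XML, `type='qemu'` selects plain emulation (TCG) and `type='kvm'` selects KVM.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down domain XML. The <vcpu> line mirrors the one
# quoted in this bug; everything else is filler for the example.
DOMAIN_XML = """
<domain type='qemu'>
  <name>example</name>
  <vcpu placement='static' cpuset='0-1'>1</vcpu>
</domain>
"""


def has_cpuset_under_tcg(domain_xml):
    """Return True if a TCG (type='qemu') domain carries a vcpu cpuset.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    return (
        root.get("type") == "qemu"
        and vcpu is not None
        and vcpu.get("cpuset") is not None
    )


if __name__ == "__main__":
    print(has_cpuset_under_tcg(DOMAIN_XML))
```

A check like this only flags the suspicious combination; whether libvirt rejects it depends on the libvirt version, as the rest of this thread shows.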
| Serge Hallyn (serge-hallyn) wrote : | #24 |
Thanks - so the solution is for nova to only drop the placement='static' from that line, whenever it knows it is using tcg?
| Matt Kassawara (ionosphere80) wrote : | #25 |
On the same system (QEMU only) and same version of nova (Kilo), I can launch an instance with Libvirt 1.2.2 (included with Ubuntu 14.04) but receive this error with Libvirt 1.2.12 (included with the Kilo cloud-archive repo). Either Libvirt 1.2.12 reports the wrong capabilities to nova or nova makes some sort of incorrect assumptions with it.
Can you show 'virsh capabilities' output with both packages?
| Liusheng (liusheng) wrote : | #27 |
The packages list of ubuntu-archive of kilo:
http://
| Tony Breeds (o-tony) wrote : | #29 |
But that makes no sense. If you were changing
# While earlier versions could support NUMA reporting and
# NUMA placement, not until 1.2.7 was there the ability
# to pin guest nodes to host nodes, so mandate that. Without
# this the scheduler cannot make guaranteed decisions, as the
# guest placement may not match what was requested
MIN_LIBVIRT_
I could see that helping but MIN_LIBVIRT_
| Jeffrey Zhang (jeffrey4l) wrote : | #30 |
@Tony
Yes, you are right. I am sorry that I posted the wrong file diff. I have hidden my last comment and am pasting the correct one.
I hit this issue too. After applying the following patch, it works. I think that in Ubuntu, the libvirt (1.2.12, from cloud-archive) doesn't support NUMA.
diff --git a/nova/
index 98a4537..4d573e1 100644
--- a/nova/
+++ b/nova/
@@ -355,7 +355,7 @@ REQ_HYPERVISOR_
# to pin guest nodes to host nodes, so mandate that. Without
# this the scheduler cannot make guaranteed decisions, as the
# guest placement may not match what was requested
-MIN_LIBVIRT_
+MIN_LIBVIRT_
# While earlier versions could support hugepage backed
# guests, not until 1.2.8 was there the ability to request
# a particular huge page size. Without this the scheduler
| Tony Breeds (o-tony) wrote : | #31 |
It seems that nova's libvirt driver is generating an invalid domain XML. If I understand correctly, specifying a 'vcpu' node with a cpuset is invalid in TCG *unless* you also specify emulatorpin. See: https:/
| Tony Breeds (o-tony) wrote : | #32 |
This patch (which hasn't gone anywhere near upstream yet) forces the libvirt driver in nova to avoid generating a cpuset, so it no longer generates the invalid domain XML.
Next steps are to discuss my findings with upstream libvirt and nova developers to see if I'm correct or I've just fluked it.
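The effect of such a change on the generated XML can be illustrated with a short Python sketch. This is not the attached patch, which changes nova's libvirt driver itself; it only post-processes a hypothetical domain XML string, and `drop_cpuset_for_tcg` is a made-up name.

```python
import xml.etree.ElementTree as ET


def drop_cpuset_for_tcg(domain_xml):
    """Return the domain XML with the vcpu cpuset removed for TCG guests.

    Hypothetical helper: domains whose type is not 'qemu' are returned
    with their cpuset untouched.
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    if root.get("type") == "qemu" and vcpu is not None:
        # Remove only the cpuset attribute; keep placement and the count.
        vcpu.attrib.pop("cpuset", None)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    before = (
        "<domain type='qemu'>"
        "<vcpu placement='static' cpuset='0-1'>1</vcpu>"
        "</domain>"
    )
    print(drop_cpuset_for_tcg(before))
```

Only the `cpuset` attribute is dropped; the vCPU count and the `placement` attribute are left alone.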
The attachment "1439280.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.
[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]
| tags: | added: patch |
| Jeffrey Zhang (jeffrey4l) wrote : | #34 |
I found the root cause in my environment.
I use libvirt in LXC, and the LXC container doesn't enable cgroups with read and write permission. I added/changed the following line in the config file:
lxc.aa_profile = lxc-container-
I also installed `cgroup-lite` in the LXC guest. Libvirt now works with CPU affinity.
| Mark Vanderwiel (vanderwl) wrote : | #36 |
Using the patch above (which basically hacks out qemu specifically), nova boot works fine.
Same qemu environment as I noted in my 4-23 post.
Can we get this patch out for formal review by nova folks?
| Tony Breeds (o-tony) wrote : | #37 |
We're in discussions with the libvirt devs to work out if the fix is correct and/or exposes a libvirt bug.
Once that discussion concludes, there will be a nova patch posted (and tagged for backport).
| Serge Hallyn (serge-hallyn) wrote : | #38 |
Thanks - the patch seems to make sense.
| Tony Breeds (o-tony) wrote : | #39 |
Summary of the libvirt discussion: current upstream works. The libvirt team would like to identify the libvirt fixes required and get them backported to the maintenance releases.
With reference to:
https:/
http://
If I read those links correctly, we're still going to need to fix nova and/or get the backports into the appropriate Ubuntu libvirt packages.
| Tony Breeds (o-tony) wrote : | #40 |
For the record. Applying this patch to the cloud-archive libvirt package should fix the problem.
http://
| Tony Breeds (o-tony) wrote : | #41 |
I was pointed at the v1.2.12-maint head in the libvirt git which contains this fix already.
http://
I suggest we close the nova issue with won't fix and get the correctly backported patch into the libvirt package.
| Dr. Jens Harbott (j-harbott) wrote : | #42 |
"wont-fixing" the nova side will leave it broken for quite some time until the backport has made its way into all relevant distro images. I'd prefer to add your patch into nova code as workaround for older libvirt versions.
| Changed in libvirt (Ubuntu): | |
| status: | Confirmed → Fix Released |
| importance: | Undecided → High |
| Changed in libvirt (Ubuntu Vivid): | |
| importance: | Undecided → High |
| description: | updated |
Hello Matt, or anyone else affected,
Accepted libvirt into vivid-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-
Further information regarding the verification process can be found at https:/
| Changed in libvirt (Ubuntu Vivid): | |
| status: | New → Fix Committed |
| tags: | added: verification-needed |
| Tony Breeds (o-tony) wrote : | #44 |
I can verify that installing 1.2.12-0ubuntu13 on vivid fixes the issue for me.
Please forgive my ignorance but can that package be tagged into cloud-archive once it's officially a vivid update?
| tags: |
added: verification-done removed: verification-needed |
| Tony Breeds (o-tony) wrote : | #45 |
@j-rosenboom-j My "fix" for nova will never be accepted upstream. I won't speak for the Ubuntu developers, but I strongly suspect that they'll be unwilling to diverge from upstream, especially as they've already shown the fix will land in vivid.
| Mark Vanderwiel (vanderwl) wrote : | #46 |
When will this get fixed for Trusty? That's where it was originally reported.
| Matt Kassawara (ionosphere80) wrote : | #47 |
The cloud archive repository currently does not contain a fix for Kilo on Trusty.
| Tony Breeds (o-tony) wrote : | #48 |
@vanderwl: The original report was against the trusty cloud-archive repo.
If you look at: https:/
Only vivid and the cloud archive PPA
| Mark Vanderwiel (vanderwl) wrote : | #49 |
Matt, Tony, thanks for the clarification. I'm still a bit confused as to when the cloud archive used by trusty (http://
| Tony Breeds (o-tony) wrote : | #50 |
@vanderwl No problem I'm new to cloud archive as well.
So the URL I have is: https:/
and that shows (for me) that 4 hours ago the (kilo-staging) cloud-archive PPA got the fixed libvirt.
So in theory we're done here. Matt and I just need to verify that the new package is correct (I have no doubt it is).
| Matt Kassawara (ionosphere80) wrote : | #51 |
I'm waiting until the package appears in the official cloud archive repository.
| Launchpad Janitor (janitor) wrote : | #52 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in nova (Ubuntu Vivid): | |
| status: | New → Confirmed |
| Launchpad Janitor (janitor) wrote : | #53 |
This bug was fixed in the package libvirt - 1.2.12-0ubuntu13
---------------
libvirt (1.2.12-0ubuntu13) vivid-proposed; urgency=medium
* 9038-qemu-
-- Serge Hallyn <email address hidden> Wed, 13 May 2015 10:48:53 -0500
| Changed in libvirt (Ubuntu Vivid): | |
| status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for libvirt has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
| Chuck Short (zulcss) wrote : | #55 |
This should be fixed now; please re-open if it isn't.
| Changed in nova (Ubuntu): | |
| status: | Confirmed → Fix Released |


This guest XML would be expected to fail if the guest is running on a host that is using qemu w/o kvm acceleration:
<vcpu placement='static' cpuset='0-1'>1</vcpu>
...as qemu does not support pinning when kvm isn't available. What's not clear to me is why the Ubuntu version would be adding this line since it sounds like master/kilo-3 is behaving correctly per the spec (only adding these lines where the user explicitly requests direct pinning of CPUs on the image or flavor [1]).
[1] http://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-cpu-pinning.html