Mitaka + Trusty (kernel 3.13): libvirt does not use the apparmor capability by default, and when it does, live migration fails (/tmp/memfd-XXX cannot be created)
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Invalid | Undecided | Unassigned | |
| OpenStack Security Advisory | Invalid | Undecided | Unassigned | |
| libvirt (Ubuntu) | Expired | Undecided | Unassigned | |
Bug Description
In my environment: Trusty (kernel 3.13) + Juju (1.25) with the latest charms + Kilo upgraded to Mitaka (already using non-tunnelled live migrations, after the latest SRU that disables tunnelled live migrations).
My compute nodes are NOT loading the "apparmor" libvirt capability by default:
```
inaddy@
1
inaddy@
1
inaddy@
1
```
This happens because "libvirt" starts before the apparmor profile is loaded, and qemu.conf does not specify `security_driver = "apparmor"`. If you add the security driver to that file, libvirt and nova-compute won't start at all, because apparmor is not yet running when they start. On Trusty, apparmor is started as a legacy SysV init script, at the end of initialisation, which causes this problem.
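For reference, the setting in question is the `security_driver` option in `/etc/libvirt/qemu.conf`. Pinning it (rather than relying on autodetection, which silently falls back to "none" when apparmor isn't up yet) would look like this; as described above, on Trusty this currently prevents libvirt from starting:

```ini
# /etc/libvirt/qemu.conf
# Force the apparmor security driver instead of autodetecting it.
# On Trusty this fails at boot because apparmor (SysV init) starts
# after libvirt does.
security_driver = "apparmor"
```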
After restarting the libvirt-bin service, apparmor starts being used:
```
inaddy@
libvirt-bin stop/waiting
libvirt-bin start/running, process 7031
inaddy@
0
inaddy@
libvirt-bin stop/waiting
libvirt-bin start/running, process 7031
inaddy@
0
inaddy@
libvirt-bin stop/waiting
libvirt-bin start/running, process 7031
inaddy@
0
```
And when libvirt starts using apparmor, creating an apparmor profile for every virtual machine on the compute nodes, Mitaka's qemu (2.5) uses a fallback mechanism to create shared memory for live migrations. On 3.13 kernels, which do not have the memfd_create() system call, this fallback tries to create files in the /tmp/ directory and fails, so live migration does not work.
Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability = can't live migrate.
From qemu 2.5, the logic is:

```c
void *qemu_memfd_
{
    if (memfd_create)              /* only works with HWE kernels */
        ...
    else {                         /* 3.13 kernels: gets blocked by apparmor */
        tmpdir = g_get_tmp_dir();
        ...
        mfd = mkstemp(fname);
    }
}
```
And you can see the errors:
From the host trying to send the virtual machine:
```
2016-08-15 16:36:26.160 1974 ERROR nova.virt.
2016-08-15 16:36:26.248 1974 ERROR nova.virt.
```
From the host trying to receive the virtual machine:
```
Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 audit(147128977
Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 audit(147128977
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 audit(147128978
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 audit(147128978
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 audit(147128978
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 audit(147128978
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 audit(147128978
```
When libvirt is left without the apparmor capability (thus not confining virtual machines on the compute nodes), live migration works as expected, so apparmor is clearly stepping into the live migration. I'm sure that virtual machines have to be confined and that this isn't the desired behaviour...
Still trying to figure out the rules from /etc/apparmor.
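If the /tmp fallback is to be permitted, the per-VM profile would presumably need a rule covering the files qemu creates there. A hypothetical rule along those lines (the file location and glob are assumptions, not a tested fix):

```
# Hypothetical addition to the abstraction used for qemu guests,
# e.g. /etc/apparmor.d/abstractions/libvirt-qemu -- untested sketch.
owner /tmp/memfd-* rw,
```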
description: updated
tags: added: sts
information type: Private Security → Public
This looks like a bug in libvirt/qemu and/or the packaging of libvirt/qemu, not really something that can be fixed in Nova, unless you have a recommendation for a change Nova could make in how it calls qemu that would alter this interaction with AppArmor.
I don't think Nova itself ships apparmor profiles (or SELinux policies, for that matter); that is left to packagers (Ubuntu, Debian, Red Hat, etc.).