KVM guest execution start apparmor blocks on /dev/ptmx now (regression?)

Bug #1684481 reported by ChristianEhrhardt on 2017-04-20
This bug affects 1 person
Affects             Importance   Assigned to
apparmor (Ubuntu)   Undecided    Unassigned
linux (Ubuntu)      Medium       Unassigned
lxc (Ubuntu)        Wishlist     Christian Brauner
lxd (Ubuntu)        Undecided    Unassigned

Bug Description

Setup:
- Xenial host
- lxd guests with Trusty, Xenial, ...
- add a LXD profile to allow kvm [3] (inspired by stgraber)
- spawn KVM guests in the LXD guests using the different distro release versions
- guests are based on the uvtool default template which has a serial console [4]

Issue:
- guest starting with serial device gets blocked by apparmor and killed on creation
- This affects at least ppc64el and x86 (s390x has no serial concept that would match)
- This appeared in our usual checks on -proposed releases so maybe we can/should stop something?
  Last good was "Apr 5, 2017 10:40:50 AM" first bad one "Apr 8, 2017 5:11:22 AM"

Background:
We have used this setup for a while, and it was working without any change on our end.
The fact that it still works in the Trusty LXD guest also makes it somewhat suspicious.
Therefore I assume an SRUed change in LXD/Kernel/Apparmor might be the reason, and I am opening this bug to get your opinion on it.

You can look into [1] and search for uvt-kvm create in it.

Deny in dmesg:
[652759.606218] audit: type=1400 audit(1492671353.134:4520): apparmor="DENIED" operation="open" namespace="root//lxd-testkvm-xenial-from_<var-lib-lxd>" profile="libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b" name="/dev/pts/ptmx" pid=27162 comm="qemu-system-ppc" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0

Qemu-log:
2017-04-20T06:55:53.139450Z qemu-system-ppc64: -chardev pty,id=charserial0: Failed to create PTY: No such file or directory
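
To make the dmesg denial quoted above easier to read, here is a minimal Python sketch that splits the audit record into key/value fields. The helper name parse_audit_fields and the regular expression are illustrative assumptions, not part of any apparmor tooling; the record itself is copied from this report.

```python
import re

# The audit record from this report, copied verbatim.
AUDIT_LINE = (
    '[652759.606218] audit: type=1400 audit(1492671353.134:4520): '
    'apparmor="DENIED" operation="open" '
    'namespace="root//lxd-testkvm-xenial-from_<var-lib-lxd>" '
    'profile="libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b" '
    'name="/dev/pts/ptmx" pid=27162 comm="qemu-system-ppc" '
    'requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0'
)

# key=value pairs; the value is either double-quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_audit_fields(line):
    """Return the key=value pairs of an audit record as a dict of strings."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


if __name__ == "__main__":
    record = parse_audit_fields(AUDIT_LINE)
    # The fields that matter for this bug: which profile denied which path.
    for key in ("apparmor", "profile", "name", "denied_mask", "comm"):
        print(key, "=", record[key])
```

The interesting fields are "profile" (the libvirt guest profile, not the LXD container profile) and "name" (the path apparmor actually checked).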

There was a similar issue with qemu namespacing (which we don't use on any of these releases) [2].
While we surely don't have the "same" issue, the debugging done there on namespacing might be worth a look, as it could be related.

Workaround for now:
- drop serial section from guest xml
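
As a rough illustration of the workaround above, the following Python sketch removes serial elements from a libvirt domain XML string. The sample XML and the function name drop_device_elements are made up for this example; only the element names follow the libvirt domain format linked in [4]. Whether a matching console element also has to go is an assumption to verify for your setup, so it is left optional.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal guest definition, for illustration only.
SAMPLE_DOMAIN = """
<domain type='kvm'>
  <name>demo-guest</name>
  <devices>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <interface type='network'>
      <source network='default'/>
    </interface>
  </devices>
</domain>
"""


def drop_device_elements(domain_xml, tags=("serial",)):
    """Return the domain XML with the given device elements removed."""
    root = ET.fromstring(domain_xml.strip())
    for devices in root.findall("devices"):
        # Copy the child list so removal does not disturb the iteration.
        for child in list(devices):
            if child.tag in tags:
                devices.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(drop_device_elements(SAMPLE_DOMAIN))
```

Passing tags=("serial", "console") drops the console element as well.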

[1]: https://jenkins.ubuntu.com/server/view/Virt/job/virt-migration-cross-release-amd64/78/consoleFull
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
[3]: https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test/tree/kvm_profile.yaml
[4]: https://libvirt.org/formatdomain.html#elementsCharPTY
---
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: ppc64el
DistroRelease: Ubuntu 16.04
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
Package: lxd
PackageArchitecture: ppc64el
ProcKernelCmdline: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro console=hvc0
ProcLoadAvg: 3.15 3.02 3.83 1/3056 79993
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -1
ProcVersion: Linux version 4.4.0-72-generic (buildd@bos01-ppc64el-022) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #93-Ubuntu SMP Fri Mar 31 14:05:15 UTC 2017
ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
Syslog:

Tags: xenial uec-images
Uname: Linux 4.4.0-72-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: utah
_MarkForUpload: True
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_smt: SMT is off
---
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: ppc64el
DistroRelease: Ubuntu 16.04
NonfreeKernelModules: cfg80211 ebtable_broute ebtable_nat binfmt_misc veth nbd openvswitch vhost_net vhost macvtap macvlan xt_conntrack ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables xt_comment xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables zfs zunicode zcommon znvpair spl zavl kvm_hv kvm ipmi_powernv ipmi_msghandler uio_pdrv_genirq vmx_crypto powernv_rng ibmpowernv leds_powernv uio ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_core ipr
Package: lxd
PackageArchitecture: ppc64el
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcKernelCmdline: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro console=hvc0
ProcLoadAvg: 5.56 5.25 4.60 1/3057 3526
ProcSwaps:
 Filename Type Size Used Priority
 none virtual 8388544 8388544 0
ProcVersion: Linux version 4.4.0-72-generic (buildd@bos01-ppc64el-022) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #93-Ubuntu SMP Fri Mar 31 14:05:15 UTC 2017
ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
Syslog:

Tags: xenial uec-images
Uname: Linux 4.4.0-72-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_smt: SMT is off
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Apr 12 17:37 seq
 crw-rw---- 1 root audio 116, 33 Apr 12 17:37 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
IwConfig: Error: [Errno 2] No such file or directory
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
Package: linux (not installed)
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro console=hvc0
ProcLoadAvg: 6.01 5.68 4.92 1/3060 83740
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -1
ProcVersion: Linux version 4.4.0-72-generic (buildd@bos01-ppc64el-022) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #93-Ubuntu SMP Fri Mar 31 14:05:15 UTC 2017
ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-72-generic N/A
 linux-backports-modules-4.4.0-72-generic N/A
 linux-firmware 1.157.8
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial uec-images
Uname: Linux 4.4.0-72-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: utah
_MarkForUpload: True
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq:
 min: 3.691 GHz (cpu 120)
 max: 3.691 GHz (cpu 8)
 avg: 3.691 GHz
cpu_runmode:
 Could not retrieve current diagnostics mode,
 No kernel interface to firmware
cpu_smt: SMT is off

tags: added: regression-proposed

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1684481

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → Medium

apport information

tags: added: apport-collected uec-images xenial
description: updated

ChristianEhrhardt (paelzer) wrote :

Running apport-collect on the host (Xenial) and in the LXD container (Xenial as well).
BTW, I saw that the LXD version is not in the report; it is:
*** 2.12-0ubuntu3~ubuntu16.04.1~ppa1 500
    500 http://ppa.launchpad.net/ubuntu-lxc/lxd-stable/ubuntu xenial/main ppc64el Packages
    100 /var/lib/dpkg/status

The second apport-collect is from the LXD container, which uses daily + proposed on every test.

description: updated

apport information

description: updated

apport information

ChristianEhrhardt (paelzer) wrote :

Since apport-collect detected this as apparmor for the report, I also forced a "linux" apport collect via "sudo apport-collect --package=linux 1684481" on the host (since the guest is an LXD container, the kernel there, if any, doesn't matter).

Now logs should be complete.

Changed in linux (Ubuntu):
status: Incomplete → New

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Stéphane Graber (stgraber) wrote :

Ok, so that's an apparmor or apparmor profile problem.

LXD recently changed to also allow for apparmor profiles to be loaded inside privileged containers. This seems to align with your timeline above.

Before that change, your kvm process wasn't itself confined when run inside a privileged LXD container, instead only being confined by the container's own profile. With this LXD fix, we now offer the same behavior for unprivileged and privileged containers, letting the container load its own profile in both cases.

There are a number of problems with apparmor profiles that are loaded as part of an apparmor stack not behaving the same as when loaded on the host, but those are issues that need to be addressed either in the profiles or in the apparmor kernel code.

As far as we (LXD) are concerned, we'd very much appreciate it if apparmor could behave the same in containers as it does on the host, but we understand that there are design problems with this and so most apparmor profiles are now showing some problems...

Closing LXD task as invalid, since as far as LXD is concerned, we are doing the right thing wrt apparmor setup. This is caused by either apparmor misbehaving or the apparmor profile being invalid.

Changed in lxd (Ubuntu):
status: New → Invalid
John Johansen (jjohansen) wrote :

It's true there are a few issues with apparmor profiles being loaded as part of a stack when namespacing is involved. However, this does not appear to be one of them.

The application may be behaving slightly differently, though, so that the profile needs to be extended. Can you please attach your libvirt profile files

/etc/apparmor.d/libvirt/libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b
/etc/apparmor.d/libvirt/libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b.files

so I can verify their contents. The likely fix is going to be expanding the profile to include access to
  /dev/pts/ptmx rw,

but I still need to verify something else isn't going on, and determine the best location to update.

ChristianEhrhardt (paelzer) wrote :

Thanks Stephane for outlining the likely related timeline of changes.
Thanks John for picking that up, let me search the profiles for you.

Only when writing that up I realized that there is a path difference that might as well be the root cause after all - writing it up after the attachments.

ChristianEhrhardt (paelzer) wrote :

Now, the abstraction used in this case via:
#include <abstractions/libvirt-qemu>

has contained the following rule for ages, just for this use:
/dev/ptmx rw,

Please note the difference, since the denial is on:
/dev/pts/ptmx

That is especially noteworthy since the former is just a link to the latter:
$ ll /dev/ptmx
lrwxrwxrwx 1 root root 13 Apr 20 17:19 /dev/ptmx -> /dev/pts/ptmx

So inside the container apparmor now resolves the path to be checked to "/dev/pts/ptmx".
Maybe it always did and it just didn't matter before profile stacking, but now it does.

Eventually we might just add /dev/pts/ptmx to the profile, but I would like to understand why it ends up checking that path. It could after all be an LXD issue (not saying that it has to be fixed there). It seems LXD binds these as:
'/dev/pts/ptmx'->'/dev/ptmx'
At least that is what most search hits on the two paths showed me, as in bug 1507959.

That could be the reason why, in this kvm-in-lxd case, the path is no longer checked by apparmor as /dev/ptmx (which is allowed) but as /dev/pts/ptmx instead.

Is this something to be addressed in LXD, in apparmor, or with just one more line in the libvirt profile? I'm not sure.
Setting LXD to New again to get Stephane's expertise on that ptmx mapping.

Changed in lxd (Ubuntu):
status: Invalid → New
John Johansen (jjohansen) wrote :

Hey Christian,

thanks for the profiles, I haven't had a chance to dig into them yet, but after a quick first pass they look as expected.

So, very interesting. First up, apparmor has always done mediation after symlink resolution; this is not new with stacking. What is new with stacking is that we are now loading policy within the container and applying it, and that can and will expose several things done to set up the container. Specifically, you now have two profiles being enforced: the lxd container profile (which was being enforced before) and now system profiles from within the container, in this case the libvirt profile. The libvirt profile within the container should work the same as when used on the host, modulo any container setup that leaks through. This is generally around mounts and namespacing.

The bind mount done in bug 1507959 will manifest itself differently than the symlink. Generally speaking, bind mounts act just like a file at the location where they are bound (name resolution follows them, unlike a symlink), but they require a mount rule to set them up.

With LXD doing a bind mount to /dev/ptmx, it's odd that you are seeing it as a symlink. I am going to do some investigation and see if I can replicate it.

ChristianEhrhardt (paelzer) wrote :

Thanks John,
here is some extra info on the ptmx paths.

Host:
$ ls -laF /dev/ptmx /dev/pts/ptmx
crw-rw-rw- 1 root root 5, 2 Apr 21 2017 /dev/ptmx
c--------- 1 root root 5, 2 Apr 12 17:36 /dev/pts/ptmx

Container:
$ lxc exec testkvm-xenial-from -- ls -laF /dev/ptmx /dev/pts/ptmx
lrwxrwxrwx 1 root root 13 Apr 20 17:19 /dev/ptmx -> /dev/pts/ptmx
crw-rw-rw- 1 root root 5, 2 Apr 20 17:19 /dev/pts/ptmx

That plus your explanation on "mediation after symlink" explains why we see this.
In the non container case it is NOT a symlink, it will open /dev/ptmx and that is the path apparmor mediates and things work.
But in the container case it is a symlink, so it is resolved before mediation and the new path in /dev/pts/ptmx is blocked by the profile.
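
The difference can be mimicked in user space with a small Python sketch. It builds two throwaway directory trees that stand in for /dev on the host and in the container, resolves "ptmx" in each, and compares the result with the single rule path from the abstraction. This only imitates the path resolution; it is not how the kernel's apparmor code works internally, and the helper names and the ALLOWED set are illustrative assumptions.

```python
import os
import tempfile

# The only ptmx path allowed by abstractions/libvirt-qemu, per this thread.
ALLOWED = {"/dev/ptmx"}


def build_fake_dev(root, symlink):
    """Create root/pts/ptmx and root/ptmx (symlink or plain file)."""
    os.makedirs(os.path.join(root, "pts"))
    open(os.path.join(root, "pts", "ptmx"), "w").close()
    ptmx = os.path.join(root, "ptmx")
    if symlink:
        os.symlink("pts/ptmx", ptmx)  # container layout
    else:
        open(ptmx, "w").close()       # host layout: its own node
    return ptmx


def resolved_name(path, root):
    """Resolve symlinks and express the result relative to a fake /dev."""
    real = os.path.realpath(path)
    return "/dev/" + os.path.relpath(real, os.path.realpath(root))


def demo():
    """Return the resolved ptmx names for the host and container layouts."""
    with tempfile.TemporaryDirectory() as host, \
            tempfile.TemporaryDirectory() as container:
        host_name = resolved_name(build_fake_dev(host, False), host)
        cont_name = resolved_name(build_fake_dev(container, True), container)
    return host_name, cont_name


if __name__ == "__main__":
    for label, name in zip(("host", "container"), demo()):
        print(label, name, "allowed" if name in ALLOWED else "denied")
```

In the host layout the resolved name stays /dev/ptmx and matches the rule; in the container layout it becomes /dev/pts/ptmx, which the rule does not cover.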

@Stephane - could/would lxd be able to do that in a way without the symlink but "as in the host"?

Stéphane Graber (stgraber) wrote :

We're looking at changing lxc to show /dev/ptmx as a real file rather than a symlink. This is however not particularly easy because:
 - It can't be a bind-mount from the host (or it will interact with the host's devpts)
 - It can't be a straight mknod (because that's not allowed in unprivileged containers)

So we're looking at re-ordering the liblxc code to set up a bind-mount from /dev/pts/ptmx to /dev/ptmx INSIDE the container, which should work.

That part of the kernel has changed quite a bit, so making sure we don't break things for supported kernels (2.6.32 or higher) is going to be a bit tricky.

Note that there is nothing wrong with /dev/ptmx being a symlink to /dev/pts/ptmx and I'd argue it's actually "more right" than having it be a device node. But since that's not what udev/devtmpfs do, we probably should mimic the host's behavior.

Changed in lxd (Ubuntu):
status: New → Invalid
Changed in lxc (Ubuntu):
status: New → Triaged
importance: Undecided → Wishlist
John Johansen (jjohansen) wrote :

Thanks Stéphane,

@Christian, it looks like adding a rule
  /dev/pts/ptmx rw,

to the profile is necessary for now.

Christian Brauner (cbrauner) wrote :

Hi John,
hi Christian,

Sent a branch to lxc that should fix this issue: https://github.com/lxc/lxc/pull/1519

Changed in lxc (Ubuntu):
status: Triaged → In Progress
Changed in lxc (Ubuntu):
status: In Progress → Fix Committed
assignee: nobody → Christian Brauner (cbrauner)
ChristianEhrhardt (paelzer) wrote :

Thanks Stephane and Christian!

Since we ...

a) have a workaround by manually adding the entry to the apparmor abstraction (or dropping serial if that is an option)
b) know that an explicit serial in the guest profile is not the default
c) know that KVM in LXD is more of a "nice to have" than something everybody uses on a daily basis (driving KVM in LXD already involves a certain amount of tricks, e.g. to the profile)

... I think for now we can wait for the LXC update.
Please update this bug when you know the schedule it might land on.

Setting the kernel task to Invalid, as it turns out not to be involved, and the apparmor task to "Won't Fix": while apparmor is involved, there is no change we want to make there.

Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in apparmor (Ubuntu):
status: New → Won't Fix