udev interface fails in privileged containers

Bug #1712808 reported by Colin Watson on 2017-08-24
This bug affects 5 people
Affects       Importance  Assigned to
snapd         Medium      Unassigned
lxd (Ubuntu)  Undecided   Unassigned

Bug Description

I think this is possibly a known issue since there's evidence of a workaround in e.g. https://stgraber.org/2017/01/13/kubernetes-inside-lxd/, but I couldn't find any proper discussion of it.

Installing snaps in a privileged LXD container fails. Here's a test script:

  $ lxc launch -c security.privileged=true ubuntu:16.04 snap-test
  $ lxc exec snap-test apt update
  $ lxc exec snap-test apt install squashfuse
  $ lxc exec snap-test snap install hello-world
  2017-08-24T12:03:59Z INFO cannot auto connect core:core-support-plug to core:core-support: (slot auto-connection), existing connection state "core:core-support-plug core:core-support" in the way
  error: cannot perform the following tasks:
  - Setup snap "core" (2462) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
  udev output:
  )
  - Setup snap "core" (2462) security profiles (cannot reload udev rules: exit status 2
  udev output:
  )

This is because /sys is mounted read-only in privileged containers (presumably to avoid causing havoc on the host), and so the systemd-udevd service isn't started. The prevailing recommendation seems to be to work around it by making /usr/local/bin/udevadm a symlink to /bin/true, but this looks like a hack rather than a proper fix.
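
For reference, the symlink hack looks roughly like this (a sketch: snapd invokes udevadm via PATH, and /usr/local/bin precedes /bin on the default PATH, so the calls become no-ops):

  $ lxc exec snap-test -- findmnt -no OPTIONS /sys   # confirm /sys is mounted read-only
  $ lxc exec snap-test -- ln -s /bin/true /usr/local/bin/udevadm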

Colin Watson (cjwatson) wrote :

On IRC, Stéphane suggested making the container "even more privileged" as a cleaner workaround, by adding the following to raw.lxc:

  lxc.mount.auto=
  lxc.mount.auto=proc:rw sys:rw
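
(Applying that to an existing container would look something like the sketch below, reusing the snap-test container from the test script above and bash's $'...' quoting to get both lines into the one raw.lxc value; the container needs a restart to pick it up:)

  $ lxc config set snap-test raw.lxc $'lxc.mount.auto=\nlxc.mount.auto=proc:rw sys:rw'
  $ lxc restart snap-test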

(I also had to fiddle with my restrictive policy-rc.d script to allow udev to start.)
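
(A minimal policy-rc.d along those lines might look like the following hypothetical sketch, not Colin's actual script; by Debian convention, exit status 101 tells invoke-rc.d the action is forbidden by policy, and 0 allows it:)

  #!/bin/sh
  # Hypothetical /usr/sbin/policy-rc.d: allow only udev to be started.
  # $1 is the init script id; exit 0 permits the action, 101 forbids it.
  case "$1" in
      udev) exit 0 ;;
      *) exit 101 ;;
  esac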

Perhaps documenting that somewhere reasonably findable would be good enough?

Zygmunt Krynicki (zyga) wrote :

I'm not quite sure what the difference is between regular and privileged (or "even more privileged") containers, but the last time we looked at similar issues we came to the conclusion that any container in which apparmor is not stacked but instead shared directly with the host is unsupportable for us. I'm not sure if this is the same problem again; I haven't tried to reproduce it yet.

Colin Watson (cjwatson) wrote :

The "even more privileged" workarounds have been working in launchpad-buildd for a while now. We can't use unprivileged containers for various reasons, for example because one of the categories of builds that needs to install snaps sometimes is live filesystem builds, and those do various things like mknod that'll never work in unprivileged containers.

Of course, launchpad-buildd is somewhat special in that it typically only runs a single build before shutting down the VM, so I can imagine that there might be some isolation failures that are a problem in general but that don't affect us in practice. Please don't outright forbid privileged containers though, as we don't really have a good alternative.

Michael Vogt (mvo) on 2018-01-09
Changed in snapd:
status: New → Triaged
importance: Undecided → Medium
Zygmunt Krynicki (zyga) wrote :

I'm wondering what we can do about it.

When we're not running in an unprivileged container, anything that we do inside (tweaking cgroups, tweaking apparmor) will contaminate the host. If the host also uses snaps, those definitions will conflict and collide.

I see two options:

1) Close as WONTFIX, since in reality this cannot work very well
2) Make it so that Launchpad doesn't have to do hacks ... somehow, and ignore the contamination

I'm not sure what 2) would even look like. Should we ignore errors? Even if we do, snaps may fail at runtime, depending on what they do.

Could launchpad spawn a VM instead of a container for this? (I know it's far heavier)

Changed in snapd:
status: Triaged → Incomplete
Colin Watson (cjwatson) wrote :

I filed this bug because it seems ugly, but it does at least work with our current hacks, so closing this as Won't Fix would be better than changing something in a way that makes our hacks not work. :-) If you feel you need to close it then go ahead.

We already run every build in a dedicated VM that's reset at the start of each build (hence why we really don't care whether the container contaminates the host - the host is going to be thrown away anyway). However, those VMs are generic: for instance, they're currently all xenial rather than being for the release we're building for. We use the container both to avoid too much in the way of interference from the software that runs the builder itself and to arrange for the build to be running on the appropriate version of Ubuntu. Using another VM here would both be more complicated/expensive to set up and either slower to run or entirely non-functional due to requiring nested virtualisation. So no, we can't reasonably switch to a VM rather than a container.

Launchpad Janitor (janitor) wrote :

[Expired for snapd because there has been no activity for 60 days.]

Changed in snapd:
status: Incomplete → Expired
Anthony Fok (foka) on 2018-04-05
Changed in snapd:
status: Expired → Confirmed

This will come up again, and more frequently, now that the LXD package upgrade performs the deb->snap transition even when it is itself running in a container.

Like Colin, I run privileged containers a lot (and others might as well), using those extra privileges: http://paste.ubuntu.com/p/bcVHRBTKyP/

I never hit this issue myself, as I hadn't tried snap-in-LXD on my own, but the new package transition will trigger it.

Because of that, the severity of this case increases a bit.

[...]
Preparing to unpack .../16-apache2-utils_2.4.34-1ubuntu2_amd64.deb ...
Unpacking apache2-utils (2.4.34-1ubuntu2) over (2.4.34-1ubuntu1) ...
Preparing to unpack .../17-lxd-client_1%3a0.4_all.deb ...
Unpacking lxd-client (1:0.4) over (3.0.2-0ubuntu3) ...
Setting up apparmor (2.12-4ubuntu8) ...
Installing new version of config file /etc/apparmor.d/abstractions/private-files ...
Installing new version of config file /etc/apparmor.d/abstractions/private-files-strict ...
Installing new version of config file /etc/apparmor.d/abstractions/ubuntu-browsers.d/user-files ...
Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
Setting up squashfs-tools (1:4.3-6ubuntu2) ...
Setting up libapparmor1:amd64 (2.12-4ubuntu8) ...
Setting up systemd (239-7ubuntu10) ...
Setting up udev (239-7ubuntu10) ...
update-initramfs: deferring update (trigger activated)
Setting up snapd (2.35.5+18.10) ...
snapd.failure.service is a disabled or a static unit, not starting it.
snapd.snap-repair.service is a disabled or a static unit, not starting it.
(Reading database ... 66334 files and directories currently installed.)
Preparing to unpack .../00-lxd_1%3a0.4_all.deb ...
Warning: Stopping lxd.service, but it can still be activated by:
  lxd.socket
=> Installing the LXD snap
==> Checking connectivity with the snap store
==> Installing the LXD snap from the latest track for ubuntu-18.10
error: cannot perform the following tasks:
- Setup snap "core" (5548) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (5548) security profiles (cannot reload udev rules: exit status 2
udev output:
)
dpkg: error processing archive /tmp/apt-dpkg-install-R4N7rz/00-lxd_1%3a0.4_all.deb (--unpack):
 new lxd package pre-installation script subprocess returned error exit status 1
Preparing to unpack .../01-open-iscsi_2.0.874-5ubuntu9_amd64.deb ...
[...]

Interestingly, a subsequent
$ apt --fix-broken install
does fix it up.

Might there be an ordering issue in the snap/lxd updates that is not an issue for "real" Bionic->Cosmic upgraders?

(Reading database ... 66334 files and directories currently installed.)
Preparing to unpack .../archives/lxd_1%3a0.4_all.deb ...
Warning: Stopping lxd.service, but it can still be activated by:
  lxd.socket
=> Installing the LXD snap
==> Checking connectivity with the snap store
==> Installing the LXD snap from the latest track for ubuntu-18.10
2018-10-16T08:16:38Z INFO Waiting for restart...
lxd 3.6 from Canonical✓ installed
Channel stable/ubuntu-18.10 for lxd is closed; temporarily forwarding to stable.
==> Cleaning up leftovers
Synchronizing state of lxd.service with SysV service script with /lib/systemd/systemd-sysv-...


Stuart Bishop (stub) wrote :

I just hit this in a 16.04 container but, for reasons I don't understand, installing the core snap first worked around the problem:

$ sudo snap install go --classic
error: cannot perform the following tasks:
- Setup snap "core" (5662) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (5662) security profiles (cannot reload udev rules: exit status 2
udev output:
)

$ sudo snap install core
core 16-2.35.4 from 'canonical' installed

$ sudo snap install go --classic
go 1.11.1 from Michael Hudson-Doyle (mwhudson) installed

Stéphane Graber (stgraber) wrote :

Yeah, we've seen that re-running the command usually gets you past the error, so in your case, just running "snap install go --classic" a second time would likely have been enough.

Actually, to get this working I only needed to use this:

# Mount cgroup in rw to get snaps working
lxc.mount.auto=cgroup:rw

There's no need to have the whole of /sys and /proc mounted rw (the problem is due to the snap trying to chown the `/sys/fs/cgroup/freezer/snap.*` directories). However, I'm wondering whether there's a better way to do this inside the container itself, since this way I'd guess two containers sharing the host would run into trouble, wouldn't they?

Stéphane Graber (stgraber) wrote :

Hmm, cgroup:rw has absolutely nothing to do with this.
LXD uses a cgroup namespace by default, which completely ignores that particular setting.

With the cgroup namespace, root in the container is allowed to do anything it wants to the /sys/fs/cgroup tree.

root@disco:~# mkdir /sys/fs/cgroup/freezer/snap.blah
root@disco:~# chown 1000:1000 /sys/fs/cgroup/freezer/snap.blah

The error also quite clearly comes from udev rather than anything cgroup related:

root@disco:~# snap install hello-world
error: cannot perform the following tasks:
- Setup snap "core" (6531) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (6531) security profiles (cannot reload udev rules: exit status 2
udev output:
)
root@disco:~# snap install hello-world
2019-03-27T20:18:56Z INFO Waiting for restart...
hello-world 6.3 from Canonical✓ installed
root@disco:~#

I was not doing this in LXD but in an unprivileged LXC container (not sure whether that changes things) that I have on my QNAP NAS; without it I wasn't able to use snap at all.

I guess it reduces security, but in the end I'm still protected by the container itself.

Stéphane Graber (stgraber) wrote :

Yeah, unprivileged LXC is likely to work pretty differently in the way it handles both cgroups and apparmor namespacing, both of which are very relevant when you want to run snaps.
