maas install fails inside of a 16.04 lxd container due to avahi problems
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Undecided | Unassigned |
avahi (Ubuntu) | Fix Released | High | Unassigned |
avahi (Ubuntu Trusty) | Fix Released | Medium | Trent Lloyd |
avahi (Ubuntu Xenial) | Fix Released | Medium | Trent Lloyd |
avahi (Ubuntu Artful) | Fix Released | Medium | Trent Lloyd |
lxd (Ubuntu) | Invalid | Undecided | Unassigned |
lxd (Ubuntu Trusty) | Invalid | Undecided | Unassigned |
lxd (Ubuntu Xenial) | Invalid | Undecided | Unassigned |
lxd (Ubuntu Artful) | Invalid | Undecided | Unassigned |
Bug Description
[Original Description]
The bug, and workaround, are clearly described in this mailing list thread:
https:/
I'm trying to install MAAS in a LXD container, but that's failing due to avahi package install problems. I'm tagging all packages here.
[Issue]
Avahi sets a number of rlimits on startup, including the maximum number of processes (nproc=2) and limits on memory usage. These limits are hit in a number of cases. The process limit is hit if you run lxd containers in 'privileged' mode, because avahi then has the same uid in multiple containers. The memory limit can be triggered on large networks.
The fix is to remove these default rlimits completely from the configuration file.
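As a rough illustration of that change, the sketch below strips the `rlimit-*` settings from the `[rlimits]` section of a config text. It is not the packaged patch. The sample input is made up for this example: the key names follow avahi-daemon.conf, but the values are simply the ones quoted in this report, and the helper name `drop_rlimits` is hypothetical.

```python
# Illustrative input only: values are the ones quoted in this report
# (nproc=2, 4MB data limit), not copied from a real avahi-daemon.conf.
SAMPLE_CONF = """\
[server]
use-ipv4=yes

[rlimits]
rlimit-data=4194304
rlimit-nproc=2
"""


def drop_rlimits(conf_text: str) -> str:
    """Return conf_text with every rlimit-* line in [rlimits] removed.

    Hypothetical helper: all other lines, including comments and the
    section headers themselves, are passed through unchanged.
    """
    kept = []
    in_rlimits = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section we are in.
            in_rlimits = stripped == "[rlimits]"
        elif in_rlimits and stripped.startswith("rlimit-"):
            # Drop the limit so the daemon no longer applies it.
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    print(drop_rlimits(SAMPLE_CONF), end="")
```

A locally modified config file would not be touched by a package update in the same way, which is why the regression notes below single out that case.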
[Impact]
* Avahi is unable to start inside of containers without UID namespace isolation because an rlimit on the maximum number of processes is set by default to 2. When a container launches Avahi, the total number of processes on the system in all containers exceeds this limit and Avahi is killed. It also fails at install time, rather than runtime due to a failure to start the service.
* Some users also have issues with the maximum memory allocation causing Avahi to exit on networks with a large number of services, as the memory limit was quite small (4MB). Refer to LP #1638345.
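The process-limit failure can be modelled in a few lines. This is a simplified sketch of how an nproc rlimit is accounted per kernel uid rather than per container. It ignores details such as privileged processes bypassing the limit. The uid values and the function name `can_create_process` are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable


def can_create_process(running_uids: Iterable[int], uid: int, nproc_limit: int) -> bool:
    """Simplified nproc check (hypothetical helper).

    Creation is refused once the number of processes already owned by
    `uid` anywhere on the kernel has reached `nproc_limit`.
    """
    owned = Counter(running_uids)[uid]
    return owned < nproc_limit


AVAHI_UID = 111   # hypothetical uid of the avahi user
NPROC_LIMIT = 2   # the default limit as described in this report

# Two processes already run under the avahi uid, for example in other
# containers that share the same uid namespace.
already_running = [AVAHI_UID, AVAHI_UID]

if __name__ == "__main__":
    # Same kernel uid: the limit is already reached.
    print(can_create_process(already_running, AVAHI_UID, NPROC_LIMIT))
    # A different kernel uid is counted separately.
    print(can_create_process(already_running, AVAHI_UID + 100000, NPROC_LIMIT))
```

The point of the model is that the count is taken over every process with that uid on the host kernel, so containers without uid isolation compete for the same small budget.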
[Test Case]
* setup lxd (apt install lxd, lxd init, get working networking)
* lxc launch ubuntu:16.04 avahi-test --config security.
* lxc exec avahi-test sudo apt install avahi-daemon
This will fail if the parent host has avahi-daemon installed. If it does not, you can set up a second container (avahi-test2) and install avahi there. That install should then fail, as the issue requires 2 copies of avahi-daemon in the same uid namespace.
[Regression Potential]
* The fix removes all rlimits configured by avahi on startup. Setting them is an extra step avahi takes that most programs do not take (limiting memory usage, running process count, etc). It's possible that an unknown bug, previously hidden by Avahi crashing instead, now consumes significant system resources because the limit is no longer in place. However, I believe this risk is significantly reduced, as this change has been shipping upstream for many months and I have not seen any reports of new problems, while it has fixed a number of existing crashes/problems.
* The main case where this may not fix the issue is for users who have modified their avahi-daemon.conf file. It will fix new installs and most existing installs, as most users don't modify the file, and users may be prompted on upgrade to replace the file.
[Other Info]
* This change already exists upstream in 0.7, which is in bionic. SRU required for artful, xenial, trusty.
tags: | added: maas-at-home |
Changed in avahi (Ubuntu): | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Trent Lloyd (lathiat) |
description: | updated |
description: | updated |
Changed in avahi (Ubuntu Trusty): | |
assignee: | nobody → Trent Lloyd (lathiat) |
Changed in avahi (Ubuntu Xenial): | |
assignee: | nobody → Trent Lloyd (lathiat) |
Changed in avahi (Ubuntu Artful): | |
assignee: | nobody → Trent Lloyd (lathiat) |
Changed in avahi (Ubuntu Trusty): | |
importance: | Undecided → Medium |
Changed in avahi (Ubuntu Xenial): | |
importance: | Undecided → Medium |
Changed in avahi (Ubuntu Artful): | |
importance: | Undecided → Medium |
Changed in lxd (Ubuntu Trusty): | |
status: | New → Invalid |
Changed in lxd (Ubuntu Xenial): | |
status: | New → Invalid |
Changed in lxd (Ubuntu Artful): | |
status: | New → Invalid |
Changed in avahi (Ubuntu Trusty): | |
status: | New → In Progress |
Changed in avahi (Ubuntu Xenial): | |
status: | New → In Progress |
Changed in avahi (Ubuntu Artful): | |
status: | New → In Progress |
Changed in avahi (Ubuntu): | |
status: | In Progress → Fix Released |
assignee: | Trent Lloyd (lathiat) → nobody |
tags: | added: verification-done-artful verification-done-trusty verification-done-xenial; removed: verification-needed-artful verification-needed-trusty verification-needed-xenial |
Avahi is setting some rather strict rlimits which affect everything which uses that kernel uid, crossing container boundaries and so breaking containers.
Unfortunately MAAS requires a privileged container right now, so you can't resort to uid mapping to avoid this problem. At the LXD level, all we can do to avoid this problem is to allow you to have one distinct id map per container, which we already support. But that's only going to work for unprivileged containers.
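To make the id map point concrete, here is a minimal sketch of how a container uid could translate to a kernel uid. It assumes a simple offset-based map. The base offsets, the avahi uid, and the function name `kernel_uid` are hypothetical values chosen for illustration, not LXD defaults.

```python
from typing import Optional


def kernel_uid(container_uid: int, idmap_base: Optional[int]) -> int:
    """Translate a uid inside a container to the uid the kernel sees.

    Hypothetical model: a privileged container has no id map
    (idmap_base is None), so uids pass through unchanged. An
    unprivileged container shifts every uid by its own base offset.
    """
    if idmap_base is None:
        return container_uid
    return idmap_base + container_uid


AVAHI_UID = 111  # hypothetical uid of the avahi user inside each container

if __name__ == "__main__":
    # Two privileged containers: avahi ends up with the same kernel uid.
    print(kernel_uid(AVAHI_UID, None), kernel_uid(AVAHI_UID, None))
    # Two unprivileged containers with distinct, non-overlapping bases:
    # avahi gets a different kernel uid in each.
    print(kernel_uid(AVAHI_UID, 100000), kernel_uid(AVAHI_UID, 200000))
```

With distinct maps, a per-uid limit set in one container no longer counts processes from another, which is why this only helps unprivileged containers.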
One fix could be to tweak our avahi to relax those rlimits or, if they are not that useful, remove them entirely, as this is a rather frequent pain point and I'm not sure of the benefit of those rlimits in the first place.
Another fix would be to not have MAAS depend on avahi and let you install and run it without avahi, which is effectively what Brian's instructions do (as they disable avahi-daemon in the container).
Marking the LXD task Invalid, as we're already doing all we can in this regard by supporting non-overlapping id maps for unprivileged containers.