There is a lot of difficulty once you get into multiple applications that could be scheduled together; in the long run this looks quite complicated. Besides AZ-aware scheduling, affinity/anti-affinity, and constraints, there is also device-aware scheduling (CPU model, PCIe devices) if container profiles contain passed-through devices. Tracking available vs. allocated PCIe devices is another concern. I think it all leads to the need for a proper scheduler for containers, but that is a large feature considering the support needed for different providers.

So far we have mostly been placing control-plane components into containers, which do not require anything to be passed through. Placing hypervisor components into LXD brings device passthrough into the mix, but probably on a "one unit per node gets all devices" basis, so no usage tracking needs to happen in the base case; we would then let Nova handle allocation in the case of OpenStack (see the profile sketch at the end of this message).

For storage components we often use dm-crypt + LVM, which is problematic in unprivileged containers. There is also the problem that curtin-generated udev rules for creating /dev/disk/by-dname/ symlinks have to be copied into containers for us to use persistent block device names:

https://lists.linuxcontainers.org/pipermail/lxc-users/2014-January/006076.html
https://lists.linuxcontainers.org/pipermail/lxc-users/2014-January/006078.html

To sum up: editing bundles is currently a pain, as everything is hard-wired to machine numbers dynamically assigned to AZs, and we mostly care about "simple" containers as of today. By the looks of it:

1) machine declarations in bundles with specific AZs would allow us to remove the dynamic part in initial bundles;
2) --to zone: for units would allow bare-metal-only components to be placed by zone in addition to using machine declarations;
3) --to lxd:zone:, plus evenly spreading units of the same application, would allow us to avoid hard-wired placements in many cases.

A bundle sketch illustrating 1) through 3) follows below.

On Mon, Nov 5, 2018, 05:55 John Meinel,
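
As a rough illustration of the "one unit per node gets all devices" idea for hypervisor components, something like the LXD profile below could be attached to the one container on a node that runs the hypervisor unit. The profile name and device list are made up (a nova-compute-style unit would at least want /dev/kvm and /dev/vhost-net; the real set depends on the charm), and it is ordinary profile YAML that could be applied with "lxc profile edit":

    # hypothetical profile: exposes the node's hypervisor devices to the
    # single container that gets them all, so no per-device usage tracking
    name: nova-compute-devices
    config: {}
    devices:
      kvm:                    # character device for KVM acceleration
        type: unix-char
        path: /dev/kvm
      vhost-net:              # in-kernel virtio-net backend
        type: unix-char
        path: /dev/vhost-net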
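
And a minimal bundle sketch of 1) through 3), just to make the comparison concrete. The charm, zone, and machine names are made up; the zones constraint on machine declarations is existing syntax, while the zone-based placement directives in the second variant are the syntax proposed in this thread, not something bundles accept today:

    # 1) machines pinned to specific AZs via the zones constraint,
    #    so the initial bundle no longer depends on dynamic assignment:
    machines:
      "0": {constraints: zones=az1}
      "1": {constraints: zones=az2}
      "2": {constraints: zones=az3}
    applications:
      keystone:                           # example control-plane app
        charm: cs:keystone
        num_units: 3
        to: ["lxd:0", "lxd:1", "lxd:2"]   # still hard-wired to machine numbers

    # 2) + 3) as proposed: place by zone instead of machine number and
    #    spread units of the same application evenly (hypothetical syntax):
    applications:
      keystone:
        charm: cs:keystone
        num_units: 3
        to: ["lxd:zone:az1", "lxd:zone:az2", "lxd:zone:az3"]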