Comment 5 for bug 1576341

Martin Pitt (pitti) wrote : Re: fails in lxd container

These four units belong to the systemd package itself:

> dev-hugepages.mount loaded failed failed Huge Pages File System
> systemd-journald-audit.socket loaded failed failed Journal Audit Socket

These units try to avoid starting in containers with fewer privileges by using ConditionCapability=CAP_SYS_ADMIN and ConditionCapability=CAP_AUDIT_READ, respectively. This works in nspawn, but LXD unprivileged containers appear to claim all of these capabilities:

Capabilities for `1': = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_syslog,cap_wake_alarm,cap_block_suspend,37+ep

This is misleading. Can we start containers with only the capabilities that are actually namespace-aware and available to the container, and hide the rest?
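To make the check concrete, here is a minimal Python sketch of how a capability test against PID 1 can be reproduced by hand. It assumes the documented behaviour that ConditionCapability= looks at the service manager's capability bounding set (the CapBnd field in /proc/1/status); the helper names are made up for illustration, and the capability numbers are the ones from linux/capability.h.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_SYS_ADMIN = 21
CAP_AUDIT_READ = 37


def parse_cap_field(status_text, field="CapBnd"):
    """Return the capability mask stored in one field of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            # The value is a hexadecimal bitmask, e.g. "0000003fffffffff".
            return int(line.split(":", 1)[1].strip(), 16)
    raise KeyError(field)


def has_cap(mask, cap):
    """True if bit number `cap` is set in the capability mask."""
    return bool((mask >> cap) & 1)


def pid1_has_cap(cap, status_path="/proc/1/status"):
    """Hypothetical helper: check the bounding set of PID 1 for one capability."""
    with open(status_path) as f:
        return has_cap(parse_cap_field(f.read()), cap)
```

In a container that reports the full set shown above, pid1_has_cap(CAP_SYS_ADMIN) and pid1_has_cap(CAP_AUDIT_READ) would both be expected to return True, which is why the conditions do not skip the units.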

> systemd-sysctl.service loaded failed failed Apply Kernel Variables

This is supposed to be skipped because of ConditionPathIsReadWrite=/proc/sys/, but it tries to start anyway. With debug logging I get:

  systemd-sysctl.service: ConditionPathIsReadWrite=/proc/sys/ succeeded.

This is wrong as both "touch /proc/sys/foo" and "test -w /proc/sys" fail. I'll look into this.
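One way to narrow this down is to compare two different notions of "writable" for the same path. The Python sketch below is only an illustration: the function names are invented, and whether this difference explains the result above is an assumption, not something established in this comment. The systemd.unit documentation describes ConditionPathIsReadWrite= as verifying that the underlying file system is not mounted read-only, which is a different question from whether the calling user may write to the path.

```python
import os


def mount_is_read_write(path):
    """True if the file system holding `path` is not mounted read-only."""
    return not (os.statvfs(path).f_flag & os.ST_RDONLY)


def caller_can_write(path):
    """True if the calling process has write permission on `path`.

    This is the same kind of permission check that "test -w" performs.
    """
    return os.access(path, os.W_OK)


def describe(path):
    """Return both answers so a mismatch is easy to spot."""
    return mount_is_read_write(path), caller_can_write(path)
```

If describe("/proc/sys") inside the container were to return (True, False), the mount would be read-write while the permission check fails, which would match the behaviour seen here.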

> systemd-remount-fs.service loaded failed failed Remount Root and Kernel File Systems

This has "ConditionPathExists=/etc/fstab", but that is true for lxd containers because they have a dummy /etc/fstab with no entries, just a comment (so ConditionFileNotEmpty= would not work either). Checking for the CAP_SYS_ADMIN capability, which is required for mounting, would be appropriate, but that would not work because of the capability issue above.

This service does succeed in a container without apparmor restrictions (--config raw.lxc=lxc.aa_profile=unconfined).

Adding ConditionPathIsReadWrite=!/ may be the simplest and most straightforward solution here.