cgroup2 not a recognized filesystem type

Bug #1732725 reported by Christian Brauner on 2017-11-16
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
AppArmor
Undecided
Unassigned

Bug Description

Hey,

I think that AppArmor currently does not know about cgroup2 as a valid filesystem type. At least I see:

[ 9871.850641] audit: type=1400 audit(1510844560.570:94): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-artful_</var/lib/lxd>" name="/sys/fs/cgroup/unified/" pid=13442 comm="systemd" fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"

which seems to indicate it. If so then AppArmor should probably learn about cgroup2. :)

Christian

Seth Arnold (seth-arnold) wrote :

Hi Christian, can you include the mount rules from the lxd-artful_</var/lib/lxd> profile? It'd also be superb to get the mount syscall (mount or mount2 perhaps?) that systemd tried to execute.

Thanks

Christian Brauner (cbrauner) wrote :
Download full text (3.6 KiB)

Hi Seth,

Sure, no problem. Here are the mount rules for start:

 # ignore DENIED message on / remount
  deny mount options=(ro, remount) -> /,
  deny mount options=(ro, remount, silent) -> /,

  # allow tmpfs mounts everywhere
  mount fstype=tmpfs,

  # allow hugetlbfs mounts everywhere
  mount fstype=hugetlbfs,

  # allow mqueue mounts everywhere
  mount fstype=mqueue,

  # allow fuse mounts everywhere
  mount fstype=fuse,
  mount fstype=fuse.*,

  # deny writes in /proc/sys/fs but allow binfmt_misc to be mounted
  mount fstype=binfmt_misc -> /proc/sys/fs/binfmt_misc/,

  # allow efivars to be mounted, writing to it will be blocked though
  mount fstype=efivarfs -> /sys/firmware/efi/efivars/,

  mount fstype=fusectl -> /sys/fs/fuse/connections/,
  mount fstype=securityfs -> /sys/kernel/security/,
  mount fstype=debugfs -> /sys/kernel/debug/,
  deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
  mount fstype=proc -> /proc/,
  mount fstype=sysfs -> /sys/,
  mount options=(rw, nosuid, nodev, noexec, remount) -> /sys/,

  # note, /sys/kernel/security/** handled below
  mount options=(move) /sys/fs/cgroup/cgmanager/ -> /sys/fs/cgroup/cgmanager.lower/,
  mount options=(ro, nosuid, nodev, noexec, remount, strictatime) -> /sys/fs/cgroup/,

  # allow paths to be made slave, shared, private or unbindable
  # FIXME: This currently doesn't work due to the apparmor parser treating those as allowing all mounts.
# mount options=(rw,make-slave) -> **,
# mount options=(rw,make-rslave) -> **,
# mount options=(rw,make-shared) -> **,
# mount options=(rw,make-rshared) -> **,
# mount options=(rw,make-private) -> **,
# mount options=(rw,make-rprivate) -> **,
# mount options=(rw,make-unbindable) -> **,
# mount options=(rw,make-runbindable) -> **,

  # allow bind-mounts of anything except /proc, /sys and /dev
  mount options=(rw,bind) /[^spd]*{,/**},
  mount options=(rw,bind) /d[^e]*{,/**},
  mount options=(rw,bind) /de[^v]*{,/**},
  mount options=(rw,bind) /dev/.[^l]*{,/**},
  mount options=(rw,bind) /dev/.l[^x]*{,/**},
  mount options=(rw,bind) /dev/.lx[^c]*{,/**},
  mount options=(rw,bind) /dev/.lxc?*{,/**},
  mount options=(rw,bind) /dev/[^.]*{,/**},
  mount options=(rw,bind) /dev?*{,/**},
  mount options=(rw,bind) /p[^r]*{,/**},
  mount options=(rw,bind) /pr[^o]*{,/**},
  mount options=(rw,bind) /pro[^c]*{,/**},
  mount options=(rw,bind) /proc?*{,/**},
  mount options=(rw,bind) /s[^y]*{,/**},
  mount options=(rw,bind) /sy[^s]*{,/**},
  mount options=(rw,bind) /sys?*{,/**},

  # allow moving mounts except for /proc, /sys and /dev
  mount options=(rw,move) /[^spd]*{,/**},
  mount options=(rw,move) /d[^e]*{,/**},
  mount options=(rw,move) /de[^v]*{,/**},
  mount options=(rw,move) /dev/.[^l]*{,/**},
  mount options=(rw,move) /dev/.l[^x]*{,/**},
  mount options=(rw,move) /dev/.lx[^c]*{,/**},
  mount options=(rw,move) /dev/.lxc?*{,/**},
  mount options=(rw,move) /dev/[^.]*{,/**},
  mount options=(rw,move) /dev?*{,/**},
  mount options=(rw,move) /p[^r]*{,/**},
  mount options=(rw,move) /pr[^o]*{,/**},
  mount options=(rw,move) /pro[^c]*{,/**},
  mount options=(rw,move) /proc?*{,/**},
  mount options=(rw,move) /s[^y]*{,/**},
  mount options=(rw,move...

Read more...

Christian Brauner (cbrauner) wrote :

This is how we mount it manually for the container when it has dropped CAP_SYS_ADMIN:

mount("cgroup", "/usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/unified", "cgroup2", MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_RELATIME, NULL) = 0

And this is how systemd mounts by itself in the container with the appropriate AppArmor deny:
mount("cgroup", "/sys/fs/cgroup/unified", "cgroup2", MS_NOSUID|MS_NODEV|MS_NOEXEC, NULL) = -1 EACCES (Permission denied)

Christian Brauner (cbrauner) wrote :

This is blocking running LXD on cgroup2 only systems. Any update on this?

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers