cgroup2 not a recognized filesystem type

Bug #1732725 reported by Christian Brauner
This bug affects 3 people
Affects    Status  Importance  Assigned to  Milestone
AppArmor   New     Undecided   Unassigned

Bug Description

Hey,

I think that AppArmor currently does not know about cgroup2 as a valid filesystem type. At least I see:

[ 9871.850641] audit: type=1400 audit(1510844560.570:94): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-artful_</var/lib/lxd>" name="/sys/fs/cgroup/unified/" pid=13442 comm="systemd" fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"

which seems to indicate it. If so then AppArmor should probably learn about cgroup2. :)

Christian
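
For readers triaging similar denials, the audit record above can be split into its key=value fields to make the relevant parts (filesystem type, mount target, flags) easier to read. This is only a minimal sketch: `parse_audit_fields` is a hypothetical helper, not part of any AppArmor tooling, and the record is the one quoted in the description with the kernel timestamp prefix dropped.

```python
import shlex

# Audit record copied from the description (kernel timestamp prefix removed).
RECORD = (
    'audit: type=1400 audit(1510844560.570:94): apparmor="DENIED" '
    'operation="mount" info="failed flags match" error=-13 '
    'profile="lxd-artful_</var/lib/lxd>" name="/sys/fs/cgroup/unified/" '
    'pid=13442 comm="systemd" fstype="cgroup2" srcname="cgroup" '
    'flags="rw, nosuid, nodev, noexec"'
)


def parse_audit_fields(record: str) -> dict:
    """Hypothetical helper: collect the key=value tokens of an audit record."""
    fields = {}
    # shlex.split keeps quoted values (including ones with spaces) together
    # and strips the surrounding double quotes.
    for token in shlex.split(record):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens without '=' such as "audit:"
            fields[key] = value
    return fields


fields = parse_audit_fields(RECORD)
flags = [flag.strip() for flag in fields["flags"].split(",")]
print(fields["operation"], fields["fstype"], fields["name"], flags)
```

The `error=-13` field corresponds to errno 13 (EACCES) on Linux, which is consistent with a permission denial reported to the calling process.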

Seth Arnold (seth-arnold) wrote :

Hi Christian, can you include the mount rules from the lxd-artful_</var/lib/lxd> profile? It'd also be superb to get the mount syscall (mount or mount2 perhaps?) that systemd tried to execute.

Thanks

Christian Brauner (cbrauner) wrote :

Hi Seth,

Sure, no problem. Here are the mount rules, for a start:

  # ignore DENIED message on / remount
  deny mount options=(ro, remount) -> /,
  deny mount options=(ro, remount, silent) -> /,

  # allow tmpfs mounts everywhere
  mount fstype=tmpfs,

  # allow hugetlbfs mounts everywhere
  mount fstype=hugetlbfs,

  # allow mqueue mounts everywhere
  mount fstype=mqueue,

  # allow fuse mounts everywhere
  mount fstype=fuse,
  mount fstype=fuse.*,

  # deny writes in /proc/sys/fs but allow binfmt_misc to be mounted
  mount fstype=binfmt_misc -> /proc/sys/fs/binfmt_misc/,

  # allow efivars to be mounted, writing to it will be blocked though
  mount fstype=efivarfs -> /sys/firmware/efi/efivars/,

  mount fstype=fusectl -> /sys/fs/fuse/connections/,
  mount fstype=securityfs -> /sys/kernel/security/,
  mount fstype=debugfs -> /sys/kernel/debug/,
  deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
  mount fstype=proc -> /proc/,
  mount fstype=sysfs -> /sys/,
  mount options=(rw, nosuid, nodev, noexec, remount) -> /sys/,

  # note, /sys/kernel/security/** handled below
  mount options=(move) /sys/fs/cgroup/cgmanager/ -> /sys/fs/cgroup/cgmanager.lower/,
  mount options=(ro, nosuid, nodev, noexec, remount, strictatime) -> /sys/fs/cgroup/,

  # allow paths to be made slave, shared, private or unbindable
  # FIXME: This currently doesn't work due to the apparmor parser treating those as allowing all mounts.
  # mount options=(rw,make-slave) -> **,
  # mount options=(rw,make-rslave) -> **,
  # mount options=(rw,make-shared) -> **,
  # mount options=(rw,make-rshared) -> **,
  # mount options=(rw,make-private) -> **,
  # mount options=(rw,make-rprivate) -> **,
  # mount options=(rw,make-unbindable) -> **,
  # mount options=(rw,make-runbindable) -> **,

  # allow bind-mounts of anything except /proc, /sys and /dev
  mount options=(rw,bind) /[^spd]*{,/**},
  mount options=(rw,bind) /d[^e]*{,/**},
  mount options=(rw,bind) /de[^v]*{,/**},
  mount options=(rw,bind) /dev/.[^l]*{,/**},
  mount options=(rw,bind) /dev/.l[^x]*{,/**},
  mount options=(rw,bind) /dev/.lx[^c]*{,/**},
  mount options=(rw,bind) /dev/.lxc?*{,/**},
  mount options=(rw,bind) /dev/[^.]*{,/**},
  mount options=(rw,bind) /dev?*{,/**},
  mount options=(rw,bind) /p[^r]*{,/**},
  mount options=(rw,bind) /pr[^o]*{,/**},
  mount options=(rw,bind) /pro[^c]*{,/**},
  mount options=(rw,bind) /proc?*{,/**},
  mount options=(rw,bind) /s[^y]*{,/**},
  mount options=(rw,bind) /sy[^s]*{,/**},
  mount options=(rw,bind) /sys?*{,/**},

  # allow moving mounts except for /proc, /sys and /dev
  mount options=(rw,move) /[^spd]*{,/**},
  mount options=(rw,move) /d[^e]*{,/**},
  mount options=(rw,move) /de[^v]*{,/**},
  mount options=(rw,move) /dev/.[^l]*{,/**},
  mount options=(rw,move) /dev/.l[^x]*{,/**},
  mount options=(rw,move) /dev/.lx[^c]*{,/**},
  mount options=(rw,move) /dev/.lxc?*{,/**},
  mount options=(rw,move) /dev/[^.]*{,/**},
  mount options=(rw,move) /dev?*{,/**},
  mount options=(rw,move) /p[^r]*{,/**},
  mount options=(rw,move) /pr[^o]*{,/**},
  mount options=(rw,move) /pro[^c]*{,/**},
  mount options=(rw,move) /proc?*{,/**},
  mount options=(rw,move) /s[^y]*{,/**},
  mount options=(rw,move...
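
As a rough cross-check of the rules visible in this excerpt, the filesystem types they name can be listed and compared against `cgroup2`. This is a sketch under stated assumptions: the rule text is copied from the `fstype=` rules above (the paste is truncated, so later rules are not covered), rules without an `fstype=` condition are ignored, and Python's `fnmatchcase` is used only as an approximation of AppArmor's glob matching. `allowed_fstypes` and `names_fstype` are hypothetical helpers.

```python
import re
from fnmatch import fnmatchcase

# The fstype-based rules visible in the excerpt above (the paste is truncated).
VISIBLE_RULES = """
mount fstype=tmpfs,
mount fstype=hugetlbfs,
mount fstype=mqueue,
mount fstype=fuse,
mount fstype=fuse.*,
mount fstype=binfmt_misc -> /proc/sys/fs/binfmt_misc/,
mount fstype=efivarfs -> /sys/firmware/efi/efivars/,
mount fstype=fusectl -> /sys/fs/fuse/connections/,
mount fstype=securityfs -> /sys/kernel/security/,
mount fstype=debugfs -> /sys/kernel/debug/,
deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
mount fstype=proc -> /proc/,
mount fstype=sysfs -> /sys/,
"""


def allowed_fstypes(rules: str) -> list:
    """Hypothetical helper: fstype patterns named by non-deny mount rules."""
    patterns = []
    for line in rules.splitlines():
        line = line.strip()
        # Skip blanks, comments and deny rules; only allow rules are listed.
        if not line or line.startswith("#") or line.startswith("deny "):
            continue
        match = re.search(r"\bfstype=([^\s,]+)", line)
        if match and match.group(1) not in patterns:
            patterns.append(match.group(1))
    return patterns


def names_fstype(patterns: list, fstype: str) -> bool:
    """Approximate check: does any listed pattern match the fstype?"""
    return any(fnmatchcase(fstype, pattern) for pattern in patterns)


patterns = allowed_fstypes(VISIBLE_RULES)
print(patterns)
print("cgroup2 named by a visible rule:", names_fstype(patterns, "cgroup2"))
```

Under these assumptions, none of the visible `fstype=` rules names `cgroup2`, which fits the denial in the description. It does not show how the AppArmor parser itself evaluates the profile.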


Christian Brauner (cbrauner) wrote :

This is how we mount it manually for the container when it has dropped CAP_SYS_ADMIN:

mount("cgroup", "/usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/unified", "cgroup2", MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_RELATIME, NULL) = 0

And this is how systemd mounts it by itself inside the container, which triggers the AppArmor denial:

mount("cgroup", "/sys/fs/cgroup/unified", "cgroup2", MS_NOSUID|MS_NODEV|MS_NOEXEC, NULL) = -1 EACCES (Permission denied)
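
To make the difference between the two calls explicit, their flag arguments can be compared. The sketch below is illustrative only: the numeric values are the ones defined in `<linux/mount.h>` for these four flags, and `parse_flags` and `to_bitmask` are hypothetical helpers. It does not attempt to explain why one call succeeds and the other is denied.

```python
# Values from <linux/mount.h>, limited to the flags seen in the two calls.
MS_FLAGS = {
    "MS_NOSUID": 2,
    "MS_NODEV": 4,
    "MS_NOEXEC": 8,
    "MS_RELATIME": 1 << 21,
}


def parse_flags(expr: str) -> set:
    """Hypothetical helper: split an strace-style 'A|B|C' flag expression."""
    return {name.strip() for name in expr.split("|") if name.strip()}


def to_bitmask(names) -> int:
    """Hypothetical helper: OR together the numeric values of the flags."""
    mask = 0
    for name in names:
        mask |= MS_FLAGS[name]
    return mask


# Flag arguments copied from the two strace lines above.
lxc_flags = parse_flags("MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_RELATIME")
systemd_flags = parse_flags("MS_NOSUID|MS_NODEV|MS_NOEXEC")

print("only in the manual mount:", sorted(lxc_flags - systemd_flags))
print("only in the systemd mount:", sorted(systemd_flags - lxc_flags))
print("bitmasks:", hex(to_bitmask(lxc_flags)), hex(to_bitmask(systemd_flags)))
```

Apart from the target path, the two calls differ only in `MS_RELATIME`. The three flags in the systemd call line up with the `nosuid, nodev, noexec` entries in the audit record's `flags` field.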

Christian Brauner (cbrauner) wrote :

This is blocking running LXD on cgroup2-only systems. Any update on this?

Stéphane Graber (stgraber) wrote :

Ping
