cgroup v2 is not fully supported yet, proceeding with partial confinement

Bug #1850667 reported by Balint Reczey on 2019-10-30
This bug affects 2 people
Affects             Status        Importance  Assigned to      Milestone
snapd               -             High        Maciej Borzecki  -
docker.io (Ubuntu)  -             Undecided   Unassigned       -
lxc (Ubuntu)        Fix Released  Unknown     -                -
lxcfs (Ubuntu)      -             Undecided   Unassigned       -
lxd (Ubuntu)        -             Undecided   Unassigned       -
snapd (Ubuntu)      -             Undecided   Maciej Borzecki  -

Bug Description

Systemd upstream switched the default cgroup hierarchy to unified with v243. The Ubuntu systemd packages revert this change, but since upstream considers unified the way forward, support should be added to all relevant Ubuntu packages (and snaps):

https://github.com/systemd/systemd/blob/v243/NEWS#L56

        * systemd now defaults to the "unified" cgroup hierarchy setup during
          build-time, i.e. -Ddefault-hierarchy=unified is now the build-time
          default. Previously, -Ddefault-hierarchy=hybrid was the default. This
          change reflects the fact that cgroupsv2 support has matured
          substantially in both systemd and in the kernel, and is clearly the
          way forward. Downstream production distributions might want to
          continue to use -Ddefault-hierarchy=hybrid (or even =legacy) for
          their builds as unfortunately the popular container managers have not
          caught up with the kernel API changes.

Systemd is rebuilt using the new default and is available from the following PPA for testing:

https://launchpad.net/~rbalint/+archive/ubuntu/systemd-unified-cgh
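
When testing with the PPA it can help to confirm which hierarchy the system actually booted with. The following is a minimal sketch, not part of any package mentioned here: the function name `detect_cgroup_layout` is hypothetical, and it assumes the usual mount points (`/sys/fs/cgroup` and, for hybrid, `/sys/fs/cgroup/unified`) as they appear in `/proc/mounts`.

```python
def detect_cgroup_layout(mounts_text):
    """Classify the cgroup setup from the text of /proc/mounts.

    Returns "unified", "hybrid", "legacy" or "unknown".
    Assumes the conventional systemd mount points.
    """
    fstype_by_mountpoint = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field 1 is the mount point, field 2 the filesystem type.
            # A later mount on the same path overrides an earlier one.
            fstype_by_mountpoint[fields[1]] = fields[2]

    root_type = fstype_by_mountpoint.get("/sys/fs/cgroup")
    if root_type == "cgroup2":
        # cgroup2 mounted directly on /sys/fs/cgroup: pure v2.
        return "unified"
    if fstype_by_mountpoint.get("/sys/fs/cgroup/unified") == "cgroup2":
        # v1 controllers plus a v2 tree under .../unified.
        return "hybrid"
    has_v1_mount = any(
        path.startswith("/sys/fs/cgroup/") and fstype == "cgroup"
        for path, fstype in fstype_by_mountpoint.items()
    )
    if has_v1_mount:
        return "legacy"
    return "unknown"
```

On a live system the input would come from reading `/proc/mounts`, for example `detect_cgroup_layout(open("/proc/mounts").read())`.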

The autopkgtest results against other packages are available here:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-eoan-rbalint-systemd-unified-cgh/?format=plain

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-focal-rbalint-systemd-unified-cgh/?format=plain

lxc autopkgtest failing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-eoan-rbalint-systemd-unified-cgh/eoan/amd64/d/docker.io/20191030_155944_2331e@/log.gz

snapd autopkgtest failing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-eoan-rbalint-systemd-unified-cgh/eoan/amd64/s/snapd/20191030_161354_94b26@/log.gz

docker.io autopkgtest failing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-eoan-rbalint-systemd-unified-cgh/eoan/amd64/d/docker.io/20191030_155944_2331e@/log.gz

There's some ongoing work in snapd in this area. With 2.42 the snaps do not outright fail and a warning is printed out for the user. The current work on a named snapd v1 hierarchy should restore the snap process tracking capabilities.

Changed in snapd (Ubuntu):
status: New → In Progress
assignee: nobody → Maciej Borzecki (maciek-borzecki)
Ryutaroh Matsumoto (emojifreak) wrote :

When Ubuntu Eoan is started with systemd.unified_cgroup_hierarchy, lxc-start (version 3.0.4 as packaged in Eoan) cannot be used in its default setting. The cause is a combination of an unsuitable default configuration and an upstream bug in LXC 3.0.4. For details, please refer to https://github.com/lxc/lxc/issues/3183
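
To reproduce reports like this one, it is useful to check whether the kernel command line actually requested the unified hierarchy. The sketch below is illustrative only: the function name `unified_hierarchy_requested` is hypothetical, and it assumes that a bare `systemd.unified_cgroup_hierarchy` means "on", that the usual systemd boolean spellings apply, and that the last occurrence wins.

```python
TRUE_VALUES = {"1", "yes", "y", "true", "t", "on"}
FALSE_VALUES = {"0", "no", "n", "false", "f", "off"}


def unified_hierarchy_requested(cmdline):
    """Inspect kernel command-line text (as found in /proc/cmdline).

    Returns True or False if systemd.unified_cgroup_hierarchy is set,
    or None if the option is absent (the build-time default applies).
    """
    result = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != "systemd.unified_cgroup_hierarchy":
            continue
        if not sep:
            # Option given without a value: treated as enabled.
            result = True
        elif value.lower() in TRUE_VALUES:
            result = True
        elif value.lower() in FALSE_VALUES:
            result = False
        # Unrecognised values are ignored in this sketch.
    return result
```

A return value of `None` means the option was not given, so the hierarchy is whatever the systemd build defaults to.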

affects: lxc → lxc (Ubuntu)
Changed in lxc (Ubuntu):
status: Unknown → New
Ryutaroh Matsumoto (emojifreak) wrote :

This was reported upstream: https://github.com/lxc/lxc/issues/3198
The purpose of libpam-cgfs is only chowning some CGroup directories to the login user.
When Linux is booted with systemd.unified_cgroup_hierarchy,
/sys/fs/cgroup/user.slice/user-$UID.slice/session-nnn.scope is not chowned to a login user.
So libpam-cgfs completely fails to function under cgroup v2.
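
The missing chown described above can be checked directly. This is a minimal sketch under stated assumptions: the helper names `session_scope_path` and `is_owned_by` are hypothetical, the path layout is the one quoted in this comment, and the session number has to be supplied by the caller.

```python
import os


def session_scope_path(uid, session_id, cgroup_root="/sys/fs/cgroup"):
    """Build the session scope directory path for a login session."""
    return os.path.join(
        cgroup_root,
        "user.slice",
        f"user-{uid}.slice",
        f"session-{session_id}.scope",
    )


def is_owned_by(path, uid):
    """Return True if the directory at `path` is owned by `uid`."""
    return os.stat(path).st_uid == uid
```

For example, `is_owned_by(session_scope_path(os.getuid(), "3"), os.getuid())` would be expected to return False on a system showing this bug, where `"3"` stands in for the real session number.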

Ryutaroh Matsumoto (emojifreak) wrote :

https://github.com/lxc/lxc/issues/3221 Another LXC-container-doesn't-start-at-all type issue also observed on Ubuntu Eoan with systemd.unified_cgroup_hierarchy as well as Fedora 31.

On Mon, Dec 09, 2019 at 08:41:18PM -0000, Ryutaroh Matsumoto wrote:
> https://github.com/lxc/lxc/issues/3221 Another LXC-container-doesn't
> -start-at-all type issue also observed on Ubuntu Eoan with
> systemd.unified_cgroup_hierarchy as well as Fedora 31.

That seems specific to LXC stable-3.0, which had barebones unified
hierarchy support to deal with systemd hybrid cgroup layouts. However,
the changes to git master which enable full cgroup2 compatibility have
been backported to the stable-3.0 branch and will be released with the
next bugfix release. In other words, the failure to start at all on a
pure unified layout with 3.0.4 is unfortunately expected.

Ryutaroh Matsumoto (emojifreak) wrote :

https://github.com/lxc/lxd/issues/6587 When Ubuntu Eoan is booted with systemd.unified_cgroup_hierarchy, LXD cannot run Ubuntu Eoan in its container, but a small change to lxd/lxd/container_lxc.go enables LXD to operate as usual.

Ryutaroh Matsumoto (emojifreak) wrote :

lxc-checkpoint in the latest GitHub master branch does not work under pure cgroup v2; see https://github.com/lxc/lxc/issues/3240

Stéphane Graber (stgraber) wrote :

LXD, LXCFS and LXC all have cgroupv2 support now.
It's certainly not perfect, and things like CRIU (lxc-checkpoint) will not work until cgroupv2 support in the kernel is fully on par with cgroupv1 and the needed additional interfaces are added to projects like CRIU.

But for normal day to day use, we should be in pretty good shape now.

Changed in lxd (Ubuntu):
status: New → Fix Released
Changed in lxcfs (Ubuntu):
status: New → Fix Released
Changed in lxc (Ubuntu):
importance: Unknown → Undecided
status: New → Fix Released
Changed in lxc (Ubuntu):
importance: Undecided → Unknown
Balint Reczey (rbalint) wrote :

Thank you everyone for implementing cgroup v2 support.

Snapd is not reported here to be fixed, but it may be:
https://github.com/ubports/ubports-installer/issues/1448

@maciek-borzecki could you please confirm that snapd is fixed?

Debian plans to switch systemd to cgroupv2 by default, and if every package listed as affected here is ready I plan to make the switch in Ubuntu, too.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943981

Michael Vogt (mvo) wrote :

@rbalint Thanks for this heads up. Unfortunately we are not ready for cgroups v2. Snapd is working on v2 systems but a lot of the functionality is not ported. AIUI it requires quite a bit of work on our side and the two are quite different :/

Michael Vogt (mvo) wrote :

@rbalint Do you have a timeline when you plan this? The changes required make this most likely something we can only tackle during the 21.10 cycle :/

Balint Reczey (rbalint) wrote :

@mvo:

Fedora switched in 2019: https://medium.com/nttlabs/cgroup-v2-596d035be4d7
Debian switched with systemd 247.2-2 https://tracker.debian.org/news/1204112/accepted-systemd-2472-2-source-into-unstable/

I was about to follow Debian in systemd, but I'm holding the switch back for now. Could you please provide a link where snapd's progress can be tracked?

I plan to keep the current systemd default for 21.04 to minimize disruption and give some more time for preparation, but I'd like to make the switch early in the 21.10 cycle so there is also time to fix regressions before 21.10's release.

Changed in snapd:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Maciej Borzecki (maciek-borzecki)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in docker.io (Ubuntu):
status: New → Confirmed
Michael Vogt (mvo) wrote :

After discussing this we decided that we will leave cgroups v1 support in place for 21.04, because the snapd team will not be able to port all features to v2 in time. But early in the 21.10 cycle v1 will be turned off, and snapd needs to be ported to full v2 support.
