memlock setting in systemd (pid 1) too low for containers (bionic)

Bug #1830746 reported by Kees Bos on 2019-05-28
This bug affects 3 people
Affects | Importance | Assigned to
systemd (Ubuntu) | High | Guilherme G. Piccoli
Bionic | High | Guilherme G. Piccoli
Cosmic | High | Guilherme G. Piccoli
Disco | High | Guilherme G. Piccoli
Eoan | High | Guilherme G. Piccoli
Focal | High | Guilherme G. Piccoli

Bug Description

[Impact]
* Since systemd commit fb3ae275cb ("main: bump RLIMIT_NOFILE for the root user substantially") [https://github.com/systemd/systemd/commit/fb3ae275cb], which is present in Bionic, the memlock ulimit value was bumped to 16M. It's an adjustable limit, but the default (in previous Ubuntu releases/systemd versions) was really small.
* Although bumping this value was a good thing, 16M is not enough, and we can see failures on mlock'ed allocations on Bionic, like the one reported here by Kees or the recently introduced cryptsetup build failures (due to PPA builder updates to Bionic) - see https://bugs.launchpad.net/bugs//1891473.

* It's especially harmful in containers to have such a "small" limit, so we are SRUing a more recent bump from upstream systemd, in the form of commit 91cfdd8d29 ("core: bump mlock ulimit to 64Mb") [https://github.com/systemd/systemd/commit/91cfdd8d29]. The latest Ubuntu releases, Focal and later, already include this patch, so effectively we're putting Bionic on par with newer releases.

* A discussion about this topic (leading to this SRU) is present in ubuntu-devel ML: https://lists.ubuntu.com/archives/ubuntu-devel/2020-September/041159.html.

[Test Case]
* The straightforward test is to just look at "ulimit -l" and "ulimit -Hl" in a current Bionic system, and then install an updated version with the proposed SRU to see the limit bump from 16M to 64M (after a reboot) - a version containing this fix is available at my PPA as of 2020-09-10 [0] (likely to be deleted in the next month or so).

* A more interesting test is to run a Focal container in a current Bionic system and try to build the cryptsetup package - it'll fail in some tests. After updating the host (Bionic) systemd to include the mlock bump patch, the build succeeds in the Focal container.
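
The limit check from the first test case can also be done from a program. The following is a minimal sketch, assuming a Linux system with Python 3; the helper names `to_kib` and `memlock_limits_kib` are hypothetical and not part of the SRU. It reads RLIMIT_MEMLOCK for the current process and reports it in KiB, the unit that "ulimit -l" prints, so 16M corresponds to 16384 and 64M to 65536.

```python
import resource


def to_kib(limit_bytes):
    """Convert a byte limit to KiB; None stands for 'unlimited'."""
    if limit_bytes == resource.RLIM_INFINITY:
        return None
    return limit_bytes // 1024


def memlock_limits_kib():
    """Return the (soft, hard) RLIMIT_MEMLOCK of this process in KiB."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return to_kib(soft), to_kib(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits_kib()
    print("soft:", "unlimited" if soft is None else f"{soft} KiB")
    print("hard:", "unlimited" if hard is None else f"{hard} KiB")
```

The values reported are those inherited by the calling process, so they can differ between a login shell, a service, and a container.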

[Regression Potential]
* Since it's a simple bump and it makes Bionic behave like Focal, I don't foresee regressions. One potential issue would be if some users rely on the lower default limit (16M) and this value is bumped by a package update, but that could be circumvented by setting a lower limit in limits.conf. The benefits of such a bump are likely much bigger than any "regression" caused for users relying on the old default limit.

[0] https://launchpad.net/~gpiccoli/+archive/ubuntu/test1830746

Kees Bos (k-bos) wrote :

The attachment "fix-memlock-bump.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Brian Murray (brian-murray) wrote :

This has landed in Eoan in at least version 242 of systemd.

Changed in systemd (Ubuntu Eoan):
status: New → Fix Released
tags: added: rls-dd-incoming
Kain (kain-kain) wrote :

Older systemd versions (234-240, I think) have a different erroneous clamp on RLIMIT_MEMLOCK. See #1840435.

Kain (kain-kain) wrote :

Hmm, sorry, brainfart. At least 240. Not sure how far back it went.

Steve Langasek (vorlon) on 2020-07-02
Changed in systemd (Ubuntu Disco):
status: New → Won't Fix
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu Bionic):
status: New → Confirmed
Changed in systemd (Ubuntu Cosmic):
status: New → Confirmed
Sebastian (slovdahl) wrote :

Unfortunately I have no experience or knowledge of Ubuntu packaging or bug fixing processes, but is there anything I can do to help get this fixed in bionic?

Guilherme G. Piccoli (gpiccoli) wrote :

Hi Sebastian, thanks for offering help. And thanks of course Kees for reporting the issue!
Recently we faced a build breakage of the cryptsetup package that was narrowed down to this issue: https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1891473

I intend to bump this limit to 64M to match recent releases; I'm using the upstream systemd commit for this: https://github.com/systemd/systemd/commit/91cfdd8d29

Cheers,

Guilherme

Changed in systemd (Ubuntu Cosmic):
status: Confirmed → Won't Fix
Changed in systemd (Ubuntu):
importance: Undecided → High
Changed in systemd (Ubuntu Bionic):
importance: Undecided → High
Changed in systemd (Ubuntu Cosmic):
importance: Undecided → High
Changed in systemd (Ubuntu):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in systemd (Ubuntu Disco):
importance: Undecided → High
Changed in systemd (Ubuntu Eoan):
importance: Undecided → High
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in systemd (Ubuntu Cosmic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in systemd (Ubuntu Disco):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in systemd (Ubuntu Eoan):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in systemd (Ubuntu Bionic):
status: Confirmed → In Progress
description: updated
Changed in systemd (Ubuntu Focal):
status: New → Fix Released
importance: Undecided → High
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
tags: added: seg
removed: patch rls-dd-incoming
Dan Streetman (ddstreet) on 2020-09-21
tags: added: sts sts-sponsor-ddstreet

Hello Kees, or anyone else affected,

Accepted systemd into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.43 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic

All autopkgtests for the newly accepted systemd (237-3ubuntu10.43) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

linux-hwe-5.4/5.4.0-52.57~18.04.1 (armhf)
lxc/3.0.3-0ubuntu1~18.04.1 (ppc64el)
openssh/1:7.6p1-4ubuntu0.3 (s390x, arm64, armhf, ppc64el, i386, amd64)
linux-aws-5.0/unknown (amd64)
gvfs/1.36.1-0ubuntu1.3.3 (arm64)
suricata/3.2-2ubuntu3 (i386)
nftables/0.8.2-1 (amd64)
libvirt/4.0.0-1ubuntu8.17 (i386)
netplan.io/0.99-0ubuntu3~18.04.3 (amd64)
systemd/237-3ubuntu10.43 (i386)
linux-hwe-5.0/5.0.0-62.67 (arm64, armhf)
docker.io/19.03.6-0ubuntu1~18.04.2 (i386)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

I was able to verify this bug with systemd from bionic-proposed (version 237-3ubuntu10.43) by following the procedure in the test case; it's working as expected, I can see 64M in the memlock limit.

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10.43

---------------
systemd (237-3ubuntu10.43) bionic; urgency=medium

  [ Guilherme G. Piccoli ]
  * d/p/lp1830746-bump-mlock-ulimit-to-64Mb.patch:
    - Bump the memlock limit to match Focal and newer releases (LP: #1830746)
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=61adb797642f3dd2e5c14f7914c2949c665cefe8

  [ Victor Manuel Tapia King ]
  * d/p/lp1896614-core-Avoid-race-when-starting-dbus-services.patch:
    - Fix race when starting dbus services (LP: #1896614)
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=373cb6ccd6978a7112bbfd7e5cf4f703a9f8448e

  [ Dan Streetman ]
  * d/t/*,
    d/p/lp1892358/0001-test-increase-qemu-timeout-for-TEST-08-and-TEST-09.patch,
    d/p/lp1892358/0002-test-increase-timeout-for-TEST-17-UDEV-WANTS.patch,
    d/p/lp1892358/0003-test-increase-qemu-timeout-for-TEST-18-and-TEST-19.patch:
    - Increase QEMU_TIMEOUT on 'upstream' autopkgtest tests
    - Pull latest tests from newer releases to fix false negatives
    - Blacklist flaky 'upstream' TEST-03
      (LP: #1892358)
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=9fd8391c2499e163515b629a8ca5790898fc599d
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=d1756b3e1c3e625ed7162cff4909e7a29c315051
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=37f8d73516a84e85e4057d6a92204b4a174af718
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=229ed2076eb773efc548035262b8b8009bf89207
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=f2d7b1f952667316cc07a4b3c5010e66ace07a90
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=659befe61bbfeb7afc9efa24458c9745412d7c6d

 -- Victor Manuel Tapia King <email address hidden> Wed, 07 Oct 2020 16:30:03 -0400

Changed in systemd (Ubuntu Bionic):
status: Fix Committed → Fix Released
Alkis Georgopoulos (alkisg) wrote :

Hi, this update makes slick-greeter segfault, so Ubuntu MATE 18.04 users doing normal updates now get a black screen with a flickering cursor.

A temporary workaround is to enable autologin in /etc/lightdm/lightdm.conf:

[Seat:*]
autologin-guest=false
autologin-user=administrator
autologin-user-timeout=0

*** What would be a proper fix for this? ***

A related discussion about memory limits and lightdm issues exists in this bug report:
https://bugs.launchpad.net/ubuntu/+source/unity-greeter/+bug/1662244

Alkis Georgopoulos (alkisg) wrote :

What torel proposed in https://bugs.launchpad.net/ubuntu/+source/unity-greeter/+bug/1662244/comments/14 avoids the segfault:

* soft memlock 262144
* hard memlock 262144

Should all lightdm users manually put that in limits.conf, or should we expect some update?
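
For scale, the memlock values above can be converted to MiB. This is only a small arithmetic sketch; it assumes that limits.conf memlock values are expressed in KB, as documented in limits.conf(5), and the helper name `memlock_kib_to_mib` is hypothetical.

```python
KIB_PER_MIB = 1024


def memlock_kib_to_mib(value_kib):
    """Convert a limits.conf memlock value (KiB) to MiB."""
    return value_kib / KIB_PER_MIB


# The workaround value quoted above, next to the new 64M default.
print(memlock_kib_to_mib(262144))  # 256.0
print(memlock_kib_to_mib(65536))   # 64.0
```

Under that assumption, the workaround sets a limit four times larger than the 64M default introduced by this update.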

Łukasz Zemczak (sil2100) wrote :

Hey Alkis! Can you please file a new bug report with all the detailed information and tag it with 'regression-update'? Thank you!

Alkis Georgopoulos (alkisg) wrote :

Thank you Łukasz, I filed it in LP: #1902879.

Sebastien Bacher (seb128) wrote :

The lightdm-gtk-greeter is having the same issue, which is breaking Xubuntu; see bug #1902871. Could we revert that SRU to proposed instead of updates to avoid bricking more user systems until we have a better handle on the problem and at least have those greeter fixes out?

Andy Whitcroft (apw) wrote :

I have backed out the published version in bionic-updates to the previously published version in the pocket: 237-3ubuntu10.42.

tags: added: block-proposed-bionic
Dan Streetman (ddstreet) wrote :

To clarify, the regression appears to be the same problem that the rlimit increase is fixing, but the applications failing now are simply bigger. In general, any application that calls mlockall() with MCL_FUTURE, but doesn't adjust its rlimit (or change its systemd service file to adjust LimitMEMLOCK) is very likely destined to crash later in its life.
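
As an illustration of the "adjust its rlimit" option mentioned above, here is a minimal sketch, assuming Linux and Python 3. The function names are hypothetical, and this is not how the affected greeters were fixed. It raises the soft RLIMIT_MEMLOCK to the hard limit, which an unprivileged process is allowed to do, and then checks whether an expected amount of locked memory would fit before the program commits to locking everything.

```python
import resource


def raise_memlock_soft_limit():
    """Raise the soft RLIMIT_MEMLOCK to the hard limit.

    Returns the resulting soft limit in bytes (RLIM_INFINITY if unlimited).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft != hard:
        # Raising the soft limit up to the hard limit needs no privileges;
        # raising the hard limit itself would.
        resource.setrlimit(resource.RLIMIT_MEMLOCK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)[0]


def fits_in_memlock(expected_bytes):
    """Report whether expected_bytes of locked memory fits under the limit."""
    soft = raise_memlock_soft_limit()
    return soft == resource.RLIM_INFINITY or expected_bytes <= soft


if __name__ == "__main__":
    # Hypothetical figure: a process expecting to grow to about 200 MiB.
    print(fits_in_memlock(200 * 1024 * 1024))
```

A check like this only helps if the program can estimate its future memory use, which is the hard part when MCL_FUTURE is in play.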

I believe the only lp bugs for this regression are bug 1890394 and bug 1902879, which are both fix-committed and verified, so this bug should be ok to release (again) after those are released. Also I will note both of those applications (slick-greeter and lightdm-gtk-greeter) were fixed by commenting out their calls to mlockall.

There are also bug 1902871 and bug 1903199, but I believe those are both dups of bug 1890394.

Finally, to reflect on cryptsetup's use of mlockall(), since it's the origin of this bug: cryptsetup is maybe "better" about its use of mlockall(), since it keeps the mlock only for the duration of an 'action':

        if (action->required_memlock)
                crypt_memory_lock(NULL, 1);

        set_int_handler(0);
        r = action->handler();

        if (action->required_memlock)
                crypt_memory_lock(NULL, 0);

however, as this bug shows, that a...


Łukasz Zemczak (sil2100) wrote :

Both the slick-greeter and lightdm-gtk-greeter packages have now been released into -updates. I think it should now be safe-ish to proceed with the systemd update once again. Let's think about it in the near future.

tags: removed: block-proposed-bionic
Dan Streetman (ddstreet) wrote :

found another 'special' application that thinks it needs all its memory locked: corosync.

opened bug 1911904

Dan Streetman (ddstreet) wrote :

Oh and also openvswitch, bug 1906280

To summarize, here are all the applications (found so far) that thought they needed to lock all their current and future memory:

slick-greeter (bug 1902879)
lightdm-gtk-greeter (bug 1890394)
corosync (bug 1911904)
openvswitch (bug 1906280)
