units with credentials fail in LXD containers

Bug #2046486 reported by Nick Rosbrook
This bug affects 4 people

Affects            Status         Importance   Assigned to     Milestone
lxd                New            Unknown
lxd (Ubuntu)       Confirmed      Undecided    Unassigned
samba (Ubuntu)     Fix Released   Undecided    Unassigned
systemd (Ubuntu)   Triaged        High         Nick Rosbrook

Bug Description

As of v256, many units shipped by systemd use credentials in some way by default, so this issue now goes well beyond the original test case failure.

For example,

root@oracular:~# apt policy systemd
systemd:
  Installed: 256-1ubuntu1
  Candidate: 256-1ubuntu1
  Version table:
 *** 256-1ubuntu1 100
        100 http://archive.ubuntu.com/ubuntu oracular-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     255.4-1ubuntu8 500
        500 http://archive.ubuntu.com/ubuntu oracular/main amd64 Packages
root@oracular:~# for service in $(find /usr/lib/systemd/system -maxdepth 1 -name "systemd-*.service"); do grep -q "Credential.*=" "$service" && echo "$service"; done
/usr/lib/systemd/system/systemd-sysusers.service
/usr/lib/systemd/system/systemd-resolved.service
/usr/lib/systemd/system/systemd-firstboot.service
/usr/lib/systemd/system/systemd-network-generator.service
/usr/lib/systemd/system/systemd-journald.service
/usr/lib/systemd/system/systemd-sysctl.service
/usr/lib/systemd/system/systemd-tmpfiles-setup-dev-early.service
/usr/lib/systemd/system/systemd-tmpfiles-setup-dev.service
/usr/lib/systemd/system/systemd-tmpfiles-setup.service
/usr/lib/systemd/system/systemd-udev-load-credentials.service
/usr/lib/systemd/system/systemd-tmpfiles-clean.service
/usr/lib/systemd/system/systemd-networkd.service
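
The loop above matches any line containing `Credential...=`, which also covers `ConditionCredential=` checks. As a narrower sketch, the helper below lists only units that actually pass credentials to the service via `LoadCredential=`, `LoadCredentialEncrypted=`, `SetCredential=`, `SetCredentialEncrypted=` or `ImportCredential=`. The function name `units_with_credentials` is hypothetical and not part of the original report.

```shell
# List systemd-*.service files in a directory that pass credentials to the
# service. units_with_credentials is a hypothetical helper name.
units_with_credentials() {
    dir=${1:-/usr/lib/systemd/system}
    # grep -l prints only the names of files with at least one match.
    find "$dir" -maxdepth 1 -name 'systemd-*.service' \
        -exec grep -lE '^(Load|Set)Credential(Encrypted)?=|^ImportCredential=' {} + | sort
}
```

Calling `units_with_credentials` with no argument scans /usr/lib/systemd/system; a different directory can be passed as the first argument.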

root@oracular:~# systemctl status systemd-sysusers.service systemd-resolved.service systemd-firstboot.service systemd-network-generator.service systemd-journald.service systemd-sysctl.service systemd-tmpfiles-setup-dev-early.service systemd-tmpfiles-setup-dev.service systemd-tmpfiles-setup.service systemd-udev-load-credentials.service systemd-tmpfiles-clean.service systemd-networkd.service
○ systemd-sysusers.service - Create System Users
     Loaded: loaded (/usr/lib/systemd/system/systemd-sysusers.service; static)
     Active: inactive (dead)
  Condition: start condition unmet at Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
             ├─ ConditionNeedsUpdate=|/etc was not met
             └─ ConditionCredential=|sysusers.extra was not met
       Docs: man:sysusers.d(5)
             man:systemd-sysusers.service(8)

× systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:49 UTC; 59s ago
 Invocation: b1aaa662750f48868fe3388e4524c462
       Docs: man:systemd-resolved.service(8)
             man:org.freedesktop.resolve1(5)
             https://systemd.io/WRITING_NETWORK_CONFIGURATION_MANAGERS
             https://systemd.io/WRITING_RESOLVER_CLIENTS
    Process: 258 ExecStart=/usr/lib/systemd/systemd-resolved (code=exited, status=243/CREDENTIALS)
   Main PID: 258 (code=exited, status=243/CREDENTIALS)

○ systemd-firstboot.service - First Boot Wizard
     Loaded: loaded (/usr/lib/systemd/system/systemd-firstboot.service; static)
     Active: inactive (dead)
  Condition: start condition unmet at Mon 2024-06-24 18:58:48 UTC; 59s ago
             └─ ConditionFirstBoot=yes was not met
       Docs: man:systemd-firstboot(1)

○ systemd-network-generator.service - Generate network units from Kernel command line
     Loaded: loaded (/usr/lib/systemd/system/systemd-network-generator.service; disabled; preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-network-generator.service(8)

× systemd-journald.service - Journal Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
    Drop-In: /usr/lib/systemd/system/systemd-journald.service.d
             └─nice.conf
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: 7caace7a15c749f3a86fb15fcfb94dff
TriggeredBy: × systemd-journald-dev-log.socket
             × systemd-journald.socket
             ○ systemd-journald-audit.socket
       Docs: man:systemd-journald.service(8)
             man:journald.conf(5)
    Process: 124 ExecStart=/usr/lib/systemd/systemd-journald (code=exited, status=243/CREDENTIALS)
   Main PID: 124 (code=exited, status=243/CREDENTIALS)
   FD Store: 0 (limit: 4224)

× systemd-sysctl.service - Apply Kernel Variables
     Loaded: loaded (/usr/lib/systemd/system/systemd-sysctl.service; static)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: 5e90310a27b043ceae80c96e35c41451
       Docs: man:systemd-sysctl.service(8)
             man:sysctl.d(5)
    Process: 97 ExecStart=/usr/lib/systemd/systemd-sysctl (code=exited, status=243/CREDENTIALS)
   Main PID: 97 (code=exited, status=243/CREDENTIALS)

× systemd-tmpfiles-setup-dev-early.service - Create Static Device Nodes in /dev gracefully
     Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-setup-dev-early.service; static)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: 78e3c68cfa9a4a7982950b08c0f1385f
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 73 ExecStart=systemd-tmpfiles --prefix=/dev --create --boot --graceful (code=exited, status=243/CREDENTIALS)
   Main PID: 73 (code=exited, status=243/CREDENTIALS)

× systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev
     Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-setup-dev.service; static)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: 46458c7b6e134ef8be299900db7cc288
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 98 ExecStart=systemd-tmpfiles --prefix=/dev --create --boot (code=exited, status=243/CREDENTIALS)
   Main PID: 98 (code=exited, status=243/CREDENTIALS)

× systemd-tmpfiles-setup.service - Create Volatile Files and Directories
     Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-setup.service; static)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: f4e64afdc8774170a9b29b8cf2919f46
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 147 ExecStart=systemd-tmpfiles --create --remove --boot --exclude-prefix=/dev (code=exited, status=243/CREDENTIALS)
   Main PID: 147 (code=exited, status=243/CREDENTIALS)

× systemd-udev-load-credentials.service - Load udev Rules from Credentials
     Loaded: loaded (/usr/lib/systemd/system/systemd-udev-load-credentials.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:48 UTC; 1min 0s ago
 Invocation: cb5a1f43cde248de80fcf701b4b5d381
       Docs: man:udevadm(8)
             man:udev(7)
             man:systemd.system-credentials(7)
    Process: 75 ExecStart=udevadm control --load-credentials (code=exited, status=243/CREDENTIALS)
   Main PID: 75 (code=exited, status=243/CREDENTIALS)

○ systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
     Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static)
     Active: inactive (dead)
TriggeredBy: ● systemd-tmpfiles-clean.timer
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)

× systemd-networkd.service - Network Configuration
     Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-06-24 18:58:49 UTC; 59s ago
 Invocation: 5d960369ea944d5cbac4382e42ded1d0
TriggeredBy: × systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
             man:org.freedesktop.network1(5)
    Process: 280 ExecStart=/usr/lib/systemd/systemd-networkd (code=exited, status=243/CREDENTIALS)
   Main PID: 280 (code=exited, status=243/CREDENTIALS)
   FD Store: 0 (limit: 512)
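
To condense output like the above into a list of affected units, something along these lines may help. It is only a sketch that parses the human-readable `systemctl status` text, which is not a stable interface; the function name `credential_failures` is hypothetical.

```shell
# Print the names of services whose status block reports exit status
# 243/CREDENTIALS. Reads `systemctl status` text on stdin.
# credential_failures is a hypothetical helper name.
credential_failures() {
    awk '
        # A unit header looks like: "× name.service - Description"
        $2 ~ /\.service$/ && $3 == "-" { unit = $2 }
        # Report each unit once, even though both the Process: and
        # Main PID: lines carry the exit status.
        /status=243\/CREDENTIALS/ && unit != "" && !(unit in seen) {
            seen[unit] = 1
            print unit
        }
    '
}
```

For example: `systemctl status --no-pager 'systemd-*.service' | credential_failures`.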

[Original Description]

To demonstrate this, in an unprivileged LXD container, create the following unit (taken from the systemd test suite):

$ cat > /etc/systemd/system/exec-set-credential.service << EOF
# SPDX-License-Identifier: LGPL-2.1-or-later
[Unit]
Description=Test for SetCredential=

[Service]
ExecStart=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStartPost=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStop=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStopPost=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
Type=oneshot
SetCredential=test-execute.set-credential:hoge
EOF
$ systemctl daemon-reload
$ systemctl start exec-set-credential.service
Job for exec-set-credential.service failed because the control process exited with error code.
See "systemctl status exec-set-credential.service" and "journalctl -xeu exec-set-credential.service" for details.

With debug logs enabled, we see:

$ journalctl -u exec-set-credential.service -b --no-pager
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Trying to enqueue job exec-set-credential.service/start/replace
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Installed new job exec-set-credential.service/start as 2740
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Enqueued job exec-set-credential.service/start as 2740
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Will spawn child (service_enter_start): /bin/sh
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Failed to set 'trusted.invocation_id' xattr on control group /system.slice/exec-set-credential.service, ignoring: Operation not permitted
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Failed to remove 'trusted.delegate' xattr flag on control group /system.slice/exec-set-credential.service, ignoring: Operation not permitted
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Failed to remove 'trusted.survive_final_kill_signal' xattr flag on control group /system.slice/exec-set-credential.service, ignoring: Operation not permitted
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Passing 0 fds to service
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: About to execute: /bin/sh -x -c "test \"1031(cat /run/credentials/exec-set-credential.service/test-execute.set-credential)\" = \"hoge\""
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Forked /bin/sh as 2183
Dec 14 19:24:24 noble (sh)[2183]: PR_SET_MM_ARG_START failed: Operation not permitted
Dec 14 19:24:24 noble (sh)[2183]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Dec 14 19:24:24 noble (sh)[2183]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Changed dead -> start
Dec 14 19:24:24 noble systemd[1]: Starting exec-set-credential.service - Test for SetCredential=...
Dec 14 19:24:24 noble (sh)[2183]: Successfully forked off '(sd-mkdcreds)' as PID 2184.
Dec 14 19:24:24 noble (sd-[2184]: Changing mount propagation /dev (MS_REC|MS_SLAVE "")
Dec 14 19:24:24 noble (sd-[2184]: Mounting ramfs (ramfs) on /dev/shm (MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_NOSYMFOLLOW "mode=0700")...
Dec 14 19:24:24 noble (sd-[2184]: Changing mount flags /dev/shm (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_NOSYMFOLLOW|MS_BIND "")...
Dec 14 19:24:24 noble (sd-[2184]: Failed to mount n/a (type n/a) on /dev/shm (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_NOSYMFOLLOW|MS_BIND ""): Permission denied
Dec 14 19:24:24 noble (sh)[2183]: (sd-mkdcreds) failed with exit status 1.
Dec 14 19:24:24 noble (sh)[2183]: exec-set-credential.service: Failed to set up credentials: Protocol error
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Child 2183 belongs to exec-set-credential.service.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Will spawn child (service_enter_stop_post): /bin/sh
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: About to execute: /bin/sh -x -c "test \"1031(cat /run/credentials/exec-set-credential.service/test-execute.set-credential)\" = \"hoge\""
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Forked /bin/sh as 2186
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Changed start -> stop-post
Dec 14 19:24:24 noble (sh)[2186]: PR_SET_MM_ARG_START failed: Operation not permitted
Dec 14 19:24:24 noble (sh)[2186]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Dec 14 19:24:24 noble (sh)[2186]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Dec 14 19:24:24 noble sh[2186]: + test 1031(cat /run/credentials/exec-set-credential.service/test-execute.set-credential) = hoge
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Child 2186 belongs to exec-set-credential.service.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Control process exited, code=exited, status=1/FAILURE
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Got final SIGCHLD for state stop-post.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Failed with result 'exit-code'.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Service will not restart (restart setting)
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Changed stop-post -> failed
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Job 2740 exec-set-credential.service/start finished, result=failed
Dec 14 19:24:24 noble systemd[1]: Failed to start exec-set-credential.service - Test for SetCredential=.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Unit entered failed state.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Consumed 23ms CPU time.
Dec 14 19:24:24 noble systemd[1]: exec-set-credential.service: Releasing resources...
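
A side note on the log: the executed command shows `1031(cat ...)` instead of `$(cat ...)`. That is most likely because the unquoted `<< EOF` heredoc lets the invoking shell expand `$$` to its own PID before the file is written. It does not change the failure reported here, since credential setup fails with status 243/CREDENTIALS before the main command is executed. A sketch that writes the unit verbatim by quoting the heredoc delimiter follows; the function name `write_test_unit` and its path argument are hypothetical.

```shell
# Write the test unit verbatim to the path given as $1. Quoting the
# delimiter ('EOF') stops the invoking shell from expanding $$ inside
# the heredoc. write_test_unit is a hypothetical helper name.
write_test_unit() {
    cat > "$1" << 'EOF'
# SPDX-License-Identifier: LGPL-2.1-or-later
[Unit]
Description=Test for SetCredential=

[Service]
ExecStart=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStartPost=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStop=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
ExecStopPost=/bin/sh -x -c 'test "$$(cat %d/test-execute.set-credential)" = "hoge"'
Type=oneshot
SetCredential=test-execute.set-credential:hoge
EOF
}
```

For example: `write_test_unit /etc/systemd/system/exec-set-credential.service`, followed by `systemctl daemon-reload`.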

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu):
status: New → Confirmed
status: Confirmed → New
importance: Undecided → High
assignee: nobody → Nick Rosbrook (enr0n)
Nick Rosbrook (enr0n) wrote :

This is the apparmor denial:

audit: type=1400 audit(1704299091.131:665): apparmor="DENIED" operation="mount" class="mount" info="failed flags match" error=-13 profile="lxd-noble_</var/snap/lxd/common/lxd>" name="/dev/shm/" pid=71828 comm="(sd-mkdcreds)" flags="ro, nosuid, nodev, noexec, remount, bind"

which corresponds to:

Dec 14 19:24:24 noble (sd-[2184]: Failed to mount n/a (type n/a) on /dev/shm (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_NOSYMFOLLOW|MS_BIND ""): Permission denied

from the journal output above. Taking a look at the AppArmor profile created by LXD, it seems that the problematic flag is MS_NOSYMFOLLOW; there is a rule in /var/snap/lxd/common/lxd/security/apparmor/profiles/lxd-noble on my machine that allows the flags (ro,remount,bind,nosuid,noexec,nodev) for /dev/shm and others.

I think it probably makes the most sense to allow this flag combination in the AppArmor profile created by LXD.
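
The comparison between the flags systemd requested and the flags the profile rule allows can be sketched as below. The mapping covers only the MS_* names that appear in this journal line, and the function name `extra_mount_flags` is hypothetical.

```shell
# Print every requested mount flag that is missing from the allowed list.
# $1: flags as logged by systemd, separated by "|"
# $2: options allowed by the AppArmor rule, separated by ","
# extra_mount_flags is a hypothetical helper name.
extra_mount_flags() {
    requested=$1
    allowed=$2
    printf '%s\n' "$requested" | tr '|' '\n' | while read -r flag; do
        case $flag in
            "") continue ;;
            MS_RDONLY) opt=ro ;;
            MS_NOSUID) opt=nosuid ;;
            MS_NODEV) opt=nodev ;;
            MS_NOEXEC) opt=noexec ;;
            MS_REMOUNT) opt=remount ;;
            MS_BIND) opt=bind ;;
            MS_NOSYMFOLLOW) opt=nosymfollow ;;
            *) opt=$flag ;;   # unknown flag: report it unchanged
        esac
        case ",$allowed," in
            *",$opt,"*) ;;                 # allowed by the rule
            *) printf '%s\n' "$opt" ;;     # not covered by the rule
        esac
    done
}
```

Run with the flags from the journal line and the option list from the profile rule quoted above, this prints `nosymfollow`.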

Nick Rosbrook (enr0n) wrote :

It seems that the apparmor_parser in core22 does not understand the nosymfollow mount option:

$ lxc config set systemd-lxc raw.apparmor "mount options=(ro,remount,bind,nosuid,noexec,nodev,nosymfollow) /dev/shm,"
Error: Parse AppArmor profile: Failed to run: apparmor_parser -QWL /var/snap/lxd/common/lxd/security/apparmor/cache /var/snap/lxd/common/lxd/security/apparmor/profiles/lxd-systemd-lxc: exit status 1 (unsupported mount options)

So, patching the generated AppArmor policy might not be feasible until the lxd snap uses core24.

Nick Rosbrook (enr0n) wrote :

I have created a PR with LXD to at least get feedback: https://github.com/canonical/lxd/pull/12698.

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu):
status: New → Triaged
Nick Rosbrook (enr0n) wrote :

I found that after working around this issue (with seccomp rules) there are yet more AppArmor denials during namespace set up.

All in all, systemd services with sandboxing settings (i.e. settings that require the use of various namespaces) hit more and more denials in LXD containers. So, after discussing with LXD folks, the plan is to enable security.nesting: true by default for unprivileged containers [1].

[1] https://github.com/canonical/lxd/issues/13631

summary: - units with SetCredential= fail in LXD containers
+ units with credentials fail in LXD containers
tags: added: block-proposed
Nick Rosbrook (enr0n) wrote :

Adding block-proposed for now because I believe the armhf autopkgtest environment will be affected once systemd in oracular-proposed migrates. I hope to get security.nesting: true turned on in the autopkgtest environment soon to avoid this, and to let systemd migrate.

Tim Andersson (andersson123) wrote :

Turns out this was actually already in the config and I missed it, so there's nothing blocking the version of systemd in proposed from the autopkgtest side.

Nick Rosbrook (enr0n)
tags: removed: block-proposed
Nick Rosbrook (enr0n) wrote :

As I mentioned on the LXD PR[1] https://github.com/canonical/lxd/pull/12698#issuecomment-2105170174, many important services now use credentials in some way (including networkd, resolved, journald), so this issue is very apparent with systemd v256.

Nick Rosbrook (enr0n)
description: updated
description: updated
no longer affects: systemd
Changed in lxd:
status: Unknown → New
Aleksandr Mikhalitsyn (mihalicyn) wrote :

Hopefully, this will be fixed by https://github.com/canonical/lxd/pull/13681

I think we need some help with validation/review and testing.

Aleksandr Mikhalitsyn (mihalicyn) wrote :

https://lore<email address hidden>/

Nick Rosbrook (enr0n) wrote :

samba has a test failing[1] because of this, which is blocking systemd from migrating. The same apparmor denials can be seen in the test log. These denials cause systemd-resolved to fail to start in the member-server container, which ultimately causes the test to fail.

We should be able to fix the test for now by just starting the container with security.nesting=true.

[1] https://objectstorage.prodstack5.canonical.com/swift/v1/AUTH_0f9aae918d5b4744bf7b827671c86842/autopkgtest-oracular/oracular/amd64/s/samba/20240709_085237_01da3@/log.gz
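
For reference, the workaround can be sketched as two small wrappers around the `lxc` client: one launches a new container with nesting enabled, the other sets the option on an existing container. The function names and the `LXC` override variable are hypothetical; the image and container names in the usage line are placeholders.

```shell
# LXC can be overridden (for example LXC=echo) to preview the command
# instead of running it. Both function names are hypothetical.

# Launch a new container with security.nesting enabled from the start.
launch_nested() {
    image=$1
    name=$2
    "${LXC:-lxc}" launch "$image" "$name" -c security.nesting=true
}

# Enable security.nesting on an existing container.
enable_nesting() {
    "${LXC:-lxc}" config set "$1" security.nesting true
}
```

For example: `launch_nested ubuntu:24.04 member-server`. This report does not say whether an already running container needs a restart for the setting to take effect.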

tags: added: update-excuse
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package samba - 2:4.20.2+dfsg-2ubuntu2

---------------
samba (2:4.20.2+dfsg-2ubuntu2) oracular; urgency=medium

  * debian/tests: launch container with security.nesting=true
    Otherwise, systemd-resolved in the member-server container will
    fail to start, and cause the test to fail. (LP: #2046486)

 -- Nick Rosbrook <email address hidden> Tue, 09 Jul 2024 14:46:44 -0400

Changed in samba (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxd (Ubuntu):
status: New → Confirmed
Nick Rosbrook (enr0n)
tags: removed: update-excuse