Services fail to start in noble deployed with TPM+FDE
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
apparmor (Ubuntu) | New | Undecided | Unassigned |
Noble | New | Undecided | Unassigned |
cups (Ubuntu) | Confirmed | Undecided | Unassigned |
Noble | New | Undecided | Unassigned |
rsyslog (Ubuntu) | Confirmed | Undecided | Unassigned |
Noble | New | Undecided | Unassigned |
sssd (Ubuntu) | Confirmed | Undecided | Unassigned |
Noble | New | Undecided | Unassigned |
systemd (Ubuntu) | Fix Released | High | Nick Rosbrook |
Noble | Fix Released | High | Nick Rosbrook |
unbound (Ubuntu) | New | Undecided | Unassigned |
Noble | New | Undecided | Unassigned |
Bug Description
[Impact]
On systems that have systemd in the initrd, after the switch root, services trying to access resources in /run (e.g. /run/systemd/
[Test Plan]
The simplest way to test this is to use dracut on a classic Ubuntu system:
1. Create a VM running Ubuntu 24.04 LTS. The virtualization implementation is not important.
2. Install dracut, and then reboot.
$ apt install -y dracut
3. Once rebooted, verify that systemd did a switch root:
$ journalctl -b --grep "Switching root"
4. Check for rsyslog AppArmor denials:
$ dmesg | grep rsyslog
On an affected system, the denials will be present. With the patch, there should be no denials (or at least not related to accessing files in /run).
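Step 4 can be made a little more specific by counting only AppArmor denial lines rather than every kernel message that mentions rsyslog. The following is a minimal sketch; `count_rsyslog_denials` is a hypothetical helper name, and it assumes the denials appear in the kernel log with the usual `apparmor="DENIED"` marker.

```shell
# Sketch: count AppArmor denial lines that mention rsyslog.
# Reads kernel log text on stdin, so it works with dmesg or a saved log.
# count_rsyslog_denials is a hypothetical name, not part of any package.
count_rsyslog_denials() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps the function usable under "set -e".
    grep 'apparmor="DENIED"' | grep -c 'rsyslog' || true
}
```

It could be used as `dmesg | count_rsyslog_denials`. A non-zero count on a patched system is only a problem if the denied paths relate to files in /run.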
[Where problems could occur]
Using MS_MOVE rather than MS_BIND for /run during the switch root means that there is a brief window in which /run (in the old root) is not available to units running before the pivot_root(). So, if we were to see problems, they would likely involve resources in /run, very close to the switch root timeframe. However, in releases before noble, the switch root *is* done using MS_MOVE on /run (and /proc, /sys, and /dev), so we have reasonable evidence that this is a safe change.
[Other information]
We only change the flags for /run because that is the filesystem that seems affected in practice. In particular, we leave /proc alone because code in systemd may use /proc after it is moved to the new root but before the pivot_root(), so changing its flags would be riskier.
[Original Description]
What's known so far:
- 24.04 desktop deployed with TPM+FDE shows this bug
- services confined with apparmor that need to access something in /run/systemd (like the notify socket) fail to do so, even if the apparmor profile is in complain mode and already has rules to allow that access
- only after running aa-disable <path> can the service start fine
- paths logged by the apparmor DENIED or ALLOWED messages are missing the "/run" prefix from "/run/systemd/
- When we add rules to the profile using "/systemd/...." (i.e., also dropping the /run prefix), then it works
- other accesses under /run/systemd/ are also blocked, but the most noticeable one is the notify mechanism
- comment #2 states that Azure CVM images are also impacted
- comment #4 has instructions on how to create such a VM locally with LXD VMs
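The missing "/run" prefix noted above can be checked for directly. The following is a minimal sketch that lists the distinct paths logged by AppArmor that begin with /systemd/ rather than /run/systemd/. `list_prefixless_paths` is a hypothetical helper name, and the sketch assumes the path is logged in a `name="..."` field.

```shell
# Sketch: print the distinct name="..." paths from AppArmor log lines that
# begin with /systemd/ instead of /run/systemd/.
# list_prefixless_paths is a hypothetical name, not part of any package.
list_prefixless_paths() {
    # Keep only AppArmor messages (DENIED or ALLOWED), extract the name=
    # field when it starts with /systemd/, and de-duplicate the result.
    grep 'apparmor="' \
        | sed -n 's|.* name="\(/systemd/[^"]*\)".*|\1|p' \
        | sort -u
}
```

It could be fed from the kernel log of the current boot, for example `journalctl -k -b | list_prefixless_paths`. Any output suggests the system is showing the truncated paths described in this bug.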
Original description follows:
This might be related to #2064088
The rsyslog service is continually timing out and restarting. If I use a service drop-in file and change the 'Type' from 'notify' to 'simple', the service starts and appears to work normally.
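The drop-in workaround described above could look roughly like the sketch below. The helper name `write_type_dropin` and the file name `type-simple.conf` are hypothetical, and the unit directory is assumed to be /etc/systemd/system unless another one is passed as the first argument.

```shell
# Sketch: write a drop-in that overrides Type= for rsyslog.service.
# write_type_dropin and type-simple.conf are hypothetical names.
# $1 is the systemd unit directory (assumed default: /etc/systemd/system).
write_type_dropin() {
    unit_dir="${1:-/etc/systemd/system}"
    dropin_dir="$unit_dir/rsyslog.service.d"
    mkdir -p "$dropin_dir"
    # With Type=simple, systemd does not wait for a readiness notification.
    printf '[Service]\nType=simple\n' > "$dropin_dir/type-simple.conf"
}
```

Run as root, `write_type_dropin && systemctl daemon-reload && systemctl restart rsyslog.service` would apply it. This only stops systemd from waiting for the notification; it does not address the denied access itself.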
In the journal, I can see the attached apparmor errors. I can't make sense of them, but if it's a similar issue to #2064088, then I suspect apparmor is preventing the systemd notify function from alerting systemd that the service is up and running.
ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: rsyslog 8.2312.0-3ubuntu9
ProcVersionSign
Uname: Linux 6.8.0-31-generic x86_64
ApportVersion: 2.28.1-0ubuntu2
Architecture: amd64
CasperMD5CheckM
CasperMD5CheckR
CurrentDesktop: ubuntu:GNOME
Date: Mon Apr 29 10:37:46 2024
ProcEnviron:
LANG=en_GB.UTF-8
PATH=(custom, no user)
SHELL=/bin/bash
TERM=xterm-
XDG_RUNTIME_
SourcePackage: rsyslog
UpgradeStatus: No upgrade log present (probably fresh install)
affects: | rsyslog (Ubuntu) → apparmor (Ubuntu) |
affects: | apparmor (Ubuntu) → rsyslog (Ubuntu) |
description: | updated |
Changed in systemd (Ubuntu): | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Nick Rosbrook (enr0n) |
Changed in systemd (Ubuntu Noble): | |
assignee: | nobody → Nick Rosbrook (enr0n) |
importance: | Undecided → High |
description: | updated |
Changed in systemd (Ubuntu Noble): | |
status: | New → In Progress |
Azure CVM images are impacted by the same issue. I see on #2064088 that this is a TPM-backed FDE system. So I think it's the same problem here if those desktop images use a systemd-based initramfs.
For now I suspect that the issue is due to systemd starting services and setting up UNIX sockets (e.g. /run/systemd/journal/dev-log, /run/systemd/notify and others) before doing the pivot_root and reexec. Then, when apparmor tries to resolve the path of the peer socket, I believe it fails here [1].
[1] https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/tree/fs/d_path.c#n125