Race condition when starting dbus services
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
systemd | Fix Released | Unknown | | | |
systemd (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Bionic | Fix Released | Medium | Victor Tapia | | |
Bug Description
[impact]
In certain scenarios, such as high load environments or when "systemctl daemon-reload" runs at the same time a dbus service is starting (e.g. systemd-logind), systemd is not able to track properly when the service has started, keeping the job 'running' forever.
[test case]
set up a 1-CPU VM with Bionic, and configure an SSH key so the user can ssh to localhost. Then run something like:
$ while timeout 5 ssh localhost true; do echo 'reloading'; sudo systemctl restart systemd-logind & sudo systemctl daemon-reload; done
if that doesn't work try:
$ while timeout 5 ssh localhost true; do echo 'reloading'; sudo sh -c 'systemctl restart systemd-logind & systemctl daemon-reload'; done
once the reproducer exits the while loop, there should be a running job for systemd-logind, and any logins attempted after the bug is reproduced should also hang waiting for the systemd-logind job to complete, e.g.:
ubuntu@
JOB UNIT TYPE STATE
525 systemd-
669 session-6.scope start waiting
664 session-5.scope start waiting
3 jobs listed.
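A small helper can make the check at the end of the reproducer less manual. The sketch below is not part of the original report: the function name `running_jobs` is hypothetical, and it assumes the four whitespace-separated columns (JOB UNIT TYPE STATE) shown in the listing above, as produced by `systemctl list-jobs --no-legend`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Reads `systemctl list-jobs --no-legend` output on stdin and prints the
# job id and unit of every job whose STATE column (4th field) is "running".
running_jobs() {
    awk '$4 == "running" { print $1, $2 }'
}

# Usage on the test VM (not executed here):
#   systemctl list-jobs --no-legend | running_jobs
```

If the bug has been reproduced, the job for systemd-logind should keep appearing in this output instead of completing.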
[regression potential]
any regression would likely involve services that are Type=dbus failing to complete starting. as with any systemd change, regressions could also involve assertion failures in systemd which cause it to exit.
[scope]
this is needed only for bionic.
this is fixed upstream with commit a5a8776ae5e4244
(per upstream bug) this was introduced by upstream commit 75152a4d6aedbfd
[original description]
In certain scenarios, such as high load environments or when "systemctl daemon-reload" runs at the same time a dbus service is starting (e.g. systemd-logind), systemd is not able to track properly when the service has started, keeping the job 'running' forever.
The issue appears when systemd runs the "AddMatch" dbus method call to track the service's "NameOwnerChanged" signal after that signal has already been emitted. A working instance would look like this:
https:/
A failing instance would be:
https:/
I've been able to reproduce the issue on Bionic (237-3ubuntu10.42) running:
sudo systemctl daemon-reload & sudo systemctl restart systemd-logind
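To observe the ordering described above from outside systemd, one option is to watch the bus for the relevant signal while running the reproducer. The following is a minimal sketch, not taken from the report: the function name `name_owner_match_rule` is hypothetical, and it assumes `dbus-monitor` and `busctl` are installed and that systemd-logind owns the well-known name `org.freedesktop.login1`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Builds a D-Bus match rule for NameOwnerChanged signals about one bus name.
# arg0 of NameOwnerChanged is the bus name whose owner changed.
name_owner_match_rule() {
    printf "type='signal',sender='org.freedesktop.DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0='%s'" "$1"
}

# Usage (not executed here):
#   # 1. Watch for ownership changes of logind's bus name:
#   dbus-monitor --system "$(name_owner_match_rule org.freedesktop.login1)"
#   # 2. Ask the bus who owns the name right now:
#   busctl call org.freedesktop.DBus /org/freedesktop/DBus \
#       org.freedesktop.DBus GetNameOwner s org.freedesktop.login1
```

A watcher that installs its match rule only after the signal was sent never sees it, which is why querying the current owner after subscribing is a common way to close this kind of gap. Whether the upstream fix takes exactly this approach is not stated in this report.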
tags: | added: sts |
Changed in systemd (Ubuntu Bionic): | |
assignee: | nobody → Victor Tapia (vtapia) |
importance: | Undecided → Medium |
status: | New → In Progress |
Changed in systemd (Ubuntu): | |
status: | New → Fix Released |
description: | updated |
description: | updated |
Changed in systemd: | |
status: | Unknown → Fix Released |
description: | updated |
tags: | added: verification-done verification-done-bionic removed: verification-needed verification-needed-bionic |
Restarting systemd-logind is not safe, as existing sessions can be logged out.
Performing daemon-reload mid-boot is also not safe.
Can you explain the use case and why these actions are performed together, racing each other?