Shutdown when triggering daemon-reload early in boot

Bug #2037281 reported by Valentin David
This bug affects 1 person

Affects           Status        Importance  Assigned to    Milestone
systemd (Ubuntu)  Fix Released  Undecided   Unassigned
Focal             Fix Released  High        Nick Rosbrook
Jammy             Fix Released  High        Nick Rosbrook

Bug Description

In Ubuntu Core 20 and Ubuntu Core 22, we are encountering an issue: if a service that starts before udev has processed devices runs `systemctl daemon-reload`, the system shuts down. This happens because the device units for mounted filesystems are temporarily considered dead, which pulls most units down.

This was fixed upstream in https://github.com/systemd/systemd/pull/23218.

However, the fix was not backported to the versions used by the Ubuntu packages for focal and jammy. The commit needed from that PR is the one with the message `core/device: ignore DEVICE_FOUND_UDEV bit on switching root`.

This patch applies to 245.4-4ubuntu3.22 (focal) without needing a rebase, and I suppose it does for jammy as well.

I have manually tested the fix with Ubuntu Core 20, and this fixes our issue.

We would like this patch to be backported to focal-updates and jammy-updates.

Thank you in advance.

[ Impact ]

If a user adds a service that calls `systemctl daemon-reload`, this service is started before systemd-udevd, and the initrd uses systemd (as is the case on Ubuntu Core), then most services will be stopped or cancelled, and the machine will shut down most units and hang.

The fix has been backported down to 250 upstream. It is already on kinetic and later.

The fix only affects systems where systemd is used in initrd.

[ Test Plan ]

On Ubuntu Core 20 (with the Core 22 kernel), on Ubuntu Core 22, or on any system that uses systemd in the initrd:

Add a systemd service that calls `systemctl daemon-reload`.
The service should be enabled and have `DefaultDependencies=no` so that it starts as early as possible.

Restart the machine.

If the fix is not applied, most units will be shut down after the service starts, and the system will be unusable.
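The reproducer described in this test plan can be sketched as a small shell helper. This is only an illustration: the function name `write_reload_unit` and the unit name `early-daemon-reload.service` are hypothetical, and ordering the unit before `systemd-udev-trigger.service` is one way to make it run before udev has processed devices. It writes the unit file into a directory given as an argument, so it can be tried in a scratch directory first.

```shell
#!/bin/sh
# Write a oneshot unit that runs `systemctl daemon-reload` early in boot.
# Usage: write_reload_unit DIR [NAME]
# The default unit name below is hypothetical, chosen for illustration.
write_reload_unit() {
    unit_dir=$1
    unit_name=${2:-early-daemon-reload.service}
    mkdir -p "$unit_dir" || return 1
    # DefaultDependencies=no lets the unit start as early as possible.
    cat > "$unit_dir/$unit_name" <<'EOF'
[Unit]
Description=Trigger daemon-reload early in boot (reproducer)
DefaultDependencies=no
Before=systemd-udev-trigger.service

[Service]
Type=oneshot
ExecStart=systemctl daemon-reload

[Install]
WantedBy=multi-user.target
EOF
}

# On a disposable test machine, as root:
#   write_reload_unit /etc/systemd/system
#   systemctl enable early-daemon-reload.service
#   reboot
```

Only run the commented steps on a machine that can be reflashed or recovered, since an unfixed system is expected to become unusable after the reboot.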

[ Where problems could occur ]

This should only affect systems that use systemd in the initrd.

There are risks on systems that have a udev rule in the initrd that is not present in the main system.

There are risks on systems that use db_persist in the initrd, where the device can potentially end up in the dead state. This does not seem to happen on Ubuntu Core 22, though, even though we use db_persist for device mapper devices. The regression is upstream bug #23429. The commits named "core/device: device_coldplug(): don't set DEVICE_DEAD" and "core/device: do not downgrade device state if it is already enumerated" could be applied as well.

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Focal):
importance: Undecided → High
Changed in systemd (Ubuntu Jammy):
importance: Undecided → High
Changed in systemd (Ubuntu):
status: New → Fix Released
tags: added: systemd-sru-next
Nick Rosbrook (enr0n)
tags: added: foundations-todo
Changed in systemd (Ubuntu Focal):
assignee: nobody → Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Jammy):
assignee: nobody → Nick Rosbrook (enr0n)
Michael Vogt (mvo)
description: updated
description: updated
Changed in systemd (Ubuntu):
importance: Undecided → Critical
importance: Critical → High
importance: High → Undecided
Valentin David (valentin.david) wrote :

Just adding a comment on how to reproduce the bug, because I lost a day trying to figure it out again. The failure occurs only when secure boot is not used, because the device mapper state is then kept from the initrd to the main boot.

Steve Langasek (vorlon)
summary: - Shutdown when triggering daemon-reload eary in boot
+ Shutdown when triggering daemon-reload early in boot
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Valentin, or anyone else affected,

Accepted systemd into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/249.11-0ubuntu3.12 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Jammy):
status: New → Fix Committed
tags: added: verification-needed verification-needed-jammy
Brian Murray (brian-murray) wrote :

Hello Valentin, or anyone else affected,

Accepted systemd into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/245.4-4ubuntu3.23 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed-focal
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/249.11-0ubuntu3.12)

All autopkgtests for the newly accepted systemd (249.11-0ubuntu3.12) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

casync/2+20201210-1build1 (arm64, ppc64el)
exim4/4.95-4ubuntu2.4 (ppc64el)
indicator-session/17.3.20+21.10.20210613.1-0ubuntu1 (armhf)
linux-aws-5.19/5.19.0-1029.30~22.04.1 (arm64)
linux-aws-6.5/6.5.0-1011.11~22.04.1 (arm64)
linux-azure-6.2/6.2.0-1018.18~22.04.1 (arm64)
linux-azure-6.5/6.5.0-1010.10~22.04.1 (arm64)
linux-gcp-5.19/5.19.0-1030.32~22.04.1 (arm64)
linux-gcp-6.2/6.2.0-1019.21~22.04.1 (arm64)
linux-hwe-5.19/5.19.0-50.50 (arm64)
linux-hwe-6.2/6.2.0-39.40~22.04.1 (arm64)
linux-lowlatency-hwe-6.2/6.2.0-1018.18~22.04.1 (arm64)
mediawiki/1:1.35.6-1 (s390x)
mosquitto/2.0.11-1ubuntu1.1 (amd64, arm64, armhf, s390x)
samba/2:4.15.13+dfsg-0ubuntu1.5 (ppc64el)
systemd/unknown (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Valentin David (valentin.david) wrote :

I have done manual tests with jammy-proposed and focal-proposed.

I added the following service in /etc/systemd/system, enabled it, and then rebooted.
```
[Unit]
DefaultDependencies=no
Before=systemd-udev-trigger.service

[Service]
Type=oneshot
ExecStart=systemctl daemon-reload

[Install]
WantedBy=multi-user.target
```

If the bug is present, the boot does not complete and filesystems get unmounted. With the fix, the boot completes and it is possible to ssh into the machine.

Results of the test:

Ubuntu Core 22
Without jammy-proposed: systemd 249.11-0ubuntu3.11 -> broken
With jammy-proposed: systemd 249.11-0ubuntu3.12 -> fixed

Ubuntu Core 20, with pc-kernel 22/beta
Without focal-proposed: systemd 245.4-4ubuntu3.22 -> broken
With focal-proposed: systemd 245.4-4ubuntu3.23 -> fixed

So I confirm that the systemd packages in jammy-proposed and focal-proposed fix this issue.
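For anyone repeating this verification, the version check can be scripted. The sketch below encodes the fixed versions reported in this bug and compares an installed version string against them. The helper names are hypothetical, and GNU `sort -V` is used only as an approximation of Debian version ordering that happens to be sufficient for these particular strings; `dpkg --compare-versions` is the authoritative tool where it is available.

```shell
#!/bin/sh
# Fixed systemd versions per series, as reported in this bug.
fixed_version_for() {
    case "$1" in
        focal) echo "245.4-4ubuntu3.23" ;;
        jammy) echo "249.11-0ubuntu3.12" ;;
        *) return 1 ;;
    esac
}

# True when version $1 sorts at or after version $2 under GNU `sort -V`.
# Approximation only: not a full Debian version comparison.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage: has_fix SERIES INSTALLED_VERSION
# Returns 0 if the fix is included, 1 if not, 2 for an unknown series.
has_fix() {
    fixed=$(fixed_version_for "$1") || return 2
    version_ge "$2" "$fixed"
}

# Example, on a system where dpkg-query is available:
#   has_fix jammy "$(dpkg-query -W -f='${Version}' systemd)" && echo fixed
```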

tags: added: verification-done-focal verification-done-jammy
removed: verification-needed-focal verification-needed-jammy
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/245.4-4ubuntu3.23)

All autopkgtests for the newly accepted systemd (245.4-4ubuntu3.23) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

casync/2+20190213-1 (armhf)
gvfs/1.44.1-1ubuntu1.2 (ppc64el)
linux-gcp-5.15/5.15.0-1048.56~20.04.1 (arm64)
linux-hwe-5.15/5.15.0-91.101~20.04.1 (armhf)
linux-oracle-5.15/5.15.0-1049.55~20.04.1 (arm64)
mariadb-10.3/1:10.3.38-0ubuntu0.20.04.1 (armhf)
netplan.io/0.104-0ubuntu2~20.04.4 (s390x)
puppet/5.5.10-4ubuntu3 (armhf)
upower/0.99.11-1build2 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 249.11-0ubuntu3.12

---------------
systemd (249.11-0ubuntu3.12) jammy; urgency=medium

  * core/device: ignore DEVICE_FOUND_UDEV bit on switching root (LP: #2037281)
    File: debian/patches/lp2037281-core-device-ignore-DEVICE_FOUND_UDEV-bit-on-switching-roo.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=00f86f0b20f794f30aabe7181912d2ec2207e292
  * use read-only /etc hack in more places (LP: #2035122)
    File: debian/patches/debian/UBUNTU-Support-system-image-read-only-etc.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=c57406e850396a5d446aefe5e70a3aeaad080d72
  * autopkgtest: do not allow qemu to be used on ppc64el.
    Almost every run on ppc64el takes 12 to 24 hours, so do this as a last
    resort to relieve pressure on autopkgtest infrastructure.
    File: debian/tests/upstream
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=d125a1ed3f01e59dba2f370c13801bfb76c16f5d

 -- Nick Rosbrook <email address hidden> Tue, 21 Nov 2023 15:57:17 -0500

Changed in systemd (Ubuntu Jammy):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 245.4-4ubuntu3.23

---------------
systemd (245.4-4ubuntu3.23) focal; urgency=medium

  [ Nick Rosbrook ]
  * core/device: ignore DEVICE_FOUND_UDEV bit on switching root (LP: #2037281)
    File: debian/patches/lp2037281-core-device-ignore-DEVICE_FOUND_UDEV-bit-on-switching-roo.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=7793563bb38a84a3dc6bc0da1c08546c3b915ab8
  * dns-query: bump CNAME_MAX to 16 (LP: #2024009)
    File: debian/patches/lp2024009-dns-query-bump-CNAME_MAX-to-16.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=193899d103d44c642d362e9916b14df844ec702f
  * Fall back to kexec when no kexec binary exists (LP: #1969365)
    File: debian/patches/lp1969365-Fall-back-to-kexec-when-no-kexec-binary-exists.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=3934f3794427dee4e72824998dd4c6e6d5875289
  * test: ignore LXC filesystem when checking for writable locations (LP: #2029352)
    File: debian/patches/lp2029352-test-ignore-LXC-filesystem-when-checking-for-writable-loc.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=70facbfbf54c4ffb31ba392dbe3fec3084fdf3bc

  [ Heitor Alves de Siqueira ]
  * core/mount: adjust deserialized state based on /proc/self/mountinfo (LP: #1837227)
    Author: Heitor Alves de Siqueira
    File: debian/patches/lp1837227-core-mount-adjust-deserialized-state-based-on-proc-self-m.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=a0a749953d309f48bc45140102adf205d1071c4d

 -- Nick Rosbrook <email address hidden> Tue, 21 Nov 2023 16:10:21 -0500

Changed in systemd (Ubuntu Focal):
status: Fix Committed → Fix Released