LXC with r/w sys and udev keeps trying to unmount bind mounts
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
systemd | Fix Released | Undecided | Unassigned |
systemd (Ubuntu) | Fix Released | Medium | Martin Pitt |
Vivid | Fix Released | Medium | Unassigned |
Bug Description
I recently hit ENOSPC which turned out to be the result of various syslogs from the current and previous boots totalling up to 30G.
They were filled with messages like
Apr 15 11:22:35 vivid systemd[1]: Unit var-lib-
Apr 15 11:22:35 vivid systemd[1]: Unmounting /var/lib/
Apr 15 11:22:35 vivid umount[31795]: umount: /var/lib/
Apr 15 11:22:35 vivid umount[31795]: (In some cases useful info about processes that
Apr 15 11:22:35 vivid umount[31795]: use the device is found by lsof(8) or fuser(1).)
Apr 15 11:22:35 vivid systemd[1]: var-lib-
Apr 15 11:22:35 vivid systemd[1]: Failed unmounting /var/lib/
Apr 15 11:22:35 vivid systemd[1]: Unit var-lib-
Apr 15 11:22:35 vivid systemd[1]: Unmounting /var/lib/
Apr 15 11:22:35 vivid systemd[1]: Unmounted /var/lib/
Apr 15 11:22:35 vivid systemd[1]: Unit var-lib-
looping constantly for the duration of any builds in sbuild.
Why is systemd trying to do this?
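To get a feel for how fast such a loop fills the logs, the unmount attempts can be counted per mount point. The following is only a rough sketch: the function name `count_unmount_attempts` is made up for illustration, and it assumes syslog lines shaped like the ones quoted in this report ("systemd[1]: Unmounting <path>..."). Mount points containing spaces are cut at the first space.

```shell
# Count how often systemd logged an unmount attempt for each mount point.
# Usage: count_unmount_attempts /var/log/syslog
count_unmount_attempts() {
    awk '
        / systemd\[1\]: Unmounting / {
            # The mount point is the word right after "Unmounting".
            for (i = 1; i <= NF; i++) {
                if ($i == "Unmounting") {
                    path = $(i + 1)
                    sub(/\.+$/, "", path)   # drop the trailing "..."
                    count[path]++
                    break
                }
            }
        }
        END {
            for (path in count) printf "%d %s\n", count[path], path
        }
    ' "$1" | sort -rn
}
```

The output is one "count path" pair per line, with the most frequently unmounted path first.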
SRU TEST CASE:
--------------
- Create a vivid container, and set "lxc.aa_profile = unconfined" and "lxc.mount.auto = sys:rw cgroup" in its config. (Note that the latter is not supported!)
- Start the container
- Run "mount -v -o bind /bin /mnt"
- Observe that it doesn't stay mounted, but "sudo journalctl" says
Mai 06 09:17:08 test systemd[1]: Unit mnt.mount is bound to inactive unit dev-sda3.device. Stopping, too.
Mai 06 09:17:08 test systemd[1]: Unmounting /mnt...
Mai 06 09:17:08 test systemd[1]: Unmounted /mnt.
- With the fixed package bind mounts stay mounted.
REGRESSION POTENTIAL: This could potentially break cleanup of stale mount points of either hotplug devices which disappear, or media which get forcefully ejected (like CDs). Testing should include that these still work.
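The SRU test case can be scripted roughly as follows. This is a sketch under assumptions, not part of the official test plan: the function names and the 5-second default wait are invented for illustration, and `bind_mount_survives` has to run as root inside the test container. The check relies on field 5 of each /proc/self/mountinfo line being the mount point (paths with spaces appear escaped there and are not handled).

```shell
# Succeed if TARGET is listed as a mount point in a mountinfo file.
# Usage: is_mount_point TARGET [MOUNTINFO_FILE]
is_mount_point() {
    awk -v target="$1" '
        $5 == target { found = 1 }
        END { exit !found }
    ' "${2:-/proc/self/mountinfo}"
}

# Bind-mount SRC on DST, wait, then report whether the mount is still there.
# Usage (as root, inside the container): bind_mount_survives /bin /mnt
bind_mount_survives() {
    src=$1
    dst=$2
    delay=${3:-5}
    mount -o bind "$src" "$dst" || return 2
    sleep "$delay"
    if is_mount_point "$dst"; then
        echo "OK: $dst is still mounted"
    else
        echo "FAIL: $dst was unmounted again"
        return 1
    fi
}
```

With the affected package the second function would be expected to print the FAIL line; with the fixed package it should print OK.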
laney@vivid> apt-cache policy systemd
systemd:
Installed: 219-7ubuntu1
Candidate: 219-7ubuntu1
Version table:
*** 219-7ubuntu1 0
500 http://
100 /var/lib/
|
#4 |
Bump. System is unbootable with 219 release. Raising importance.
|
#5 |
It looks like applying this patch:
https:/
allows devices to be mounted.
Iain Lane (laney) wrote : | #1 |
These are going over the bus
signal sender=:1.0 -> dest=(null destination) serial=10268593 path=/org/
string "org.freedeskto
array [
dict entry(
string "What"
variant string "/dev/disk/
)
dict entry(
string "Options"
variant string "rw,relatime,
)
dict entry(
string "Type"
variant string "ext4"
)
dict entry(
string "ControlPID"
variant uint32 0
)
dict entry(
string "Result"
variant string "exit-code"
)
]
array [
string "ExecMount"
string "ExecUnmount"
string "ExecRemount"
]
signal sender=:1.0 -> dest=(null destination) serial=10268594 path=/org/
string "org.freedeskto
array [
dict entry(
string "ActiveState"
variant string "active"
)
dict entry(
string "SubState"
variant string "mounted"
)
dict entry(
string "InactiveExitTi
variant uint64 1429097554602566
)
dict entry(
string "InactiveExitTi
variant uint64 10242667599
)
dict entry(
string "ActiveEnterTim
variant uint64 1429097829298497
)
dict entry(
string "ActiveEnterTim
variant uint64 10517363530
)
dict entry(
string "ActiveExitTime
variant uint64 1429097829292302
)
dict entry(
string "ActiveExitTime
variant uint64 10517357335
)
dict entry(
string "InactiveEnterT
variant uint64 0
)
dict entry(
string "InactiveEnterT
variant uint64 0
)
dict entry(
string "Job"
variant struct {
}
)
dict entry(
string "ConditionResult"
variant boolean false
)
dict entry(
string "AssertResult"
variant boolean false
)
dict entry(
string "ConditionTimes
variant uint64 0
)
dict entry(
string "ConditionTimes
variant uint64 0
)
dict entry(
string "AssertTimestamp"
...
|
#7 |
(In reply to Tomasz Paweł Gajc from comment #2)
> Looks like that applying this patch:
> https:/
> dependencies-
>
> allows to mount devices.
Thank you very much, this patch helps me. I wonder why upstream hasn't fixed this problem.
This is a bad bug.
Please see my report on the Arch Linux bug tracker on it here: https:/
This report is using systemd from Arch's [testing] repo which seems to be version 219 with some patches applied by the Arch maintainers which try to address this bug: systemd 219-6, https:/
In summary, even when the bug has been "fixed" (allowing mounts to be made again) there are cases when systemd is unmounting things it should not be when the user runs umount (seems like it unmounts ALL mounts associated with a device whenever any mount associated with that device is umounted).
A bug that breaks filesystem mounting/unmounting seems particularly critical. Whatever new logic 219 introduced that created this bug should probably be reverted from the systemd release until all the mounting and unmounting corner cases can be tested and covered.
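To see which mounts are at risk when one mount of a device is unmounted, it helps to list every mount point that shares the same source device. A minimal sketch, assuming the standard /proc/self/mountinfo layout (the source device is the second field after the lone "-" separator); the function name `mounts_of_device` is hypothetical:

```shell
# Print every mount point whose source is DEVICE, one per line.
# Usage: mounts_of_device DEVICE [MOUNTINFO_FILE]
mounts_of_device() {
    awk -v dev="$1" '
        {
            # Fields 1-6 are fixed; optional fields follow until a lone "-".
            for (i = 7; i <= NF; i++) {
                if ($i == "-") {
                    # After "-": filesystem type, then the mount source.
                    if ($(i + 2) == dev) print $5
                    break
                }
            }
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running it before and after a single umount shows whether more than the requested mount point disappeared.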
|
#9 |
Another use case to complement the report:
- system is a home server running Arch linux, full disk encrypted with dmcrypt/LUKS and remotely unlocked via dropbear_
- it was updated today with the latest systemd packages ({lib}systemd{
What happens: after rebooting I am locked out at the initramfs stage with the following error messages:
- on client trying to ssh in dropbear_initrd:
Device /dev/disk/
- on server:
Running systemd 219
(dropbear initialization sequence, everything is ok)
Starting dropbear
[123] Apr 23 09:43:56 Running in background
(try to connect remotely via SSH)
Pubkey auth succeeded for 'root' with key xxx from 192.xxx
syslogin_
Exit (root) Disconnect received
ERROR: device '/dev/mapper/
ERROR: Unable to find root device '/dev/mapper/
You are being dropped to a recovery shell
If I don't try to log in via SSH, there is a 15-second delay between the [123] line and the first ERROR; there is no way whatsoever to unlock locally, as used to be the case ("enter passphrase for /dev/disk/
My kernel command line is BOOT_IMAGE=
|
#10 |
Created attachment 115298
Proposed patch
With this patch systemd does not unmount manually mounted devices.
Iain Lane (laney) wrote : | #2 |
Seems the revert in the linked bug stops this happening for me.
Changed in systemd: | |
importance: | Unknown → Critical |
status: | Unknown → Confirmed |
Changed in systemd (Ubuntu): | |
status: | New → In Progress |
importance: | Undecided → Medium |
assignee: | nobody → Martin Pitt (pitti) |
Changed in systemd (Ubuntu Vivid): | |
milestone: | none → vivid-updates |
Martin Pitt (pitti) wrote : Re: While sbuilding, systemd loops attempting to umount the underlay | #11 |
Iain, https:/
- a full journalctl from boot up to the point where this surfaces, preferably with "debug"; or
- a recipe how to reproduce this in a VM; I (and I'm sure many others) use sbuild daily without getting this, so somehow your config is special.
Thanks!
Martin Pitt (pitti) wrote : | #12 |
Not fixed by that commit. Reproducible in ssh ubuntu@10.55.32.96
Changed in systemd: | |
importance: | Critical → Undecided |
status: | Confirmed → New |
summary: |
- While sbuilding, systemd loops attempting to umount the underlay
+ While sbuilding in LXLC, systemd loops attempting to umount the underlay |
summary: |
- While sbuilding in LXLC, systemd loops attempting to umount the underlay
+ While sbuilding in LXC, systemd loops attempting to umount the underlay |
Changed in systemd (Ubuntu): | |
milestone: | vivid-updates → none |
Martin Pitt (pitti) wrote : Re: While sbuilding in LXC, systemd loops attempting to umount the underlay | #13 |
Simpler reproducer:
- Create a vivid container, and set "lxc.aa_profile = unconfined"
- Start the container
- Run "mount -v -o bind /bin /mnt"
Mai 06 09:17:08 test systemd[1]: Unit mnt.mount is bound to inactive unit dev-sda3.device. Stopping, too.
Mai 06 09:17:08 test systemd[1]: Unmounting /mnt...
Mai 06 09:17:08 test systemd[1]: Unmounted /mnt.
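The unit names in this log (mnt.mount, dev-sda3.device) are derived from the mount point and the device path. On a system with systemd installed, `systemd-escape --path --suffix=mount /mnt` prints the real name. The function below is only a simplified sketch of that mapping for plain paths: the name `path_to_unit` is made up, and it handles just slashes and dashes, not the other characters that systemd escapes.

```shell
# Map a filesystem path to a systemd unit name (simplified).
# Usage: path_to_unit PATH SUFFIX    e.g. path_to_unit /mnt mount
path_to_unit() {
    # Strip leading and trailing slashes.
    trimmed=$(printf '%s\n' "$1" | sed -e 's:^/*::' -e 's:/*$::')
    if [ -z "$trimmed" ]; then
        # The root directory maps to "-".
        printf '%s\n' "-.$2"
        return 0
    fi
    # Escape literal dashes first, then turn path separators into dashes.
    escaped=$(printf '%s\n' "$trimmed" | sed -e 's:-:\\x2d:g' -e 's:/\{1,\}:-:g')
    printf '%s.%s\n' "$escaped" "$2"
}
```

For example, `path_to_unit /mnt mount` gives mnt.mount and `path_to_unit /dev/sda3 device` gives dev-sda3.device, matching the two units named in the "is bound to inactive unit" message.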
Martin Pitt (pitti) wrote : | #14 |
In my local container, http://
Adding these options in the container starts udev and reproduces this issue again:
lxc.mount.auto = sys:rw cgroup
lxc.mount.entry = /etc/schroot /srv/lxc/
Martin Pitt (pitti) wrote : | #15 |
FTR: pull in http://
Martin Pitt (pitti) wrote : | #16 |
I proposed a patch for this upstream: http://
Iain, it would be great if you could confirm that it works for you as well? I attach a patched amd64 /lib/systemd/
Iain Lane (laney) wrote : Re: [Bug 1444402] Re: While sbuilding in LXC, systemd loops attempting to umount the underlay | #17 |
On Thu, May 14, 2015 at 10:55:50AM -0000, Martin Pitt wrote:
> I proposed a patch for this upstream:
> http://
>
> Iain, it would be great if you could confirm that it works for you as
> well? I attach a patched amd64 /lib/systemd/
> that's useful for you.
Cheers Martin. I applied that patch to Ubuntu's systemd package and
started the container with that.
Happy to report that it seems to work fine for me. The mounts stay
intact as they are supposed to.
--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]
Martin Pitt (pitti) wrote : Re: While sbuilding in LXC, systemd loops attempting to umount the underlay | #18 |
First patch was nack'ed; second proposal: http://
Attaching an updated /lib/systemd/
tags: | added: systemd-boot |
summary: |
- While sbuilding in LXC, systemd loops attempting to umount the underlay
+ LXC with r/w sys and udev keeps trying to unmount bind mounts |
Martin Pitt (pitti) wrote : | #19 |
This is the command line I'm using for test iteration. It's not that simple, so I want to keep it here in case I ever need it again.
This builds systemd from git, copies it into a "test" container, starts it, runs the bind mount/umount, and shows the debug log:
schroot -r -c session:
Martin Pitt (pitti) wrote : | #20 |
Refined command line to also verify that the tentative device *does* move to "dead" once the last reference gets unmounted:
schroot -r -c session:
Martin Pitt (pitti) wrote : | #21 |
New set of proposed patches:
http://
http://
Attached /lib/systemd/
Martin Pitt (pitti) wrote : | #22 |
I got http://
Changed in systemd: | |
status: | New → Fix Released |
Martin Pitt (pitti) wrote : | #23 |
After some more back and forth we fixed the remaining issues as well now:
http://
http://
Martin Pitt (pitti) wrote : | #24 |
Changed in systemd (Ubuntu): | |
status: | In Progress → Fix Committed |
Changed in systemd (Ubuntu Vivid): | |
assignee: | Martin Pitt (pitti) → nobody |
description: | updated |
Launchpad Janitor (janitor) wrote : | #25 |
This bug was fixed in the package systemd - 219-10ubuntu1
---------------
systemd (219-10ubuntu1) wily; urgency=medium
* Merge with Debian experimental branch. Remaining Ubuntu changes:
- Hack to support system-image read-only /etc, and modify files in
/
- Keep our much simpler udev maintainer scripts (all platforms must
support udev, no debconf).
- initramfs init-top: Drop $ROOTDELAY, we do that in a more sensible way
with wait-for-root. Will get applicable to Debian once Debian gets
wait-for-root in initramfs-tools.
- initramfs init-bottom: If LVM is installed, settle udev,
otherwise we get missing LV symlinks. Workaround for LP #1185394.
- Add debian/
dependencies to "lvm2" which is handled with udev rules in Ubuntu.
- Add debian/
script.
- Provide shutdown fallback for upstart. (LP: #1370329)
- debian/
really support "allow-hotplug" in Ubuntu at the moment, so we need to
deal with "auto" devices appearing after "/etc/init.
already ran. (LP: #1374521) Also run ifup in the background during boot,
to avoid blocking network.target. (LP: #1425376)
- ifup@.service: Drop dependency on networking.service (i. e.
/
This avoids unnecessary dependencies/
cycles if hooks wait for other interfaces to come up (like ifenslave
with bonding interfaces). (LP: #1414544)
- Add Get-RTC-
Ubuntu we currently keep the setting whether the RTC is in local or UTC
time in /etc/default/rcS "UTC=yes|no", instead of /etc/adjtime.
(LP: #1377258)
- Put session scopes into all cgroup controllers. This makes unprivileged
user LXC containers work under systemd. (LP: #1346734)
- systemctl: Don't forward telinit u to upstart. This works around
upstart's Restart() always reexec'ing /sbin/init on Restart(), even if
that changes to point to systemd during the upgrade. This avoids running
systemd during a dist-upgrade. (LP: #1430479)
- Drop hwdb-update dependency from udev-trigger.
introduced in v219-stable. This causes udev and plymouth to start too
late and isn't really needed in Ubuntu yet as we don't support stateless
systems yet and handle hwdb.bin updates through dpkg triggers. This can
be dropped again with initramfs-tools 0.117.
- Lower Breaks: to plymouth version which has the udev inotify fix in
Ubuntu.
- Lower libapparmor dep to the Ubuntu version where it moved to /lib.
- Lower apparmor Breaks: to the Ubuntu version that dropped $remote_fs.
- Change systemd-sysv's conflicts to upstart-sysv. (LP: #1422681)
- Make failure of boot-and-services NSpawn.test_boot non-fatal for now.
This currently fails when being triggered by Jenkins, but is totally
...
Changed in systemd (Ubuntu): | |
status: | Fix Committed → Fix Released |
Ross Patterson (rossp) wrote : | #26 |
How can one get this fix in vivid?
Martin Pitt (pitti) wrote : | #27 |
@Ross: I already prepared the fix some two weeks ago, it's waiting in https:/
Hello Iain, or anyone else affected,
Accepted systemd into vivid-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-
Further information regarding the verification process can be found at https:/
Changed in systemd (Ubuntu Vivid): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed |
Iain Lane (laney) wrote : | #29 |
Confirmed the fix on a cloud VM, thanks!
tags: |
added: verification-done
removed: verification-needed |
Launchpad Janitor (janitor) wrote : | #30 |
This bug was fixed in the package systemd - 219-7ubuntu6
---------------
systemd (219-7ubuntu6) vivid; urgency=medium
* Fix assertion crash with empty Exec*= paths. (LP: #1454173)
* systemd-fsckd autopkgtest: Stop assuming that
/etc/
* systemd-fsckd autopkgtest: Add missing plymouth test dependency.
* debian/
* Fix "tentative" state of devices which are not in /dev (mostly in
containers), and avoid overzealous cleanup unmounting of mounts from them.
(LP: #1444402)
* journal: Gracefully handle failure to bind to audit socket, which is known
to fail in namespaces (containers) with current kernels. Also
conditionalize systemd-
(LP: #1457054)
* Add sigpwr-
a container. This makes lxc-stop work for systemd containers.
(LP: #1457321)
-- Martin Pitt <email address hidden> Thu, 21 May 2015 14:47:46 +0200
Changed in systemd (Ubuntu Vivid): | |
status: | Fix Committed → Fix Released |
Chris J Arges (arges) wrote : Update Released | #31 |
The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Long story short.
After the update to systemd 219, dracut was unable to mount devices during live ISO boot for the subsequent switch-root, so booting the ISO in live mode was not possible. An example:
LOOPDEV=$( losetup -f )
losetup -r $LOOPDEV /live/media/LiveOS/squashfs.img
mount -n -t squashfs -o ro $LOOPDEV /live/distrib
After some deep digging, and after getting the system to boot into a real root, even a simple mount command did not mount anything:
mount -n -t squashfs -o ro /media/OpenMandriva_2015.0/LiveOS/squashfs.img /media/test
Applying these patches fixed the issue: /bugzilla.gnome.org/show_bug.cgi?id=743891
https:/