Container file system corruption on libvirtd restart
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
libvirt (Ubuntu) | Incomplete | High | Unassigned |
Bug Description
A data corruption bug exists in the LXC driver for libvirt that has just cost me a MySQL server.
Steps to reproduce:
- (for visualization only) In virt-manager add a connection to local lxc://
- create an LXC container that has a loop-mounted image file, and start it
- (for visualization only) the container shows as running in virt-manager
- systemctl stop libvirtd ; sleep 2 ; sync ; systemctl start libvirtd
- (for visualization only) the container shows as shut off in virt-manager
- The container no longer responds to network requests and has no attachable console
- The loop mount no longer shows up in host-side "mount" output
BUT: losetup -a reveals that a loop device is still attached to the image file
BUT: In reality this loop device is still mounted, and processes in the container still access the file system
BUT: There is no way to unmount or free it: losetup -d exits without an error but does nothing
- restart the container (virsh -c lxc:// start name-of-container or via virt-manager)
THIS SHOULD NOT BE ALLOWED
- The image file is now mounted twice and corruption starts creeping in
- Depending on how long this state persists (in terms of IO), the damage can be significant
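The stale state described in the steps above (a loop device still attached to the image file, but no matching entry in the host's mount table) can be checked for before starting the container again. The following is only a minimal sketch, not anything libvirt itself does: the function names and the image path are hypothetical, and it assumes `losetup -a` prints lines of the form `/dev/loopN: [dev]:inode (/path/to/image)`. A hit is a reason to investigate, not proof of the bug.

```python
import re
import subprocess

# Assumed shape of one `losetup -a` line, e.g.
#   /dev/loop0: [2049]:1234567 (/var/lib/images/db.img)
LOSETUP_LINE = re.compile(r"^(?P<dev>/dev/loop\d+):.*\((?P<file>[^)]*)\)")


def attached_loops(losetup_output):
    """Map loop device -> backing file, parsed from `losetup -a` text."""
    loops = {}
    for line in losetup_output.splitlines():
        match = LOSETUP_LINE.match(line.strip())
        if match:
            loops[match.group("dev")] = match.group("file")
    return loops


def mounted_devices(proc_mounts):
    """Set of source devices named in /proc/mounts-style text."""
    return {line.split()[0] for line in proc_mounts.splitlines() if line.split()}


def hidden_loops(losetup_output, proc_mounts, image_path):
    """Loop devices backed by image_path that the host mount table does not list."""
    mounted = mounted_devices(proc_mounts)
    return sorted(
        dev
        for dev, backing in attached_loops(losetup_output).items()
        if backing == image_path and dev not in mounted
    )


if __name__ == "__main__":
    image = "/var/lib/images/db.img"  # hypothetical image path
    # Seeing backing files usually requires root.
    losetup_text = subprocess.run(
        ["losetup", "-a"], capture_output=True, text=True, check=True
    ).stdout
    with open("/proc/mounts") as fh:
        mounts_text = fh.read()
    stale = hidden_loops(losetup_text, mounts_text, image)
    if stale:
        print("Do NOT start the container; image still attached to:", stale)
```

Running this as a guard before `virsh -c lxc:// start name-of-container` would at least flag the condition in which a second start mounts the image twice.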
Once the problem is finally discovered, the only way to unstick the container is a reboot. This is the final nail in the coffin: the hidden instance syncs AFTER the new instance, effectively writing the past back over the present.
This can be quite nasty if a libvirt restart results from an unattended upgrade.
I do understand that libvirt/LXC is deprecated, but this strikes me as a rather unsubtle way to push users to the newest incarnation.
In non-enterprisy environments (read SMB or NGO) virt-manager is often used as a "power user" tool, and those end users are unwilling if not unable to use different toolsets for containers and full-fledged VMs. And disabling unattended upgrades in such an environment is inviting trouble.
affects: udev (Ubuntu) → libvirt (Ubuntu)
Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → High
Hi,
This would be deemed a high-priority bug for upstream libvirt, but Ubuntu has always, back to 2010, supported lxc, then lxd, instead of libvirt-lxc. (So it's not that libvirt-lxc is deprecated; rather, it was never supported in Ubuntu.)
Which version of Ubuntu are you using?
Can you reliably reproduce it? If you can give a recipe along the lines of "start with a clean Ubuntu cloud VM image; set up a container 'like this', do that, then it dies", then we may be able to nail down the cause and/or talk to upstream.
If it is at all possible to migrate to lxd containers, that would be ideal, but I assume you have control software written around libvirt's API, making that untenable.