Server Team Backports PPA: latest upgrades to libvirt shuts down domains without restarting them

Bug #1965523 reported by Brian Turek
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
libvirt (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

My focal host recently patched itself to the newest libvirt version available in the Server Team Backports PPA (https://launchpad.net/~canonical-server/+archive/ubuntu/server-backports) which caused all my domains to shutdown and **not** restart despite being marked for autostart.

# lsb_release -rd
Description: Ubuntu 20.04.4 LTS
Release: 20.04

Excerpt from my /var/log/apt/history.log

Start-Date: 2022-03-18 07:02:13
Commandline: /usr/bin/unattended-upgrade
Upgrade: libvirt-daemon-system-systemd:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon-config-network:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-clients:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon-driver-storage-rbd:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon-driver-qemu:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon-system:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt0:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon-config-nwfilter:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1), libvirt-daemon:amd64 (7.6.0-0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1, 7.6.0-0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1)
End-Date: 2022-03-18 07:02:46

Excerpt from journalctl -u libvirtd:

Mar 18 07:02:20 REDACTED systemd[1]: Stopping Virtualization daemon...
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Succeeded.
Mar 18 07:02:20 REDACTED systemd[1]: Stopped Virtualization daemon.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 9913 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 10299 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 19444 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 19917 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 27108 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: libvirtd.service: Found left-over process 27660 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:20 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:20 REDACTED systemd[1]: Starting Virtualization daemon...
Mar 18 07:02:20 REDACTED systemd[1]: Started Virtualization daemon.
Mar 18 07:02:21 REDACTED systemd[1]: Stopping Virtualization daemon...
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Main process exited, code=dumped, status=11/SEGV
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Failed with result 'core-dump'.
Mar 18 07:02:24 REDACTED systemd[1]: Stopped Virtualization daemon.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 9913 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 10299 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 19444 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 19917 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 27108 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: libvirtd.service: Found left-over process 27660 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:24 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:24 REDACTED systemd[1]: Starting Virtualization daemon...
Mar 18 07:02:25 REDACTED systemd[1]: Started Virtualization daemon.
Mar 18 07:02:27 REDACTED systemd[1]: Stopping Virtualization daemon...
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Succeeded.
Mar 18 07:02:27 REDACTED systemd[1]: Stopped Virtualization daemon.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 9913 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 10299 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 19444 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 19917 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 27108 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: libvirtd.service: Found left-over process 27660 (virtiofsd) in control group while starting unit. Ignoring.
Mar 18 07:02:27 REDACTED systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mar 18 07:02:27 REDACTED systemd[1]: Starting Virtualization daemon...
Mar 18 07:02:27 REDACTED systemd[1]: Started Virtualization daemon.
Mar 18 07:02:37 REDACTED libvirtd[4095158]: libvirt version: 7.6.0, package: 0ubuntu1.1~backport20.04-202203171551~ubuntu20.04.1 (Launchpad Package Builder <email address hidden> Thu, 17 Mar 2022 15:51:50 +0000)
Mar 18 07:02:37 REDACTED libvirtd[4095158]: hostname: REDACTED
Mar 18 07:02:37 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:37 REDACTED ovs-vsctl[4099485]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED ovs-vsctl[4099605]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:38 REDACTED ovs-vsctl[4099677]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:39 REDACTED ovs-vsctl[4099722]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:39 REDACTED ovs-vsctl[4102892]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:39 REDACTED ovs-vsctl[4103115]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:39 REDACTED ovs-vsctl[4103132]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:42 REDACTED libvirtd[4095158]: internal error: End of file from qemu monitor
Mar 18 07:02:42 REDACTED ovs-vsctl[4103313]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port REDACTED
Mar 18 07:02:44 REDACTED systemd[1]: Stopping Virtualization daemon...
Mar 18 07:02:44 REDACTED systemd[1]: libvirtd.service: Succeeded.
Mar 18 07:02:44 REDACTED systemd[1]: Stopped Virtualization daemon.
Mar 18 07:02:44 REDACTED systemd[1]: Starting Virtualization daemon...
Mar 18 07:02:44 REDACTED systemd[1]: Started Virtualization daemon.
Mar 18 09:04:14 athens ovs-vsctl[4163579]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth--REDACTED -- set Interface veth-REDACTED "external-ids:attached-mac=\"REDACTED\"" -- set Interface veth-REDACTED "external-ids:vm-id=\"REDACTED\"" -- set Interface veth-REDACTED external-ids:iface-status=active

The last log entry is me manually starting up a VM that is a critical network service by simply doing a virsh start <DOMAIN> after I realized there was a problem.

What a "normal" libvirtd log looks like after a restart:

Mar 18 09:25:00 REDACTED systemd[1]: Starting Virtualization daemon...
Mar 18 09:25:04 REDACTED systemd[1]: Started Virtualization daemon.
Mar 18 09:25:11 REDACTED ovs-vsctl[6500]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:16 REDACTED ovs-vsctl[7031]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:19 REDACTED ovs-vsctl[9403]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:21 REDACTED ovs-vsctl[11914]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:24 REDACTED ovs-vsctl[17752]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:27 REDACTED ovs-vsctl[23094]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:29 REDACTED ovs-vsctl[27019]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...
Mar 18 09:25:31 REDACTED ovs-vsctl[29650]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-port br0 veth-REDACTED ...

Excerpt from journalctl -u libvirt-guests

Mar 18 07:02:29 REDACTED systemd[1]: Stopping Suspend/Resume Running libvirt Guests...
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095560]: Running guests on default URI:
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095527]: REDACTED, REDACTED, REDACTED, REDACTED, REDACTED, REDACTED, REDACTED, REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095604]: Shutting down guests on default URI...
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095622]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095639]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095654]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095671]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095692]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095713]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095732]: Starting shutdown on guest: REDACTED
Mar 18 07:02:29 REDACTED libvirt-guests.sh[4095749]: Starting shutdown on guest: REDACTED
Mar 18 07:02:30 REDACTED libvirt-guests.sh[4095527]: Waiting for 8 guests to shut down, 120 seconds left
Mar 18 07:02:36 REDACTED libvirt-guests.sh[4095527]: Waiting for 8 guests to shut down, 115 seconds left
Mar 18 07:02:38 REDACTED libvirt-guests.sh[4099549]: Shutdown of guest REDACTED complete.
Mar 18 07:02:39 REDACTED libvirt-guests.sh[4103083]: Shutdown of guest REDACTED complete.
Mar 18 07:02:39 REDACTED libvirt-guests.sh[4103099]: Shutdown of guest REDACTED complete.
Mar 18 07:02:39 REDACTED libvirt-guests.sh[4103105]: Shutdown of guest REDACTED complete.
Mar 18 07:02:39 REDACTED libvirt-guests.sh[4103111]: Shutdown of guest REDACTED complete.
Mar 18 07:02:40 REDACTED libvirt-guests.sh[4103154]: Shutdown of guest REDACTED complete.
Mar 18 07:02:40 REDACTED libvirt-guests.sh[4103161]: Shutdown of guest REDACTED complete.
Mar 18 07:02:42 REDACTED libvirt-guests.sh[4095527]: Waiting for 1 guests to shut down, 110 seconds left
Mar 18 07:02:42 REDACTED libvirt-guests.sh[4103324]: Shutdown of guest REDACTED complete.
Mar 18 07:02:42 REDACTED systemd[1]: libvirt-guests.service: Succeeded.
Mar 18 07:02:42 REDACTED systemd[1]: Stopped Suspend/Resume Running libvirt Guests.
Mar 18 07:02:42 REDACTED systemd[1]: Starting Suspend/Resume Running libvirt Guests...
Mar 18 07:02:42 REDACTED libvirt-guests.sh[4103333]: libvirt-guests is configured not to start any guests on boot
Mar 18 07:02:42 REDACTED systemd[1]: Finished Suspend/Resume Running libvirt Guests.

# virsh list --autostart
 Id Name State
-----------------------------
 1 REDACTED running
 2 REDACTED running
 3 REDACTED running
 4 REDACTED running
 5 REDACTED running
 6 REDACTED running
 7 REDACTED running
 8 REDACTED running

It is very unexpected behavior for an upgrade to (cleanly) shutdown all domains and not restart them.

Tags: server-todo
description: updated
Revision history for this message
Brian Turek (brian-turek) wrote :

I should add that I believe this actually started happening with 7.6.0, package: 0ubuntu1.1~backport20.04-202203140809~ubuntu20.04.1 but I initially thought it was a one-off problem.

Revision history for this message
Brian Turek (brian-turek) wrote :

I got some coffee in me and narrowed down the problem: the postinst file in libvirt-daemon-system was changed from just starting libvirt-guests.service to restarting libvirt-guests.service. libvirt-guests.service now shuts down all VMs but the autostart mechanism of libvirtd doesn't kick in because it was only a service restart, not a system restart.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Brian, that is very unfortunate - I've seen and worked on that in the most recent release (22.04) and there we fixed it interim-wise in libvirt and then finally in debhelper.

For some context see: https://bugs.launchpad.net/ubuntu/+source/debhelper/+bug/1959054

But that was due to a change in the -devel release which is kind of "ok" to break as long as things are fixed until the release. To happen to the backports in Focal means that someone backported the debhelper change to Focal which clearly violates SRU - a lot.
It turns out there is a gray area, which is that we are using debhelper from focal-backports and that still has the broken version.

=> debhelper | 13.5.2ubuntu1~bpo20.04.1 | focal-backports | source, all

Paride just filed bug 1965758 to address this in focal-backports.

For the time being I'm applying the same interim mitigation to the server-backports that we had in jammy until the problem was fixed. Let us hope not too many got affected by that.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I have built a fixed libvirt in server-backports mitigating the issue until debhelper is fixed.
=> 7.6.0-0ubuntu1.1~backport20.04-202203211319~ubuntu20.04.1
=> 7.6.0-0ubuntu1.1~backport18.04-202203211344~ubuntu18.04.1

I'll keep the bug open as I want to drop this change as soon as debhelper is fixed (via bug 1965758).

Revision history for this message
Brian Turek (brian-turek) wrote :

I can confirm that the upgrade now works as anticipated with VMs staying up, thanks for the quick turnaround!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for the confirmation, sorry for it to happen in the first place.
Would I have been aware of the backport of the broken debhelper it could have been prevented.

I'll drop the delta on next major merge after rechecking if debhelper is fixed then.
Until then I'm marking this fix released.

Changed in libvirt (Ubuntu):
status: New → Fix Released
tags: added: server-todo
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.