never kills dnsmasq servers

Bug #1818711 reported by Dimitri John Ledkov on 2019-03-05
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
libvirt (Ubuntu)
Low
Unassigned

Bug Description

never kills dnsmasq servers....

So on my machine, libvirtd.service spawns multiple dnsmasq daemons (one per network), yet because the libvirtd.service is set to KillMode=process these are never killed upon stopping libvirtd.

This has a side effect that throughout the shutdown transaction, dnsmasq processes remain running, all the way until systemd-shutdownd does a final kilall spree.

This is suboptimal.

I don't know why libvirt has KillMode=process, but my guess is that one ideally wants to be able to restart libvirt without killing VMs / dnsmasq state. However, when doing shutdown, these should be really killed.

I have a couple of "solutions". Can we like create libvirt-dnsmasq@.service units that would be like wantedby libvirtd.service (libvirt target?!). That way libvirtd can be restarted/killed without killing/restarting dnsmasq daemons. But these dnsmasq daemons can be stopped by systemd "normally" during shutdown.

But i don't know what other processes libvirt can launch, imho all of them should be "systemdified" such that libvirtd.service control group does not contain any stray processes.

Alternative, we might want to add a libvirt-shutdown.service that would do: `systemctl kill libvirtd.service` to actually kill all the things libvirtd might be running, after libvirtd is stopped.

This seems to appear to affect system shutdown / reboot speed.

Jim Fehlig (jfehlig) wrote :

Does your libvirt have this patch?

https://www.redhat.com/archives/libvir-list/2019-January/msg00389.html

If so, your dnsmasq profile may need updated to accommodate that change.

Dimitri John Ledkov (xnox) wrote :

I don't believe I do, nor does the surrounding context look similar.

Steve Langasek (vorlon) wrote :

> I don't know why libvirt has KillMode=process, but my guess is
> that one ideally wants to be able to restart libvirt without
> killing VMs / dnsmasq state. However, when doing shutdown,
> these should be really killed.

Yes, but the VM shutdown should also be managed by libvirt itself, since libvirt has configurable policy for VMs on shutdown (e.g. you may want to save the VM state to disk and resume on restart, NOT shoot the VM in the head). So any solution needs to take this into account.

Currently (quite a while actually [3]) the ordering on shutdown is like this:
  libvirtd -> domain -> virt-guest-shutdown.target -> save-restore.service
And as Steve mentioned "save-restore" is to allow to shutdown or save guest state - and that not only via the default libvirt-guests.service but if wanted also via other services, therefore the .target in [3].

As you assumed the KillMode [1] of libvirtd is explicitly set to NOT kill [2] processes as you'd want to start/restart domains (e.g. qemu processes) and networks (e.g. dnsmasq processes) independently and don't want to loose them only since you re-started (or even crashed) the libvirt daemon itself.

Therefore you'd really not want a cgroup cleanup or anything like a signal handler in libvirtd that cleans up all childs - but nobody here asked for it anyway so we are fine. Lets discuss how we could clean it up better on shutdown.

To only take action at the right time we'd need a way to differ between:
- "libvirtd.service is stopping or restarting"
- "the system is shutting down".

That is already done to shutdown/save guests to preserve them in libvirt-guests.service which - before libvirtd is gone. If it does nothing or fails it is considered fine that later on that the systemd killing spree reaps what is left.

To properly remove dnsmasq processes you'd want the equivalent of a "for each net do virsh net-destroy $net". That would need to be:
- on system shutdown only
- could use similar simple mechanisms like libvirt-guests.service
- would need to be ordered to be on shutdown after libvirt-guests.service (so guests still have network when they shutdown/safe)

#1
We could now make this part of libvirt-guests.service actions to do this - after cleaning up guests.
Pro:
- no new service in the jungle of dependencies
Con:
- other components might not assume it will reap networks
- the name isn't indicating it will reap networks
- purpose overloading of the already somewhat awkward libvirt-guests.service script

#2
Or we could add another service that is very much like libvirt-guests.service maybe libvirt-network-shutdown.service - ordered to run on shutdown (would have no real start action) after libvirt-guests.service and to iterate cleaning up all currently active libvirtd networks.
It would link to [3] and actually every other ordering the same way as libvirt-guests.service does.
Pro:
- cleaner separation of guest/network services
- explicit ordering
- ability to enable/disable service individually
Con:
- one more (albeit shutdown only oneshot) service

Once we have had a pre-discussion on this here I could write something patch worthy for an initial RFC. I'd lean towards #2, but want to hear other opinions first.

[1]: https://www.freedesktop.org/software/systemd/man/systemd.kill.html#
[2]: https://libvirt.org/git/?p=libvirt.git;a=commit;h=cb640543c86428a9f472e0c425b65e462d118afa
[3]: https://libvirt.org/git/?p=libvirt.git;a=commit;h=01079727fe29dfeafac751a07e564e0d6bb53389

Dimitri John Ledkov (xnox) wrote :

Meh, I'd rather see /usr/lib/libvirt/libvirt-guests.sh at the end of stop, iterate across `virsh net-list` and call `virsh net-destroy` on them.

However, this is too heavy weight for shutdown. As net-destroy does much more than just killing dnsmasq. It appears to remove the bridge & nics too (which in turn triggers netwokrd/networkd-dispatcher/udevd churn).

Ideally, I only want to kill the dnsmasq processes somehow.

I don't like #2 approach at all. It seems like introducing simple template units libvirt-dnsmasq@.service is better, with net-name as the instance name. And make them be part of the libvirtd cgroup, although not required. Because we want these to be shutdown normally, and it shouldn't be needed to e.g. destroy/recreate network interface, just to restart dnsmasq for a libvirt network. And libvirt should be using systemctl (on systemd booted machines) to start/stop libvirt-dnsmasq@.service instances imho. But also i guess, much larger code change, as libvirtd now need to learn to call systemctl to start/stop/restart units, rather than fork/exec dnsmasq.

Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → Low
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers