Comment 0 for bug 1829823

Matthew Ruffell (mruffell) wrote :

[Impact]

 * libvirt-bin is affected in libvirt-1.3.1-1ubuntu10.24~cloud0 in the trusty
   mitaka UCA, and in the parent package in xenial, libvirt-1.3.1-1ubuntu10.24.

 * When you shut down a system in trusty which is running some kvm virtual machines,
   the libvirt-bin service is stopped before libvirt-guests. libvirt-guests tries
   to connect to the libvirt socket to send shutdown commands to the running vms,
   which cannot happen since libvirtd is not running.

 * On some machines, the qemu processes behind the virtual machines are not killed
   and are left behind as defunct processes, which can cause the system to hang
   while waiting for them to terminate.

 * The bug is caused by the libvirt-bin upstart script [1] calling a non-existent
   script, /usr/lib/libvirt/libvirt-stop-guests [2]. This logic used to live
   in the upstart script itself in version 1.2.2-0ubuntu13.1.27 [3], the version
   in the trusty archives. In liberty UCA, version 1.2.16-2ubuntu11.15.10.4~cloud0
   [4], it was separated out into /usr/lib/libvirt/libvirt-stop-guests [2].
   In the mitaka release, the libvirt-stop-guests script was removed and rewritten
   as /etc/init.d/libvirt-guests [5], but the upstart script in [1] was never
   updated to point to it.

   [1] http://paste.ubuntu.com/p/GxxBczkCmk
   [2] http://paste.ubuntu.com/p/fKCDQh46vh
   [3] http://paste.ubuntu.com/p/QrKXqK2Bvz
   [4] http://paste.ubuntu.com/p/W8DgQwpYv3
   [5] http://paste.ubuntu.com/p/Z28Sp2fPd6

 * Since the upstart script was never updated to point to it, libvirt-bin stops
   without stopping libvirt-guests first. When libvirt-guests is stopped later,
   it cannot access the libvirt socket and so cannot shut down the machines,
   causing the bug.

 * The fix is to change the upstart script to point to the new libvirt-guests
   script.
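
As a rough illustration of the fix described above, the shell body of the
upstart pre-stop section could call the new init script instead of the removed
helper. This is a minimal sketch and not the actual patch: the function name
stop_guests and the fallback message are hypothetical, and only the path
/etc/init.d/libvirt-guests comes from the description above.

```shell
# Hypothetical helper, not the packaged upstart script.
# Stops guests through the init script that replaced libvirt-stop-guests.
stop_guests() {
    script=${1:-/etc/init.d/libvirt-guests}
    if [ -x "$script" ]; then
        # Ignore a non-zero exit so the caller is not aborted under "set -e".
        "$script" stop || true
    else
        echo "guest shutdown script not found: $script" >&2
    fi
}
```

Checking that the script exists and is executable first avoids repeating the
original failure mode, where the pre-stop section called a path that was no
longer shipped.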

[Test Case]

* You can reproduce this in trusty with the mitaka UCA enabled.

1) Enable mitaka UCA and install libvirt0 and libvirt-bin

$ sudo add-apt-repository cloud-archive:mitaka
$ sudo apt update
$ sudo apt install libvirt0 libvirt-bin

2) Install a virtual machine, either by using virt-install or virt-manager.
   I used a bionic VM.

3) Enable debugging on libvirt-guests so you can see what is going on

Modify /etc/init.d/libvirt-guests and add "-x" to the end of "#!/bin/sh"
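
The edit in step 3 can also be scripted. The sketch below assumes the first
line of the script is exactly "#!/bin/sh" and that GNU sed is available; the
function name enable_sh_trace is hypothetical, and the demonstration runs on a
scratch file rather than the real init script.

```shell
# Hypothetical helper: append " -x" to a bare "#!/bin/sh" shebang on line 1.
enable_sh_trace() {
    sed -i '1s|^#!/bin/sh[[:space:]]*$|#!/bin/sh -x|' "$1"
}

# Demonstrate on a scratch file rather than the real init script.
demo=$(mktemp)
printf '#!/bin/sh\necho hello\n' > "$demo"
enable_sh_trace "$demo"
head -n 1 "$demo"
```

On the test machine the same function would be run as root against
/etc/init.d/libvirt-guests.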

4) With the vm running, shut down the system

$ sudo shutdown -h now

5) On reboot, check /var/log/upstart/libvirt-bin.log. It will say
"No such file or directory: /usr/lib/libvirt/libvirt-stop-guests"

6) During that shutdown, you will see messages like:
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
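
The check in step 5 can be sketched as a small script. It only searches for the
path of the removed helper, since the exact wording of the logged error may
differ from the quote above. The function name log_shows_missing_helper is
hypothetical.

```shell
# Hypothetical helper: succeeds if the log mentions the removed helper script.
log_shows_missing_helper() {
    grep -qF '/usr/lib/libvirt/libvirt-stop-guests' "$1"
}

# Default to the upstart log named in step 5; a different path may be passed.
log=${1:-/var/log/upstart/libvirt-bin.log}
if log_shows_missing_helper "$log" 2>/dev/null; then
    echo "affected: upstart job still calls libvirt-stop-guests"
else
    echo "no reference to libvirt-stop-guests found in $log"
fi
```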

What should happen:

If you follow the same steps with the fixed package, when you look at
/var/log/upstart/libvirt-bin.log, you will see the output of libvirt-guests
connecting to and shutting down the virtual machines, which looks a little like
this: https://paste.ubuntu.com/p/s4jyJX2y9F/

[Regression Potential]

 * There is only one file modified, the upstart script for libvirt-bin.
   Currently this upstart file references a file which doesn't exist, so fixing
   it will restore the behaviour in a way which aligns exactly with what took
   place in previous versions.

 * In xenial, there is no concern of stopping an already stopped service in the
   event that the upstart script pre-stop section is called by systemd.

 * This change only affects systems during shutdown while they still
   have virtual machines running, and does not affect starting and stopping
   services while the machine is running normally.

 * I believe the regression potential is low.

[Other Info]

 * Xenial is not affected by this bug even though it ships the exact same packages.
   This is because xenial uses insserv, which parses the scripts in /etc/init.d/
   to generate the service dependency files ".depend.boot", ".depend.start" and
   ".depend.stop", and systemd respects the dependency ordering in these files.
   libvirt-guests reports a dependency on libvirt-bin in the script header,
   so systemd will always stop libvirt-guests before libvirt-bin, avoiding the
   problem seen in trusty.

 * The fix is needed in trusty mitaka UCA, and xenial will likely need the SRU
   as part of the process.
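
To inspect the dependency that an init script declares in its header, something
like the sketch below could be used. The function name lsb_field is
hypothetical, and the header in the scratch file is illustrative only; the
fields in the real /etc/init.d/libvirt-guests may differ.

```shell
# Hypothetical helper: print the value of one LSB header field from an init script.
lsb_field() {
    sed -n "s/^# $2:[[:space:]]*//p" "$1"
}

# Scratch file with an illustrative header; the real script's fields may differ.
header_demo=$(mktemp)
cat > "$header_demo" <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides:          libvirt-guests
# Required-Stop:     $remote_fs libvirt-bin
### END INIT INFO
EOF
lsb_field "$header_demo" Required-Stop
```

A Required-Stop entry naming libvirt-bin means libvirt-bin must still be
available while libvirt-guests stops, which is the ordering described above
for xenial.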