Comment 35 for bug 1571209

Christian Ehrhardt  (paelzer) wrote :

Thank you Guessi for your summary once more.
I tried to get closer to your case.

# Install a Trusty system with RAID-1 that has room for many guests (a lot of memory) but is slow (1 CPU)
$ wget http://releases.ubuntu.com/14.04/ubuntu-14.04.5-server-amd64.iso
$ qemu-img create -f qcow2 trusty-disk1.qcow2 10G
$ qemu-img create -f qcow2 trusty-disk2.qcow2 10G
$ sudo qemu-system-x86_64 -enable-kvm -hda trusty-disk1.qcow2 -hdb trusty-disk2.qcow2 -cdrom ubuntu-14.04.5-server-amd64.iso -m 16384 -smp 1
# On this system, install SW RAID1 according to [1]
# I set up 20 guests with autostart as before
# system seems slow enough, but in addition break the secondary raid device

# I also took a copy and blasted a long 256M hole of zeroes after its partition table
$ dd if=/dev/zero bs=1024 of=trusty-disk2.qcow2 seek=4096 count=$((4096*64))
# Started the Trusty system again, but that blow was too much for it to recover at all
# Given that the startup already took so long (I made the qemu guest rather slow in general), I don't think I need that part to slow it down more.
# So instead I killed it hard to cause minor FS issues, which matches your power-loss case better anyway

On the following reboot the service was still fine and did not time out.
Guests were still starting a minute later, but nothing prevented libvirt from having its socket ready in time.

Message in /var/log/upstart/libvirt-bin.log is just ".. ready", no failure to be seen.

Instead of making the system even slower (unusable), I decided to shorten the retries: instead of 5 * 2 sec I did only 2 * 1 sec, and with that it finally triggered *yay*.

Checked another reboot and yes, still showing up.

Error is:
...
Giving up waiting for /var/run/libvirt/libvirt-sock
libvirt-bin stop/post-start, process 1217
    post-start process 1218
/usr/sbin/libvirtd: error: unable to determine if daemon is running: No such file or directory

So in >=Xenial, as I mentioned, none of this matters unless you explicitly opt into sysv/upstart. But in Trusty we now finally have a case to work with.

Trying a fix combining your suggestion with:
- incremental sleep times
- considering that the socket path may have been changed in libvirtd.conf
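
To make the two points above concrete, here is a rough sketch of what such a wait could look like. This is not the actual patch: the helper names sock_dir and wait_for_sock are made up for illustration, the retry count is arbitrary, and I am assuming /var/run/libvirt as the default socket directory when unix_sock_dir is not set in libvirtd.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the directory libvirtd is expected to use for
# its sockets. Honors an uncommented unix_sock_dir = "..." line in the given
# libvirtd.conf, otherwise falls back to the assumed default /var/run/libvirt.
sock_dir() {
    conf=${1:-/etc/libvirt/libvirtd.conf}
    dir=""
    if [ -r "$conf" ]; then
        # Commented-out lines do not match because the pattern is anchored.
        dir=$(sed -n 's/^[[:space:]]*unix_sock_dir[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$conf" | tail -n 1)
    fi
    echo "${dir:-/var/run/libvirt}"
}

# Hypothetical helper: wait for the socket with incremental sleep times
# (1s, 2s, 3s, ...). Returns 0 as soon as the socket exists and 1 once
# max_tries sleeps have passed without it showing up.
wait_for_sock() {
    sockfile=$1
    max_tries=${2:-5}
    try=1
    while [ ! -S "$sockfile" ]; do
        if [ "$try" -gt "$max_tries" ]; then
            echo "Giving up waiting for $sockfile" >&2
            return 1
        fi
        sleep "$try"
        try=$((try + 1))
    done
    return 0
}
```

A post-start script could then call something like: wait_for_sock "$(sock_dir)/libvirt-sock" 5
With 5 tries that waits up to 1+2+3+4+5 = 15 seconds in total, while a fast system still gets through after the first short sleep.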

[1]: https://help.ubuntu.com/community/Installation/SoftwareRAID