Comment 24 for bug 2059272

Mauricio Faria de Oliveira (mfo) wrote :

Steps with test packages on Focal (normal restarts)
---

Restart libvirtd 100 times per restart interval (20 intervals), with 10 QEMU domains running.

All domains continued to be managed by libvirt.

Create 10 test VMs (test-vm-1, test-vm-2, ..., test-vm-10):

 for NAME in test-vm-{1..10}; do cat <<-EOF >test-vms.xml && virsh define test-vms.xml && virsh start $NAME; done
 <domain type='qemu'>
   <name>${NAME}</name>
   <os>
     <type>hvm</type>
   </os>
   <memory unit='MiB'>32</memory>
   <vcpu>1</vcpu>
 </domain>
 EOF

Disable the systemd unit rate limiting for (re)starts:

 sudo mkdir -p /etc/systemd/system/libvirtd.service.d/
 cat <<EOF | sudo tee /etc/systemd/system/libvirtd.service.d/override.conf
 [Unit]
 StartLimitIntervalSec=0
 EOF
 sudo systemctl daemon-reload

Now, we'll restart libvirtd 100 times with a configurable interval between restarts.

We'll restart it 100 times _for each_ interval, so that the initialization
code path still running from the previous start overlaps with the shutdown
path under subtly different timings.

The restart intervals range from 0.1 seconds to 2.0 seconds, in steps of 0.1 seconds.
(I.e., libvirtd is started (2.0 / 0.1) * (1 + 100) = 2020 times in this test:
one initial start plus 100 restarts for each of the 20 intervals.)
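
The count above can be double-checked with a small sketch. The intervals are expressed in tenths of a second so that plain integer shell arithmetic is enough; the variable names are only illustrative and are not part of the test itself.

```shell
# Intervals in tenths of a second: 0.1 s .. 2.0 s in steps of 0.1 s.
first=1
step=1
last=20

# Number of distinct restart intervals.
intervals=$(( (last - first) / step + 1 ))

# One initial start (after the log reset) plus 100 restarts per interval.
starts_per_interval=$(( 1 + 100 ))

total=$(( intervals * starts_per_interval ))
echo "$intervals intervals x $starts_per_interval starts = $total"
```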

 for SLEEP in $(seq 0.1 0.1 2.0); do

   echo 'Reset libvirtd debug log'
   sudo systemctl stop 'libvirtd*'
   sudo rm -f /var/log/libvirt/libvirtd-debug.log
   sudo systemctl start libvirtd.service
   sleep 5

   for RESTART in $(seq 1 100); do
     echo "Sleep $SLEEP, Restart $RESTART"
     sudo systemctl restart libvirtd.service
     sleep $SLEEP
   done

   echo 'Check libvirtd debug log'
   sudo grep 'Leaving the update of' /var/log/libvirt/libvirtd-debug.log
   sudo cp /var/log/libvirt/libvirtd-debug.log /tmp/libvirtd-debug.log.SLEEP-${SLEEP}
   echo
 done 2>&1 | tee /tmp/libvirtd-restart.log

 Reset libvirtd debug log
 Sleep 0.1, Restart 1
 ...
 Sleep 0.1, Restart 100
 Check libvirtd debug log

 Reset libvirtd debug log
 Sleep 0.2, Restart 1
 ...
 Sleep 0.2, Restart 100
 Check libvirtd debug log

 ...

 Reset libvirtd debug log
 Sleep 2.0, Restart 1
 ...
 Sleep 2.0, Restart 100
 Check libvirtd debug log

Check that libvirtd was started 1+100 times for each restart interval:

 $ sudo grep -c 'libvirt version' /tmp/libvirtd-debug.log.SLEEP-*
 /tmp/libvirtd-debug.log.SLEEP-0.1:101
 /tmp/libvirtd-debug.log.SLEEP-0.2:101
 /tmp/libvirtd-debug.log.SLEEP-0.3:101
 /tmp/libvirtd-debug.log.SLEEP-0.4:101
 /tmp/libvirtd-debug.log.SLEEP-0.5:101
 /tmp/libvirtd-debug.log.SLEEP-0.6:101
 /tmp/libvirtd-debug.log.SLEEP-0.7:101
 /tmp/libvirtd-debug.log.SLEEP-0.8:101
 /tmp/libvirtd-debug.log.SLEEP-0.9:101
 /tmp/libvirtd-debug.log.SLEEP-1.0:101
 /tmp/libvirtd-debug.log.SLEEP-1.1:101
 /tmp/libvirtd-debug.log.SLEEP-1.2:101
 /tmp/libvirtd-debug.log.SLEEP-1.3:101
 /tmp/libvirtd-debug.log.SLEEP-1.4:101
 /tmp/libvirtd-debug.log.SLEEP-1.5:101
 /tmp/libvirtd-debug.log.SLEEP-1.6:101
 /tmp/libvirtd-debug.log.SLEEP-1.7:101
 /tmp/libvirtd-debug.log.SLEEP-1.8:101
 /tmp/libvirtd-debug.log.SLEEP-1.9:101
 /tmp/libvirtd-debug.log.SLEEP-2.0:101
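
Instead of eyeballing the list, the counts could also be checked mechanically. This is a minimal sketch: `check_counts` is a hypothetical helper (not something libvirt provides) that reads the `file:count` lines printed by `grep -c` and fails if any count differs from the expected value.

```shell
# Hypothetical helper: read "file:count" lines on stdin (grep -c format)
# and return non-zero if any count differs from the expected value ($1).
check_counts() {
  awk -F: -v expected="$1" '
    # The count is the last colon-separated field.
    $NF != expected { print "unexpected count: " $0; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}
```

Assuming the per-interval log copies from the loop above, it could be used as:

 sudo grep -c 'libvirt version' /tmp/libvirtd-debug.log.SLEEP-* | check_counts 101 && echo OK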

All VMs are still managed by libvirt:

 $ virsh list
  Id   Name         State
 ----------------------------
  2    test-vm-1    running
  3    test-vm-2    running
  4    test-vm-3    running
  5    test-vm-4    running
  6    test-vm-5    running
  7    test-vm-6    running
  8    test-vm-7    running
  9    test-vm-8    running
  10   test-vm-9    running
  11   test-vm-10   running
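
The same check can be scripted. This is a sketch only: `check_vms` is a hypothetical helper that reads one domain name per line on stdin and reports any of test-vm-1 .. test-vm-10 that is missing.

```shell
# Hypothetical helper: read domain names (one per line) on stdin and
# return non-zero if any of test-vm-1 .. test-vm-10 is not listed.
check_vms() {
  running=$(cat)
  missing=0
  for i in 1 2 3 4 5 6 7 8 9 10; do
    # -x matches whole lines only, so test-vm-1 does not match test-vm-10.
    if ! printf '%s\n' "$running" | grep -qxF "test-vm-$i"; then
      echo "not running: test-vm-$i"
      missing=1
    fi
  done
  return "$missing"
}
```

It could be fed from virsh, which prints only the names of running domains with these options:

 virsh list --name --state-running | check_vms && echo 'all test VMs running'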

Remove test VMs:

 for NAME in test-vm-{1..10}; do virsh destroy $NAME && virsh undefine $NAME; done

Note that the timing window for the shutdown-on-init race is so tight
that it was not hit even once in 2020 restarts (the new fix logs a message
when it is hit, and the grep below finds none). It really needs a synthetic reproducer.

 $ sudo grep 'Leaving' /tmp/libvirtd-debug.log.SLEEP-*
 $