Comment 32 for bug 2059272

Mauricio Faria de Oliveira (mfo) wrote :

Verification done on focal-proposed, following comments 23, 24, 25, 26.

Including in this comment a few key snippets from each test/comment.


LXD virtual machine

 lxc launch --vm ubuntu:focal lp2059272-focal
 lxc exec lp2059272-focal -- su - ubuntu

Enable proposed & debug symbols

 cat <<EOF | sudo tee /etc/apt/sources.list.d/proposed.list
 deb focal-proposed main universe
 deb focal-proposed main universe
 EOF

 cat <<EOF | sudo tee /etc/apt/preferences.d/proposed
 Package: *
 Pin: release a=focal-proposed
 Pin-Priority: 400
 EOF

 sudo apt install --yes --no-install-recommends gdb qemu-system-x86 ubuntu-dbgsym-keyring
 sudo apt update
 sudo apt install --yes --no-install-recommends -t focal-proposed libvirt{0,-daemon{,-driver-qemu,-system}}{,-dbgsym} libvirt-clients

 $ apt-cache policy libvirt-daemon-driver-qemu
   Installed: 6.0.0-0ubuntu8.20
   Candidate: 6.0.0-0ubuntu8.20
   Version table:
  *** 6.0.0-0ubuntu8.20 400
  400 focal-proposed/main amd64 Packages
  100 /var/lib/dpkg/status
      6.0.0-0ubuntu8.19 500
  500 focal-updates/main amd64 Packages
  500 focal-security/main amd64 Packages
      6.0.0-0ubuntu8 500
  500 focal/main amd64 Packages
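
The installed version can also be checked non-interactively. A minimal sketch, assuming the `apt-cache policy` output format shown above; `installed_version` is a hypothetical helper name, not part of the original test steps:

```shell
# Hypothetical helper (an assumption, not from the original comment):
# print the version on the "Installed:" line of `apt-cache policy`
# output read from stdin.
installed_version() {
    awk '$1 == "Installed:" { print $2; exit }'
}
```

For example, `apt-cache policy libvirt-daemon-driver-qemu | installed_version` should print the version from focal-proposed if the pinning above took effect.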

 newgrp libvirt # or logout/login

Libvirtd debug logging

 cat <<-EOF | sudo tee -a /etc/libvirt/libvirtd.conf
 log_filters="1:qemu 1:libvirt"
 log_outputs="3:syslog:libvirtd 1:file:/var/log/libvirt/libvirtd-debug.log"
 EOF

Steps with test packages on Focal (normal restarts)

 for SLEEP in $(seq 0.1 0.1 2.0); do

All VMs are still managed by libvirt:

 $ virsh list
  Id   Name         State
  1    test-vm-1    running
  2    test-vm-2    running
  3    test-vm-3    running
  4    test-vm-4    running
  5    test-vm-5    running
  6    test-vm-6    running
  7    test-vm-7    running
  8    test-vm-8    running
  9    test-vm-9    running
  10   test-vm-10   running
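
With many restarts in a loop, counting the running domains is quicker than reading the list. A minimal sketch, assuming the `virsh list` output format above, where the state is the last field on each line; `count_running` is a hypothetical helper name:

```shell
# Hypothetical helper (an assumption, not from the original comment):
# count the lines of `virsh list` output (read from stdin) whose last
# field is "running". Header and separator lines do not match.
count_running() {
    awk '$NF == "running" { n++ } END { print n + 0 }'
}
```

For example, `virsh list | count_running` would be expected to print 10 for the listing above.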

Steps with test packages on Focal (shutdown-on-init)

Scenario 1) Shutdown wins race against XML update (ie, shutdown happens first)


Now, let the qemuProcessReconnect thread continue; it will not update the XML file
because 'quit' is set (ie, shutdown in progress):

 (gdb) t 20
 (gdb) p ((virNetDaemonPtr)anyobj)->quit
 $2 = true

 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

 (gdb) c &

 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml
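
The before/after `ls -l` comparison can be scripted by comparing modification times. A minimal sketch, assuming GNU `stat` (as on Ubuntu); `file_mtime` and `unchanged_since` are hypothetical helper names:

```shell
# Hypothetical helpers (assumptions, not from the original comment).

# file_mtime FILE: print FILE's modification time in seconds since the
# epoch (GNU stat syntax).
file_mtime() {
    stat -c %Y -- "$1"
}

# unchanged_since FILE MTIME: succeed if FILE's current modification
# time still equals MTIME, ie, the file was not rewritten in between.
unchanged_since() {
    [ "$(file_mtime "$1")" = "$2" ]
}
```

Record `before=$(file_mtime /run/libvirt/qemu/test-vm.xml)` ahead of the gdb `c &` step, then run `unchanged_since /run/libvirt/qemu/test-vm.xml "$before"` afterwards; success means the XML was not updated. Note the comparison has one-second granularity.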


 $ sudo grep 'Leaving the update of .* domain status XML' /var/log/libvirt/libvirtd-debug.log
 2024-04-24 12:08:40.054+0000: 3770: info : qemuProcessReconnect:8157 : Leaving the update of 'test-vm' domain status XML for the next initialization (shutdown detected on this initialization).


 $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
 <domstatus state='running' reason='booted' pid='3726'>
   <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
   <domain type='qemu' id='1'>

Scenario 2) Shutdown loses race against XML update (ie, update happens first)


Instead, let the qemuProcessReconnect thread take the lock and update the XML file, but not unlock yet:


 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

 (gdb) b virObjectUnlock thread 20 if anyobj == $ptr
 (gdb) c

 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml


 $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
 <domstatus state='running' reason='booted' pid='3726'>
   <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
   <domain type='qemu' id='1'>

Scenario 3) Shutdown happens along QEMU monitor calls (ie, calls don't finish)


The XML was not updated, as expected:

 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml

 $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
 <domstatus state='running' reason='booted' pid='3726'>
   <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
   <domain type='qemu' id='1'>
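
The attributes of interest in the status XML can be pulled out individually rather than read from the grep output. A minimal sketch, assuming the single-quoted attribute style shown above; `domstatus_attr` is a hypothetical helper name:

```shell
# Hypothetical helper (an assumption, not from the original comment):
# domstatus_attr ATTR: print the value of ATTR from the <domstatus ...>
# element of a status XML read from stdin. Assumes single-quoted
# attribute values, as in the snippets above.
domstatus_attr() {
    sed -n "s/.*<domstatus[^>]* $1='\([^']*\)'.*/\1/p"
}
```

For example, `sudo cat /run/libvirt/qemu/test-vm.xml | domstatus_attr pid` would print the recorded QEMU PID, which can then be passed to `ps -p` to confirm the process is still alive.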

Now, the next time libvirtd starts, it correctly parses that XML:

 $ sudo systemctl start libvirtd.service

 $ journalctl -b -u libvirtd.service | grep -A1 error

And libvirt is aware of the domain, and can manage it:

 $ virsh list
  Id   Name      State
  1    test-vm   running

 $ virsh destroy test-vm
 Domain test-vm destroyed

 $ virsh undefine test-vm
 Domain test-vm has been undefined

Steps with test packages on Focal (shutdown-on-runtime)

Check the formatter/options again; it is *STILL* referenced, not 0x0 anymore:

 (gdb) t 20
 (gdb) p xmlopt.privateData.format
 $3 = (virDomainXMLPrivateDataFormatFunc) 0x7fd08c3437c0 <qemuDomainObjPrivateXMLFormat>
 (gdb) p/x xmlopt.parent
 $4 = {u = {dummy_align1 = 0x1cafe0026, dummy_align2 = 0x1cafe0026, s = {magic = 0xcafe0026, refs = 0x1}}, klass = 0x7fd080043170}

Let the save function continue, and libvirt finishes shutting down:
Check the VM status XML *after*:

 $ ls -l /run/libvirt/qemu/test-vm.xml
 -rw------- 1 root root 10251 Apr 24 12:28 /run/libvirt/qemu/test-vm.xml

 $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
 <domstatus state='running' reason='booted' pid='4055'>
   <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
   <domain type='qemu' id='1'>

Now, the next time libvirtd starts, it correctly parses that XML:

 $ sudo systemctl start libvirtd.service

 $ journalctl -b -u libvirtd.service | grep -A1 error

And libvirt is aware of the domain, and can manage it:

 $ virsh list
  Id   Name      State
  1    test-vm   running

 $ virsh destroy test-vm
 Domain test-vm destroyed

 $ virsh undefine test-vm
 Domain test-vm has been undefined