Steps with test packages on Focal (shutdown-on-runtime)
---
Stop libvirtd systemd units
sudo systemctl stop 'libvirtd*'
Start libvirt in GDB
sudo gdb \
-iex 'set confirm off' \
-iex 'set pagination off' \
-ex 'set non-stop on' \
-ex 'handle SIGTERM nostop noprint pass' \
-ex 'add-symbol-file /usr/sbin/libvirtd' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
/usr/sbin/libvirtd
Add breakpoints for qemu driver cleanup and device deleted event
b qemuStateCleanup
b processDeviceDeletedEvent
run
Start test VM with a USB mouse device
cat <<-EOF >test-vm.xml
test-vmhvm321
EOF
virsh define test-vm.xml
virsh start test-vm
$ virsh list
Id Name State
-------------------------
1 test-vm running
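For reference, a minimal domain definition with a USB mouse could look like the sketch below. Only the name (test-vm), the OS type (hvm), and the values 32 and 1 appear in the steps above; the domain type, the memory unit, the mapping of 32 and 1 to memory and vCPUs, and the exact form of the input element are assumptions. The helper name write_test_vm_xml is hypothetical.

```shell
# Write a minimal domain XML to the path given as $1.
# Assumed (not from the original steps): type='qemu', unit='MiB',
# 32 as memory, 1 as vCPU count, and the <input> element attributes.
write_test_vm_xml() {
    cat >"$1" <<'EOF'
<domain type='qemu'>
  <name>test-vm</name>
  <os>
    <type>hvm</type>
  </os>
  <memory unit='MiB'>32</memory>
  <vcpu>1</vcpu>
  <devices>
    <input type='mouse' bus='usb'/>
  </devices>
</domain>
EOF
}
```

Running write_test_vm_xml test-vm.xml and then virsh define test-vm.xml would take the place of the heredoc step.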
Delete the USB mouse device
DEVICE_ID=$(virsh qemu-monitor-command test-vm --hmp 'info qtree' | grep 'dev: usb-mouse' | cut -d'"' -f2)
virsh qemu-monitor-command test-vm --hmp "device_del $DEVICE_ID"
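If the grep above matches nothing, DEVICE_ID is empty and device_del is called without an argument. A small guard, sketched below, makes that failure visible. The helper name usb_mouse_id is hypothetical, and it assumes 'info qtree' prints the device id in double quotes on the 'dev: usb-mouse' line, as the cut command above implies.

```shell
# Read 'info qtree' output on stdin and print the id of the first
# usb-mouse device; return non-zero if no id is found.
usb_mouse_id() {
    mouse_id=$(grep -m1 'dev: usb-mouse' | cut -s -d'"' -f2)
    [ -n "$mouse_id" ] && printf '%s\n' "$mouse_id"
}

# Possible usage (left commented out, since it needs the running test VM):
# DEVICE_ID=$(virsh qemu-monitor-command test-vm --hmp 'info qtree' | usb_mouse_id) \
#     || { echo "no usb-mouse device found" >&2; exit 1; }
# virsh qemu-monitor-command test-vm --hmp "device_del $DEVICE_ID"
```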
Back to GDB
Thread 20 "libvirtd" hit Breakpoint 2, 0x00007ffba902204e in processDeviceDeletedEvent (devAlias=<optimized out>, vm=0x7ffbac00de90, driver=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4888
Add a breakpoint on the domain status XML save function, and continue the thread above
b virDomainObjSave
t 20
c
Thread 20 "libvirtd" hit Breakpoint 3, virDomainObjSave (obj=0x7ffbac00de90, xmlopt=0x7ffbac044130, statusDir=0x7ffbac01f530 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157
Check the backtrace of the domain status XML save function, reached from the device deleted event
(gdb) bt
#0 virDomainObjSave (obj=0x7ffbac00de90, xmlopt=0x7ffbac044130, statusDir=0x7ffbac01f530 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157
#1 0x00007ffba9022127 in processDeviceDeletedEvent (devAlias=0x556074b5e3f0 "input0", vm=0x7ffbac00de90, driver=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4312
#2 qemuProcessEventHandler (data=0x556074b63a10, opaque=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4888
#3 0x00007ffbbee8f1af in virThreadPoolWorker (opaque=opaque@entry=0x556074c047a0) at ../../../src/util/virthreadpool.c:163
#4 0x00007ffbbee8e51c in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
#5 0x00007ffbbeb4f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6 0x00007ffbbea74353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Leave the thread at this point
Let's trigger the shutdown path
$ sudo kill $(pidof libvirtd)
Thread 1 "libvirtd" hit Breakpoint 1, qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1127
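The kill command above names no signal, so it sends SIGTERM. That is why GDB was started with 'handle SIGTERM nostop noprint pass': the signal goes straight through to libvirtd, which then enters its shutdown path. The sketch below illustrates the default signal with a throwaway background shell standing in for libvirtd; the function name demo_default_kill is hypothetical.

```shell
# Self-contained demo: a throwaway background shell stands in for libvirtd.
demo_default_kill() {
    out=$(mktemp)
    # The child prints a message and exits when it receives SIGTERM.
    sh -c 'trap "echo got SIGTERM; exit 0" TERM; while :; do sleep 1; done' >"$out" &
    child=$!
    sleep 1          # give the child time to install its trap
    kill "$child"    # no signal named, so SIGTERM is sent
    wait "$child"
    cat "$out"
    rm -f "$out"
}
```

Calling demo_default_kill should print the child's message, showing that the TERM trap fired.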
Check the function pointer is non-NULL _before_ cleanup
(gdb) p xmlopt.privateData.format
$1 = (virDomainXMLPrivateDataFormatFunc) 0x7ffba8f7c7c0
(gdb) p/x xmlopt.parent
$2 = {u = {dummy_align1 = 0x1cafe0027, dummy_align2 = 0x1cafe0027, s = {magic = 0xcafe0027, refs = 0x1}}, klass = 0x7ffbac044100}
Let cleanup run:
t 1
c &
Check the formatter/options again; it is *STILL* referenced, and no longer 0x0:
(gdb) p xmlopt.privateData.format
$3 = (virDomainXMLPrivateDataFormatFunc) 0x7ffba8f7c7c0
(gdb) p/x xmlopt.parent
$4 = {u = {dummy_align1 = 0x1cafe0027, dummy_align2 = 0x1cafe0027, s = {magic = 0xcafe0027, refs = 0x1}}, klass = 0x7ffbac044100}
Check that the shutdown/cleanup thread is waiting for the worker thread,
in the path that frees the worker thread pool:
(gdb) i th 1
Id Target Id Frame
1 Thread 0x7ffbbb035b40 (LWP 5887) "libvirtd" (running)
(gdb) t 1
(gdb) interrupt
(gdb) bt
#0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffbac05fd60) at ../sysdeps/nptl/futex-internal.h:183
#1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7ffbac05fce0, cond=0x7ffbac05fd38) at pthread_cond_wait.c:508
#2 __pthread_cond_wait (cond=0x7ffbac05fd38, mutex=0x7ffbac05fce0) at pthread_cond_wait.c:647
#3 0x00007ffbbee8e79b in virCondWait (c=<optimized out>, m=<optimized out>) at ../../../src/util/virthread.c:144
#4 0x00007ffbbee8f438 in virThreadPoolFree (pool=<optimized out>) at ../../../src/util/virthreadpool.c:286
#5 0x00007ffba8fed5d1 in qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1131
#6 0x00007ffbbf02c47f in virStateCleanup () at ../../../src/libvirt.c:669
#7 0x0000556072acebc8 in main (argc=<optimized out>, argv=<optimized out>) at ../../../src/remote/remote_daemon.c:1447
Let the save function continue, and libvirt finishes shutting down:
(gdb) c &
Continuing.
(gdb) t 20
(gdb) c
[Inferior 1 (process 5887) exited normally]
(gdb) q
Check the VM status XML *after*:
$ sudo grep -e '
Now, the next time libvirtd starts, it correctly parses that XML:
$ sudo systemctl start libvirtd.service
$ journalctl -b -u libvirtd.service | grep -A1 error
$
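The empty grep output above is the pass condition. For a scripted run, the same check can be turned into an exit status, as in this sketch; the function name no_errors_logged is hypothetical.

```shell
# Succeed only if no line on stdin contains the string "error".
no_errors_logged() {
    ! grep -q error
}

# Possible usage (left commented out, since it needs the systemd journal):
# journalctl -b -u libvirtd.service | no_errors_logged \
#     && echo "OK: no errors logged by libvirtd" \
#     || echo "FAIL: libvirtd logged errors"
```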
And libvirt is aware of the domain, and can manage it:
$ virsh list
Id Name State
-------------------------
1 test-vm running
$ virsh destroy test-vm
Domain test-vm destroyed
$ virsh undefine test-vm
Domain test-vm has been undefined
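The 'virsh list' check above can also be scripted. The sketch below parses the table layout shown in these steps (Id, Name, State columns); the function name domain_is_running is hypothetical, and the check would have to run before 'virsh destroy'.

```shell
# Read 'virsh list' output on stdin; succeed if the domain named in $1
# is listed with state "running".
domain_is_running() {
    awk -v name="$1" '$2 == name && $3 == "running" { found = 1 } END { exit !found }'
}

# Possible usage (left commented out, since it needs libvirtd and the test VM):
# virsh list | domain_is_running test-vm && echo "test-vm is managed and running"
```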