Steps with test packages on Focal (shutdown-on-init)
---

Start test VM

    cat <<-EOF >test-vm.xml
    test-vm hvm 32 1
    EOF

    virsh define test-vm.xml
    virsh start test-vm

    $ virsh list
     Id   Name      State
    -------------------------
     1    test-vm   running

Stop libvirtd systemd units

    sudo systemctl stop 'libvirtd*'


Scenario 1) Shutdown wins race against XML update (ie, shutdown happens first)

Start libvirtd in GDB

    sudo gdb \
      -iex 'set confirm off' \
      -iex 'set pagination off' \
      -ex 'set non-stop on' \
      -ex 'handle SIGTERM nostop noprint pass' \
      -ex 'add-symbol-file /usr/sbin/libvirtd' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
      /usr/sbin/libvirtd

Stop on initialization

    (gdb) b qemuStateInitialize
    (gdb) run

    Thread 17 "libvirtd" hit Breakpoint 1, qemuStateInitialize (privileged=true, callback=0x5558939f10c0 , opaque=0x555893b905d0) at ../../../src/qemu/qemu_driver.c:644

Save the daemon 'opaque' pointer in $ptr (the global variable qemu_driver_dmn is not accessible):

    (gdb) p qemu_driver_dmn
    Cannot access memory at address 0x1e39a8

    (gdb) p 'src/qemu/qemu_driver.c'::qemu_driver_dmn
    Cannot access memory at address 0x1e39a8

    (gdb) t 17
    (gdb) set $ptr = opaque

Run until qemuProcessReconnect

    (gdb) b qemuProcessReconnect
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 2, qemuProcessReconnect (opaque=0x7fd82c054900) at ../../../src/qemu/qemu_process.c:7922

Run this thread until the lock on qemu_driver_dmn:

    (gdb) b virObjectLock thread 20 if anyobj == $ptr
    (gdb) t 20
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 3, virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427

See that the daemon is not yet shutting down

    (gdb) t 20
    (gdb) p ((virNetDaemonPtr)anyobj)->quit
    $1 = false

Stop the shutdown path in the main thread on the lock on qemu_driver_dmn

    (gdb) b virObjectLock thread 1 if anyobj == $ptr

    $ sudo kill $(pidof libvirtd)

    Thread 1 "libvirtd" hit Breakpoint 4, virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427

    (gdb) t 1
    (gdb) bt
    #0  virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427
    #1  0x00007fd83eabc2d5 in virNetDaemonSignalEvent (watch=watch@entry=2, fd=, events=events@entry=1, opaque=opaque@entry=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:630
    #2  0x00007fd83e97da0d in virEventPollDispatchHandles (fds=0x555893bc21c0, nfds=) at ../../../src/util/vireventpoll.c:503
    #3  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
    #4  0x00007fd83e97c095 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:353
    #5  0x00007fd83eabd495 in virNetDaemonRun (dmn=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:836
    #6  0x00005558939ef7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

Let it deliver the signal

    (gdb) c

    Thread 1 "libvirtd" hit Breakpoint 4, virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427

    (gdb) bt
    #0  virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427
    #1  0x00007fd83eabd2ed in virNetDaemonQuit (dmn=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:854
    #2  0x00007fd83eabc33e in virNetDaemonSignalEvent (watch=watch@entry=2, fd=, events=events@entry=1, opaque=opaque@entry=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:645
    #3  0x00007fd83e97da0d in virEventPollDispatchHandles (fds=0x555893bc21c0, nfds=) at ../../../src/util/vireventpoll.c:503
    #4  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
    #5  0x00007fd83e97c095 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:353
    #6  0x00007fd83eabd495 in virNetDaemonRun (dmn=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:836
    #7  0x00005558939ef7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

Let it set 'quit'

    (gdb) c

    Thread 1 "libvirtd" hit Breakpoint 4, virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427

    (gdb) bt
    #0  virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427
    #1  0x00007fd83eabd4a5 in virNetDaemonRun (dmn=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:841
    #2  0x00005558939ef7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

Let it take the lock in the event loop

    (gdb) finish
    Run till exit from #0  virObjectLock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:427
    virNetDaemonRun (dmn=0x555893b905d0) at ../../../src/rpc/virnetdaemon.c:843

Run until the unlock, and let it unlock

    (gdb) b virObjectUnlock thread 1 if anyobj == $ptr
    (gdb) c

    Thread 1 "libvirtd" hit Breakpoint 5, virObjectUnlock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:504

    (gdb) finish
    Run till exit from #0  virObjectUnlock (anyobj=0x555893b905d0) at ../../../src/util/virobject.c:504
    main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1434

Now let the qemuProcessReconnect thread continue; it will not update the XML file, because 'quit' is set (ie, shutdown in progress)

    (gdb) t 20
    (gdb) p ((virNetDaemonPtr)anyobj)->quit
    $2 = true

    $ ls -l /run/libvirt/qemu/test-vm.xml
    -rw------- 1 root root 10189 Apr 12 19:03 /run/libvirt/qemu/test-vm.xml

    (gdb) c &

    $ ls -l /run/libvirt/qemu/test-vm.xml
    -rw------- 1 root root 10189 Apr 12 19:03 /run/libvirt/qemu/test-vm.xml

This can be confirmed in the log at 'info' level:

    $ sudo grep 'Leaving the update of .* domain status XML' /var/log/libvirt/libvirtd-debug.log
    2024-04-12 19:22:55.466+0000: 5274: info : qemuProcessReconnect:8157 : Leaving the update of 'test-vm' domain status XML for the next initialization (shutdown detected on this initialization).

Delete breakpoints and let it run to completion. libvirtd finishes.

    (gdb) del br
    (gdb) t 1
    (gdb) c

    [Inferior 1 (process 5194) exited normally]

    (gdb) q

The XML file still has the '


Scenario 2) Shutdown loses race against XML update (ie, update happens first)

    sudo gdb \
      -iex 'set confirm off' \
      -iex 'set pagination off' \
      -ex 'set non-stop on' \
      -ex 'handle SIGTERM nostop noprint pass' \
      -ex 'add-symbol-file /usr/sbin/libvirtd' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
      /usr/sbin/libvirtd

    (gdb) b qemuStateInitialize
    (gdb) run

    Thread 17 "libvirtd" hit Breakpoint 1, qemuStateInitialize (privileged=true, callback=0x56262420d0c0 , opaque=0x562624b325d0) at ../../../src/qemu/qemu_driver.c:644

Save the 'opaque' pointer (qemu_driver_dmn):

    (gdb) t 17
    (gdb) set $ptr = opaque

Run until qemuProcessReconnect

    (gdb) b qemuProcessReconnect
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 2, qemuProcessReconnect (opaque=0x7fb50c261f60) at ../../../src/qemu/qemu_process.c:7922

Run this thread until the lock on qemu_driver_dmn:

    (gdb) b virObjectLock thread 20 if anyobj == $ptr
    (gdb) t 20
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 3, virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427

See that the daemon is not yet shutting down

    (gdb) t 20
    (gdb) p ((virNetDaemonPtr)anyobj)->quit
    $1 = false

Stop the main thread on the lock on qemu_driver_dmn, in the event loop

    (gdb) b virObjectLock thread 1 if anyobj == $ptr

    $ sudo kill $(pidof libvirtd)

    Thread 1 "libvirtd" hit Breakpoint 4, virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427

    (gdb) t 1
    (gdb) bt
    #0  virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427
    #1  0x00007fae5e7a12d5 in virNetDaemonSignalEvent (watch=watch@entry=2, fd=, events=events@entry=1, opaque=opaque@entry=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:630
    #2  0x00007fae5e662a0d in virEventPollDispatchHandles (fds=0x562624b641c0, nfds=) at ../../../src/util/vireventpoll.c:503
    #3  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
    #4  0x00007fae5e661095 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:353
    #5  0x00007fae5e7a2495 in virNetDaemonRun (dmn=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:836
    #6  0x000056262420b7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

Let it deliver the signal

    (gdb) c

    Thread 1 "libvirtd" hit Breakpoint 4, virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427

    (gdb) bt
    #0  virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427
    #1  0x00007fae5e7a22ed in virNetDaemonQuit (dmn=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:854
    #2  0x00007fae5e7a133e in virNetDaemonSignalEvent (watch=watch@entry=2, fd=, events=events@entry=1, opaque=opaque@entry=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:645
    #3  0x00007fae5e662a0d in virEventPollDispatchHandles (fds=0x562624b641c0, nfds=) at ../../../src/util/vireventpoll.c:503
    #4  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
    #5  0x00007fae5e661095 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:353
    #6  0x00007fae5e7a2495 in virNetDaemonRun (dmn=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:836
    #7  0x000056262420b7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

Do NOT let it set 'quit' yet.
Instead, let the qemuProcessReconnect thread take the lock and update the XML file, but not unlock yet

    (gdb) t 20
    (gdb) bt
    #0  virObjectLock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:427
    #1  0x00007fae487b922d in qemuProcessReconnect (opaque=) at ../../../src/qemu/qemu_process.c:8155
    #2  0x00007fae5e6c054a in virThreadHelper (data=) at ../../../src/util/virthread.c:196
    #3  0x00007fae5e381609 in start_thread (arg=) at pthread_create.c:477
    #4  0x00007fae5e2a6353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

    ubuntu@lp2059272:~$ ls -l /run/libvirt/qemu/test-vm.xml
    -rw------- 1 root root 10189 Apr 12 19:03 /run/libvirt/qemu/test-vm.xml

    (gdb) b virObjectUnlock thread 20 if anyobj == $ptr
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 5, virObjectUnlock (anyobj=0x562624b325d0) at ../../../src/util/virobject.c:504

    ubuntu@lp2059272:~$ ls -l /run/libvirt/qemu/test-vm.xml
    -rw------- 1 root root 10189 Apr 12 19:31 /run/libvirt/qemu/test-vm.xml

Let the main thread run again, and see that it is blocked waiting on the lock, to set 'quit'

    (gdb) t 1
    (gdb) c &

    (gdb) i th 1
      Id   Target Id                                   Frame
    * 1    Thread 0x7f57fde12b40 (LWP 97120) "libvirtd" (running)

    (gdb) interrupt

    (gdb) bt
    #0  __lll_lock_wait (futex=futex@entry=0x562624b325e0, private=0) at lowlevellock.c:52
    #1  0x00007fae5e3840a3 in __GI___pthread_mutex_lock (mutex=0x562624b325e0) at ../nptl/pthread_mutex_lock.c:80
    #2  0x00007fae5e7a22ed in virNetDaemonQuit (dmn=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:854
    #3  0x00007fae5e7a133e in virNetDaemonSignalEvent (watch=watch@entry=2, fd=, events=events@entry=1, opaque=opaque@entry=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:645
    #4  0x00007fae5e662a0d in virEventPollDispatchHandles (fds=0x562624b641c0, nfds=) at ../../../src/util/vireventpoll.c:503
    #5  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
    #6  0x00007fae5e661095 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:353
    #7  0x00007fae5e7a2495 in virNetDaemonRun (dmn=0x562624b325d0) at ../../../src/rpc/virnetdaemon.c:836
    #8  0x000056262420b7d1 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1430

    (gdb) c &

Let the qemuProcessReconnect thread finish; the main thread then unblocks and finishes too:

    (gdb) del br
    (gdb) t 20
    (gdb) c
    ...
    [Inferior 1 (process 5335) exited normally]

    (gdb) q

The XML file still has the '


Scenario 3) Shutdown happens along QEMU monitor calls (ie, calls don't finish)

    sudo gdb \
      -iex 'set confirm off' \
      -iex 'set pagination off' \
      -ex 'set non-stop on' \
      -ex 'handle SIGTERM nostop noprint pass' \
      -ex 'add-symbol-file /usr/sbin/libvirtd' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
      -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
      /usr/sbin/libvirtd

    (gdb) b qemuProcessReconnect
    (gdb) run

    Thread 20 "libvirtd" hit Breakpoint 1, qemuProcessReconnect (opaque=0x7f23b0055d30) at ../../../src/qemu/qemu_process.c:7922

Run this thread until a QEMU monitor send call:

    (gdb) t 20
    (gdb) b qemuMonitorSend thread 20
    (gdb) c

    Thread 20 "libvirtd" hit Breakpoint 2, qemuMonitorSend (mon=0x7f23980023c0, msg=0x7f238e35f7b0) at ../../../src/qemu/qemu_monitor.c:979

Stop the main thread on the QEMU driver cleanup, after the event loop is gone:

    (gdb) b qemuStateCleanup

    ubuntu@lp2059272:~$ sudo kill $(pidof libvirtd)

    Thread 1 "libvirtd" hit Breakpoint 3, qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1127

    (gdb) t 1
    (gdb) bt
    #0  qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1127
    #1  0x00007f23c4a8c47f in virStateCleanup () at ../../../src/libvirt.c:669
    #2  0x000055ccdfadebc8 in main (argc=, argv=) at ../../../src/remote/remote_daemon.c:1447

Let it finish

    (gdb) finish
    Run till exit from #0  qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1127
    0x00007f23c4a8c47f in virStateCleanup () at ../../../src/libvirt.c:669

Let the qemuProcessReconnect thread continue, and see that it is blocked waiting on reply/recv from the event loop

    (gdb) t 20
    (gdb) c &

    (gdb) i th 20
      Id   Target Id                                   Frame
    * 20   Thread 0x7f9a157fa700 (LWP 97193) "libvirtd" (running)

    (gdb) interrupt

    Thread 20 "libvirtd" stopped.

    (gdb) bt
    #0  futex_wait_cancelable (private=, expected=0, futex_word=0x7f2398002420) at ../sysdeps/nptl/futex-internal.h:183
    #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f23980023d0, cond=0x7f23980023f8) at pthread_cond_wait.c:508
    #2  __pthread_cond_wait (cond=0x7f23980023f8, mutex=0x7f23980023d0) at pthread_cond_wait.c:647
    #3  0x00007f23c48ee79b in virCondWait (c=, m=) at ../../../src/util/virthread.c:144
    #4  0x00007f239e994684 in qemuMonitorSend (mon=0x7f23980023c0, msg=) at ../../../src/qemu/qemu_monitor.c:998
    #5  0x00007f239e9a3dc8 in qemuMonitorJSONCommandWithFd (mon=0x7f23980023c0, cmd=0x7f23980027b0, scm_fd=-1, reply=0x7f238e35f840) at ../../../src/qemu/qemu_monitor_json.c:328
    #6  0x00007f239e9a5eb5 in qemuMonitorJSONCommand (reply=0x7f238e35f840, cmd=0x7f23980027b0, mon=) at ../../../src/qemu/qemu_monitor_json.c:1602
    #7  qemuMonitorJSONSetCapabilities (mon=) at ../../../src/qemu/qemu_monitor_json.c:1602
    #8  0x00007f239e973b4c in qemuProcessInitMonitor (asyncJob=QEMU_ASYNC_JOB_NONE, vm=0x7f23b004f9b0, driver=0x7f23b000f1e0) at ../../../src/qemu/qemu_process.c:1932
    #9  qemuConnectMonitor (driver=driver@entry=0x7f23b000f1e0, vm=0x7f23b004f9b0, asyncJob=asyncJob@entry=0, retry=retry@entry=false, logCtxt=logCtxt@entry=0x0) at ../../../src/qemu/qemu_process.c:1992
    #10 0x00007f239e97fbca in qemuProcessReconnect (opaque=) at ../../../src/qemu/qemu_process.c:7978
    #11 0x00007f23c48ee54a in virThreadHelper (data=) at ../../../src/util/virthread.c:196
    #12 0x00007f23c45af609 in start_thread (arg=) at pthread_create.c:477
    #13 0x00007f23c44d4353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Let it run again; it does not unblock, even if the main thread finishes:

    (gdb) c &

    (gdb) i th 1 20
      Id   Target Id                                   Frame
      1    Thread 0x7f23c0a95b40 (LWP 5512) "libvirtd" 0x00007f23c4a8c47f in virStateCleanup () at ../../../src/libvirt.c:669
    * 20   Thread 0x7f238e360700 (LWP 5590) "libvirtd" (running)

    (gdb) t 1
    (gdb) c
    Continuing.
    [Thread 0x7f238e360700 (LWP 5590) exited]
    Thread-specific breakpoint 2 deleted - thread 20 no longer in the thread list.
    ...
    [Inferior 1 (process 5512) exited normally]

    (gdb) q

The XML was not updated, as expected:

    $ ls -l /run/libvirt/qemu/test-vm.xml
    -rw------- 1 root root 10189 Apr 12 19:31 /run/libvirt/qemu/test-vm.xml

    $ sudo grep -e '

Now, the next time libvirtd starts, it correctly parses that XML:

    $ sudo systemctl start libvirtd.service

    $ journalctl -b -u libvirtd.service | grep -A1 error
    $

And libvirt is aware of the domain, and can manage it:

    $ virsh list
     Id   Name      State
    -------------------------
     1    test-vm   running

    $ virsh destroy test-vm
    Domain test-vm destroyed

    $ virsh undefine test-vm
    Domain test-vm has been undefined
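
Optional helper for the XML timestamp checks

The scenarios above decide whether the status XML was rewritten by comparing the 'ls -l' timestamp of /run/libvirt/qemu/test-vm.xml by eye, before and after a step. A minimal sketch of a helper that does the same comparison follows. It assumes GNU coreutils 'stat' (as shipped on Focal). The function names 'mtime' and 'changed_since' are made up for illustration and are not part of libvirt or of the steps above.

```shell
# Hypothetical helpers (not part of libvirt or the original test steps).

# Print a file's modification time in seconds since the epoch.
# Assumes GNU coreutils 'stat' (the '-c %Y' format).
mtime() {
    stat -c %Y "$1"
}

# Succeed (exit status 0) if the file's mtime differs from a
# previously recorded value, i.e. the file was rewritten in between.
changed_since() {
    [ "$(mtime "$1")" != "$2" ]
}
```

Possible usage around the step that lets the qemuProcessReconnect thread continue:

    $ before=$(mtime /run/libvirt/qemu/test-vm.xml)
    (gdb) c &
    $ changed_since /run/libvirt/qemu/test-vm.xml "$before" && echo updated || echo unchanged

Going by the timestamps shown above, this should print 'unchanged' in Scenarios 1 and 3 and 'updated' in Scenario 2. The comparison has one-second resolution, which is enough here because the steps are driven by hand.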