Activity log for bug #1943481

Date Who What changed Old value New value Message
2021-09-13 17:52:13 Paul Saab bug added bug
2021-09-13 19:05:17 Paul Saab bug added subscriber Frode Nordahl
2021-09-13 20:55:39 Paul Saab removed subscriber Frode Nordahl
2021-09-13 21:14:52 Paul Saab attachment added lp-1892132-Add-phys_port_name-support-on-virPCIGetNetName.patch https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1943481/+attachment/5525046/+files/lp-1892132-Add-phys_port_name-support-on-virPCIGetNetName.patch
2021-09-13 23:09:02 Dan Streetman tags regression-update
2021-09-13 23:09:11 Dan Streetman bug added subscriber Frode Nordahl
2021-09-13 23:09:21 Dan Streetman bug added subscriber Christian Ehrhardt 
2021-09-13 23:09:25 Dan Streetman libvirt (Ubuntu): importance Undecided Critical
2021-09-13 23:13:49 Dan Streetman bug added subscriber Dan Streetman
2021-09-13 23:25:38 Dominique Poulain bug added subscriber Dominique Poulain
2021-09-13 23:39:31 Matthew Ruffell nominated for series Ubuntu Focal
2021-09-13 23:39:31 Matthew Ruffell bug task added libvirt (Ubuntu Focal)
2021-09-13 23:39:40 Matthew Ruffell libvirt (Ubuntu Focal): importance Undecided Critical
2021-09-13 23:39:43 Matthew Ruffell libvirt (Ubuntu Focal): status New In Progress
2021-09-14 00:22:30 Matthew Ruffell libvirt (Ubuntu Focal): assignee Matthew Ruffell (mruffell)
2021-09-14 00:25:28 Ubuntu Foundations Team Bug Bot tags regression-update patch regression-update
2021-09-14 00:25:38 Ubuntu Foundations Team Bug Bot bug added subscriber Ubuntu Review Team
2021-09-14 00:32:43 Brett Milford bug added subscriber Brett Milford
2021-09-14 02:22:52 Matthew Ruffell attachment added debdiff for libvirt on Focal https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1943481/+attachment/5525063/+files/lp1943481_focal.debdiff
2021-09-14 04:00:22 Matthew Ruffell bug added subscriber Matthew Ruffell
2021-09-14 06:00:00 Christian Ehrhardt  libvirt (Ubuntu): status New Invalid
2021-09-14 06:00:04 Christian Ehrhardt  libvirt (Ubuntu): importance Critical Undecided
2021-09-14 06:06:11 Matthew Ruffell description

Old value:

The latest libvirtd (6.0.0-0ubuntu8.13) crashes when trying to bring up network pools with the stacktrace below. I tracked down the problem to the newly added patch (lp-1892132-Add-phys_port_name-support-on-virPCIGetNetName.patch).

Assigning

    *netname = firstEntryName;

ends up in memory corruption. Looking at the mainline, I changed it to the following:

    *netname = g_steal_pointer(&firstEntryName);

or you can just do

    firstEntryName = NULL;

Both will solve the problem.

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f40e5d1c859 in __GI_abort () at abort.c:79
#2 0x00007f40e5d873ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f40e5eb1285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f40e5d8f47c in malloc_printerr (str=str@entry=0x7f40e5eb35d0 "free(): double free detected in tcache 2") at malloc.c:5347
#4 0x00007f40e5d910ed in _int_free (av=0x7f40c8000020, p=0x7f40c80079e0, have_lock=0) at malloc.c:4201
#5 0x00007f40e61a9a4f in virFree (ptrptr=0x7f40c8003b60) at ../../../src/util/viralloc.c:348
#6 0x00007f40dd0cf8b1 in networkCreateInterfacePool (netdef=0x7f40840187f0) at ../../../src/network/bridge_driver.c:2849
#7 0x00007f40dd0d799c in networkStartNetworkExternal (obj=0x7f408400f720) at ../../../src/network/bridge_driver.c:2938
#8 networkStartNetwork (driver=driver@entry=0x7f408400a7a0, obj=0x7f408400f720) at ../../../src/network/bridge_driver.c:2938
#9 0x00007f40dd0d854d in networkCreate (net=0x7f40c8000c60) at ../../../src/network/bridge_driver.c:4013
#10 0x00007f40e63fac3f in virNetworkCreate (network=network@entry=0x7f40c8000c60) at ../../../src/libvirt-network.c:585
#11 0x0000560240e255d1 in remoteDispatchNetworkCreate (server=0x560240ea4280, msg=0x560240ee8200, args=0x7f40c8000c40, rerr=0x7f40e00ec9a0, client=<optimized out>) at ./remote/remote_daemon_dispatch_stubs.h:13570
#12 remoteDispatchNetworkCreateHelper (server=0x560240ea4280, client=<optimized out>, msg=0x560240ee8200, rerr=0x7f40e00ec9a0, args=0x7f40c8000c40, ret=0x0) at ./remote/remote_daemon_dispatch_stubs.h:13549
#13 0x00007f40e630c970 in virNetServerProgramDispatchCall (msg=0x560240ee8200, client=0x560240eea270, server=0x560240ea4280, prog=0x560240ee1520) at ../../../src/rpc/virnetserverprogram.c:430
#14 virNetServerProgramDispatch (prog=0x560240ee1520, server=server@entry=0x560240ea4280, client=0x560240eea270, msg=0x560240ee8200) at ../../../src/rpc/virnetserverprogram.c:302
#15 0x00007f40e6311c2c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x560240ea4280) at ../../../src/rpc/virnetserver.c:136
#16 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x560240ea4280) at ../../../src/rpc/virnetserver.c:153
#17 0x00007f40e62301af in virThreadPoolWorker (opaque=opaque@entry=0x560240e885f0) at ../../../src/util/virthreadpool.c:163
#18 0x00007f40e622f51c in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
#19 0x00007f40e5ef2609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x00007f40e5e19293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

New value:

[Impact]

A regression was introduced in libvirt 6.0.0-0ubuntu8.13 for Focal that affects users who use SR-IOV to pass through VF devices to KVM guests.

The problem was introduced in the recent lp-1892132-Add-phys_port_name-support-on-virPCIGetNetName.patch patch, which changes how virPCIGetNetName() fetches the name of the underlying VF device, so it can be used to send netlink commands.

There is a fallback case where we record the name of the device at the beginning, and if we fail all other lookups, we simply return the beginning name.

In libvirt 6.0.0-0ubuntu8.13, a line to drop the reference to firstEntryName was dropped incorrectly:

-    if (firstEntryName) {
-        *netname = firstEntryName;
-        firstEntryName = NULL;
-        ret = 0;
+    if (firstEntryName) {
+        *netname = firstEntryName;
+        ret = 0;

This results in a double free, as both netname and firstEntryName are freed, and results in the gdb trace:

#1 0x00007f40e5d1c859 in __GI_abort () at abort.c:79
#2 0x00007f40e5d873ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f40e5eb1285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f40e5d8f47c in malloc_printerr (str=str@entry=0x7f40e5eb35d0 "free(): double free detected in tcache 2") at malloc.c:5347
#4 0x00007f40e5d910ed in _int_free (av=0x7f40c8000020, p=0x7f40c80079e0, have_lock=0) at malloc.c:4201
#5 0x00007f40e61a9a4f in virFree (ptrptr=0x7f40c8003b60) at ../../../src/util/viralloc.c:348
#6 0x00007f40dd0cf8b1 in networkCreateInterfacePool (netdef=0x7f40840187f0) at ../../../src/network/bridge_driver.c:2849
#7 0x00007f40dd0d799c in networkStartNetworkExternal (obj=0x7f408400f720) at ../../../src/network/bridge_driver.c:2938
#8 networkStartNetwork (driver=driver@entry=0x7f408400a7a0, obj=0x7f408400f720) at ../../../src/network/bridge_driver.c:2938
#9 0x00007f40dd0d854d in networkCreate (net=0x7f40c8000c60) at ../../../src/network/bridge_driver.c:4013
#10 0x00007f40e63fac3f in virNetworkCreate (network=network@entry=0x7f40c8000c60) at ../../../src/libvirt-network.c:585
#11 0x0000560240e255d1 in remoteDispatchNetworkCreate (server=0x560240ea4280, msg=0x560240ee8200, args=0x7f40c8000c40, rerr=0x7f40e00ec9a0, client=<optimized out>) at ./remote/remote_daemon_dispatch_stubs.h:13570
#12 remoteDispatchNetworkCreateHelper (server=0x560240ea4280, client=<optimized out>, msg=0x560240ee8200, rerr=0x7f40e00ec9a0, args=0x7f40c8000c40, ret=0x0) at ./remote/remote_daemon_dispatch_stubs.h:13549
#13 0x00007f40e630c970 in virNetServerProgramDispatchCall (msg=0x560240ee8200, client=0x560240eea270, server=0x560240ea4280, prog=0x560240ee1520) at ../../../src/rpc/virnetserverprogram.c:430
#14 virNetServerProgramDispatch (prog=0x560240ee1520, server=server@entry=0x560240ea4280, client=0x560240eea270, msg=0x560240ee8200) at ../../../src/rpc/virnetserverprogram.c:302
#15 0x00007f40e6311c2c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x560240ea4280) at ../../../src/rpc/virnetserver.c:136
#16 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x560240ea4280) at ../../../src/rpc/virnetserver.c:153
#17 0x00007f40e62301af in virThreadPoolWorker (opaque=opaque@entry=0x560240e885f0) at ../../../src/util/virthreadpool.c:163
#18 0x00007f40e622f51c in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
#19 0x00007f40e5ef2609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x00007f40e5e19293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The fix is to either make sure that firstEntryName = NULL; like before, or we replace with the upstream call to g_steal_pointer(&firstEntryName); which does the same.

static inline gpointer
g_steal_pointer (gpointer pp)
{
  gpointer *ptr = (gpointer *) pp;
  gpointer ref;

  ref = *ptr;
  *ptr = NULL;

  return ref;
}

[Testcase]

Deploy a machine with a NIC that supports SR-IOV. Note, only particular NICs will reach the end of virPCIGetNetName().

Install KVM stack:

$ sudo apt-get install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils

Edit /etc/default/grub and add "intel_iommu=on" to the kernel command line.

$ sudo update-grub
$ sudo reboot

Create the VFs via the sysfs node:

$ sudo -s
# cat /sys/class/net/eno49/device/sriov_totalvfs
63
# echo '7' > /sys/class/net/eno49/device/sriov_numvfs

Next we need to define a virsh network. Save the following in /tmp/passthrough.xml, changing "eno49" to your network interface.

<network>
  <name>passthrough</name>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eno49'/>
  </forward>
</network>

$ virsh net-define /tmp/passthrough.xml
$ virsh net-autostart passthrough
$ virsh net-start passthrough

We need to make an apparmor rule to enable vfio of our VF device.

Edit /etc/apparmor.d/local/abstractions/libvirt-qemu

Add the line:

/dev/vfio/* rw,

Then restart apparmor:

$ sudo systemctl restart apparmor.service

Next make a Focal VM:

$ sudo apt install uvtool-libvirt
$ ssh-keygen
$ uvt-simplestreams-libvirt sync release=focal arch=amd64
$ uvt-kvm create --cpu 4 --memory 4096 --disk 8 [ --password insecure ] focal-vm release=focal arch=amd64
$ uvt-kvm wait focal-vm
$ uvt-kvm ssh focal-vm # for ssh, key-based authentication.
$ virsh console focal-vm # for serial console, user ubuntu, password above.

Next, edit the virsh xml:

$ virsh shutdown focal-vm
$ virsh edit focal-vm

Add:

<interface type='network'>
  <source network='passthrough'/>
</interface>

Save and reboot the VM.

$ virsh start focal-vm

[Where problems could occur]

If a regression were to occur, it would affect users who use SR-IOV to pass through VF devices into KVM guests, who make up a large number of our enterprise users.

The fix is a single-line change that simply restores a line that existed before but was mistakenly removed. The changes should be safe.
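The ownership problem described in the entry above can be sketched in isolation. The following is a minimal, hypothetical C model of the fallback path, not the libvirt source: fallback_net_name, dup_string and steal_string are illustrative names that do not appear in the bug, and the explicit free() at the end of the function is an assumed stand-in for whatever cleanup the real virPCIGetNetName() performs on firstEntryName. The log itself only shows that both pointers end up being freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocate a copy of s (NULL in, NULL out). */
static char *dup_string(const char *s)
{
    if (s == NULL)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical char*-only analogue of g_steal_pointer(): hand back the
 * pointer and clear the source so the old owner can no longer free it. */
static char *steal_string(char **pp)
{
    char *ref = *pp;
    *pp = NULL;
    return ref;
}

/* Simplified model of the fallback branch. On success *netname owns the
 * string and the caller must free() it. Returns 0 on success, -1 otherwise. */
static int fallback_net_name(const char *first_entry, char **netname)
{
    int ret = -1;
    char *firstEntryName = dup_string(first_entry);

    assert(netname != NULL);

    if (firstEntryName) {
        /* Ownership moves to the caller and firstEntryName becomes NULL.
         * A plain "*netname = firstEntryName;" would leave two owners. */
        *netname = steal_string(&firstEntryName);
        ret = 0;
    }

    /* Assumed cleanup step: harmless here because free(NULL) is a no-op.
     * Without the steal, this free plus the caller's free of *netname
     * would release the same allocation twice. */
    free(firstEntryName);
    return ret;
}
```

A caller would pass the address of a NULL char pointer, check the return value, and free the result exactly once. Under these assumptions, removing the steal (or the equivalent firstEntryName = NULL; assignment) is what turns the caller's free into the second free reported by glibc.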
2021-09-14 06:13:42 Chris Halse Rogers libvirt (Ubuntu Focal): status In Progress Fix Committed
2021-09-14 06:13:52 Chris Halse Rogers bug added subscriber Ubuntu Stable Release Updates Team
2021-09-14 06:14:05 Chris Halse Rogers bug added subscriber SRU Verification
2021-09-14 06:14:48 Chris Halse Rogers tags patch regression-update patch regression-update verification-needed verification-needed-focal
2021-09-14 08:17:42 John Lewis bug added subscriber John Lewis
2021-09-14 15:35:31 Christian Ehrhardt  tags patch regression-update verification-needed verification-needed-focal patch regression-update verification-done verification-done-focal
2021-09-15 14:20:21 Brian Murray removed subscriber Ubuntu Stable Release Updates Team
2021-09-15 14:41:38 Launchpad Janitor libvirt (Ubuntu Focal): status Fix Committed Fix Released