Comment 9 for bug 1844455

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

The function that allocates the structure _virPCIDeviceAddress is: virPCIGetDeviceAddressFromSysfsLink().

This function is called in the following path, specifically in virPCIGetPhysicalFunction(), which is itself called from nodeDeviceSysfsGetPCISRIOVCaps().

[0] [1] [2]
-----------------------------------
                 |
                 |
nodeDeviceSysfsGetPCIRelatedDevCaps()
  nodeDeviceSysfsGetPCISRIOVCaps()

Since the tree is a bit large, I've split in 3 parts: [0], [1] and [2].
In that terminology, the bottom function in each stack calls nodeDeviceSysfsGetPCIRelatedDevCaps().

[0] src/node_device/node_device_driver.c

virNodeDeviceGetXMLDesc() [libvirt-nodedev.c]
  virNodeDeviceDriver node_device_driver callback <.nodeDeviceGetXMLDesc(), in udev/hal/remote>
    nodeDeviceGetXMLDesc()
      update_caps()

[1] src/node_device/node_device_udev.c
   virStateDriver udevStateDriver [callback register, stateInitialize()]
                               |
            --------------------------------------
            | |
            | nodeStateInitialize()
            | |
   nodeStateInitialize() udevEnumerateDevices()
               | |
udevEventHandleCallback() || udevProcessDeviceListEntry
  udevAddOneDevice()
    udevGetDeviceDetails()
      udevProcessPCI()

[2] src/node_device/node_device_hal.c << deprecated >>
virStateDriver halStateDriver callbacks register [stateInitialize(), stateInitialize()]
                      | libhal_ctx_set_device_added() |-> multiple calls in this file
         ----------------------- | |
         | | | |
nodeStateInitialize() || nodeStateReload || device_added() || dev_refresh()
         |
      dev_create() libhal_ctx_set_device_new_capability() <HAL callback>
         | |
gather_capabilities() || device_cap_added()
  gather_capability()
    caps_tbl[VIR_NODE_DEV_CAP_PCI_DEV]() <gather_fn ptr>
      gather_pci_cap()

We can skip the path [2] given that HAL is deprecated and only udev is available on Ubuntu. Trying to follow the path [1] led us to exercise the allocation of the structure responsible for the leak in question, but at the same time, udevEventHandleCallback() is able to deallocate the variable.
The trigger is the following:

while true; do udevadm trigger; done

It goes to the aforementioned described udev path ([1]), and allocates one instance of the PCI _virPCIDeviceAddress structure per device; it is deallocated though according to the following backtrace collected with gdb:

#0 virNodeDevCapsDefFree (caps=0x558d824d6ba0) at ../../../src/conf/node_device_conf.c:1709
#1 0x00007f7f1244db6c in virNodeDeviceDefFree (def=0x558d824c29c0) at ../../../src/conf/node_device_conf.c:146
#2 0x00007f7f1244fe99 in virNodeDeviceAssignDef (devs=0x7f7ee00e68b8, def=0x558d824cde50)
    at ../../../src/conf/node_device_conf.c:182
#3 0x00007f7eebe1efae in udevAddOneDevice (device=device@entry=0x558d824d85e0)
    at ../../../src/node_device/node_device_udev.c:1402
#4 0x00007f7eebe202f8 in udevEventHandleCallback (watch=watch@entry=7, fd=<optimized out>, events=events@entry=1,
    data=data@entry=0x0) at ../../../src/node_device/node_device_udev.c:1546
#5 0x00007f7f12395cf0 in virEventPollDispatchHandles (fds=<optimized out>, nfds=<optimized out>)
    at ../../../src/util/vireventpoll.c:509
#6 virEventPollRunOnce () at ../../../src/util/vireventpoll.c:658
#7 0x00007f7f12394332 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:314
#8 0x00007f7f1250789d in virNetDaemonRun (dmn=0x558d8249f340) at ../../../src/rpc/virnetdaemon.c:701

Remains to us reproduce the issue with path [0]. To exercise that path, we can run the following command:

while true; do virsh nodedev-dumpxml pci_0000_08_12_0 >/dev/null; done

It worked, I was able to observe the following stack on Valgrind when exercising the leak path:

16,752 bytes in 1,047 blocks are definitely lost in loss record 580 of 586
at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x50A6283: virAlloc (viralloc.c:144)
by 0x50F1282: virPCIGetDeviceAddressFromSysfsLink (virpci.c:2453)
by 0x50F1417: virPCIGetPhysicalFunction (virpci.c:2486)
by 0x2BD50141: nodeDeviceSysfsGetPCISRIOVCaps (node_device_linux_sysfs.c:157)
by 0x2BD50141: nodeDeviceSysfsGetPCIRelatedDevCaps (node_device_linux_sysfs.c:225)
by 0x2BD4F1D0: update_caps (node_device_driver.c:66)
by 0x2BD4F1D0: nodeDeviceGetXMLDesc (node_device_driver.c:346)
by 0x51C838E: virNodeDeviceGetXMLDesc (libvirt-nodedev.c:292)
by 0x138D6E: remoteDispatchNodeDeviceGetXMLDesc (remote_dispatch.h:12746)
by 0x138D6E: remoteDispatchNodeDeviceGetXMLDescHelper (remote_dispatch.h:12720)
by 0x5213FE8: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
by 0x5213FE8: virNetServerProgramDispatch (virnetserverprogram.c:307)
by 0x520F4F7: virNetServerProcessMsg (virnetserver.c:135)
by 0x520F4F7: virNetServerHandleJob (virnetserver.c:156)
by 0x5106565: virThreadPoolWorker (virthreadpool.c:145)
by 0x5105AE7: virThreadHelper (virthread.c:206)

This is for my libvirt 1.3.1 (Xenial) testing, but I've noticied also in Bionic/Eoan testing. Also, 0000:08:12.0 is a Virtual Function, so the reproducer requires a SR-IOV capable PCI device.
Not only the virsh nodedev command is capable of exercising the leak path; Openstack/Nova composes XMLs for PCI functions and can run libvirt API and so cause the same effect.