Comment 11 for bug 1643911

Kashyap Chamarthy (kashyapc) wrote :

We seem to be stuck in limbo here, unable to diagnose this and get to the root cause, so I asked the upstream libvirt maintainers on IRC. Dan Berrange responded [text lightly formatted for readability]:

"Running libvirtd under Valgrind will likely point to a root cause. However, it's impossible to run libvirtd under Valgrind in the OpenStack CI system, unless you're happy with run times that are many hours longer and massively higher RAM usage.

"The only way to debug it is to deploy custom libvirtd builds, i.e. builds carrying whatever extra debugging info is needed in the area of code that is suspected to be broken. There's no right answer here - you just have to experiment repeatedly until you find what you need. Deploy the custom build either by providing new packages in the repos, or by using a hack [via the 'rootwrap' facility] to install a custom libvirtd in the Nova startup code.

"Also, a core dump in this scenario will not be helpful. With memory corruption, a core dump is rarely useful, because the actual problem you care about will have occurred some time before the crash happens. This is especially true for multithreaded applications like libvirtd, because the thread showing the abrt/segv is quite often not the thread which caused the corruption."
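
To illustrate that last point for anyone following along, here is a toy simulation in Python. It is only a sketch and has nothing to do with libvirtd's actual code: the 16-byte "heap", the buffer layout, and the writer/reader thread names are all made up for the example. One thread overruns its buffer and returns normally; a different thread trips over the damage later, so a snapshot taken at crash time points at the wrong thread.

```python
import threading

# Hypothetical "heap": two adjacent 8-byte buffers inside one bytearray.
heap = bytearray(16)
BUF_B = slice(8, 16)          # buffer owned by the reader thread
heap[BUF_B] = b"CANARY!!"     # known-good contents of the reader's buffer

events = []                   # ordered log of what each thread observed


def buggy_writer():
    # Bug: writes 12 bytes into its own 8-byte buffer (offsets 0..7),
    # silently spilling 4 bytes into the reader's buffer.
    data = b"x" * 12
    heap[0:len(data)] = data
    events.append(("writer", "returned normally"))


def innocent_reader():
    # This thread did nothing wrong, but it is the one that "crashes".
    try:
        if bytes(heap[BUF_B]) != b"CANARY!!":
            raise RuntimeError("corrupted buffer detected")
        events.append(("reader", "ok"))
    except RuntimeError as exc:
        events.append(("reader", f"crashed: {exc}"))


# Run the writer to completion first, then the reader, so the corruption
# happens strictly before the crash.
for target in (buggy_writer, innocent_reader):
    t = threading.Thread(target=target)
    t.start()
    t.join()

for who, what in events:
    print(f"{who}: {what}")
```

By the time the reader fails, the writer has already finished and is no longer on any stack, which is why a core dump taken at the abrt/segv tends to show the victim rather than the culprit.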