Comment 13 for bug 1838575

Christian Ehrhardt  (paelzer) wrote :

Summary:
As I mentioned before (on the other bug that I referred to), the problem is that with a PT (passthrough) device it needs to reset and map the VFIO devices.
So with one or more PT devices attached, it needs an init step that scales with the memory size of the guest (see my results, which were fast with PT but small guest memory).

As I outlined before, the "fix" for that is either:
- (admin) use huge pages (1G if possible) for very large guests, which will make this way more efficient.
- (kernel development) get the kernel to use more cores for this initialization (currently 1)
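
To illustrate why the page size matters so much here, the following is a rough sketch that only counts how many pages back a guest of a given size. It assumes the init cost grows roughly with the number of pages that have to be handled, which is a simplification; the 1 TiB guest size and all function names are made up for the example.

```python
# Rough arithmetic only: number of pages backing a guest at different page sizes.
# Assumption: init time grows roughly with the number of pages to handle.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_SIZES = {"4K": 4 * KIB, "2M": 2 * MIB, "1G": GIB}


def pages_needed(guest_mem_bytes, page_size_bytes):
    """Pages required to back the guest memory (rounded up)."""
    return -(-guest_mem_bytes // page_size_bytes)


def page_counts(guest_mem_bytes):
    """Map each page size name to the page count for this guest size."""
    return {
        name: pages_needed(guest_mem_bytes, size)
        for name, size in PAGE_SIZES.items()
    }


if __name__ == "__main__":
    guest_mem = 1024 * GIB  # hypothetical 1 TiB guest
    for name, count in page_counts(guest_mem).items():
        print(f"{name:>2} pages: {count}")
```

With 1G pages the count drops by a factor of 262144 compared to 4K pages, which is why huge pages help the single-threaded init so much for very large guests.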

Last time we discussed this, Nvidia said they consider 1G huge pages the recommended setup for huge guests (due to the benefits), but not for guests in general (due to the complexity of managing the memory).
I don't know where it ended, but I expect it to be in some guidelines.
That is why I wondered about the lack of huge pages in the spec to test qemu passthrough perf, if you remember.

I don't think there is much more qemu can do at this point (at least I don't know of anything), but we can use the re-discovery of this issue as a chance: if you want, you (or one of your teammates) could dive into the kernel side and check whether this initialization could be made multi-CPU for better scaling.
I'm adding a kernel task for your consideration.