qemu fails to init vhost_user if > 8 memory regions

Bug #1886704 reported by Dan Streetman
This bug affects 1 person
Affects        Status        Importance  Assigned to  Milestone
qemu (Ubuntu)  Fix Released  Undecided   Unassigned
Bionic         Won't Fix     Undecided   Unassigned
Focal          Won't Fix     Undecided   Unassigned
Groovy         Won't Fix     Undecided   Unassigned
Hirsute        Fix Released  Undecided   Unassigned

Bug Description

[impact]

when using qemu with interfaces that use the vhost_user backend (i.e. dpdk), there is a hardcoded limit of 8 memory regions that the vhost_user backend driver can support. This hardcoded limit is still present upstream, and appears to be a limitation of the vhost-user specification's "set mem table" protocol.

There have been some optimizations and bugfixes in the code over time, and in particular there is a very recent upstream commit (27598393a23215cfbf92ad550b9541675b0b8f2b) that allows increasing this max; however, the vhost-user backend (that qemu talks to) must support and negotiate the new method.
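As a rough illustration of why the limit is fixed, here is a minimal sketch of the legacy "set mem table" payload. It is not qemu code: the function name `pack_mem_table` and the constant name are hypothetical, and the layout (a 32-bit region count, 32 bits of padding, then four 64-bit fields per region, in native byte order) is my reading of the "memory regions description" in the vhost-user document.

```python
import struct

# Assumption: the legacy payload has room for at most 8 regions,
# which is the hardcoded limit described in this report.
MAX_LEGACY_REGIONS = 8


def pack_mem_table(regions):
    """Pack (guest_addr, size, user_addr, mmap_offset) tuples into one payload.

    Raises ValueError when more regions are given than the payload can hold.
    """
    if len(regions) > MAX_LEGACY_REGIONS:
        raise ValueError(
            "%d regions exceeds the limit of %d" % (len(regions), MAX_LEGACY_REGIONS)
        )
    # Header: 32-bit region count followed by 32 bits of padding.
    payload = struct.pack("=II", len(regions), 0)
    for guest_addr, size, user_addr, mmap_offset in regions:
        # Each region is four 64-bit fields.
        payload += struct.pack("=QQQQ", guest_addr, size, user_addr, mmap_offset)
    return payload
```

With this sketch a ninth region cannot be described in the message at all, which is why raising the limit needed a new protocol feature rather than a larger constant.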

[test case]

start a qemu guest with at least one vhost-user interface, and more than 8 discontiguous memory regions (which may result from multiple passthrough pci devices or other causes). The vhost-user device will fail to init due to exceeding its max memory region limit.

[regression potential]

tbd

[scope]

this limit still exists upstream, and appears to be a vhost-user specification protocol limit, so the only resolution may be to switch to the new protocol feature added in the upstream commit referenced in the impact section. thus, any change would likely be needed in all releases, but as the new protocol feature is not a bug fix, but a new feature, it's unlikely it would qualify for backporting to any/all SRU releases.

however, see other info section.

[other info]

there is at least one, and possibly several, bugfixes through the versions in x/b/f that improve how vhost ignores some memory regions and merges contiguous regions, so it's possible that some guests that fail on older releases could be brought back under the limit of 8.

tbd identifying any/all such patches that might help.

note some upstream patches that help with this bug, but do not actually increase the limit of 8, are in bug 1887525
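to illustrate the kind of region merging mentioned above, here is a minimal sketch. It is not the qemu implementation: `merge_contiguous` is a hypothetical helper, regions are simplified to (guest_addr, size, user_addr) tuples, and real code would also need to check that adjacent regions share the same backing file.

```python
def merge_contiguous(regions):
    """Merge regions that are adjacent in both guest and userspace addresses.

    Each region is a (guest_addr, size, user_addr) tuple. Returns a new list
    sorted by guest address with adjacent regions collapsed into one.
    """
    merged = []
    for guest_addr, size, user_addr in sorted(regions):
        if merged:
            last_guest, last_size, last_user = merged[-1]
            guest_adjacent = last_guest + last_size == guest_addr
            user_adjacent = last_user + last_size == user_addr
            if guest_adjacent and user_adjacent:
                # Extend the previous region instead of using another slot.
                merged[-1] = (last_guest, last_size + size, last_user)
                continue
        merged.append((guest_addr, size, user_addr))
    return merged
```

a guest whose memory map has more than 8 entries may still fit in the table if enough of those entries collapse this way, which is why such patches can help without raising the limit.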

for reference, the vhost-user api function that imposes this limit:
https://www.qemu.org/docs/master/interop/vhost-user.html#memory-regions-description

Jasvinder Singh Kwatra (jasvinder1107) wrote :
Dan Streetman (ddstreet)
description: updated
description: updated
Dan Streetman (ddstreet) wrote :

Sorry for the delay in following up on this; as mentioned in the description, since this is a limitation of the underlying vhost-user specification, there are only 2 possible ways to address it: 1) change qemu to use the newer vhost-user function which doesn't have the 8-region limit, and 2) fix/patch qemu as much as possible to ignore/elide memory regions that the vhost-user implementation won't need to access, to reduce the total number.

The #1 change to use the new api function is already done by upstream qemu starting in v5.1.0, which is included in hirsute (see comment 1 for upstream commit references).

The #2 change has been done upstream in various patches, and some of those patches were backported in bug 1887525. There may be other upstream patches for this that could further help, but that should be investigated and handled in a separate new bug, IFF someone encounters the 8-region limit again in the future; the usage of vhost-user is rather narrow (just dpdk I believe) and generally the instances do not have a huge number of discontiguous memory regions.

So for this bug, I believe the question is if the new vhost-user function should be proactively backported to sru releases, and I think the answer to that should be no. For the sru to be useful, the dpdk package would also need to be updated to support the new vhost-user api function, and the backporting of this has the potential to introduce regressions.
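To make the dependency on the backend concrete, here is a minimal sketch of the negotiation. It is not qemu or dpdk code: `effective_region_limit` is a hypothetical helper, and the bit number 15 for the "configure mem slots" protocol feature is my reading of the upstream vhost-user document and should be treated as an assumption.

```python
# Assumption: protocol feature bit for configurable memory slots.
CONFIGURE_MEM_SLOTS_BIT = 15
# The hardcoded limit of the legacy "set mem table" method.
LEGACY_MAX_REGIONS = 8


def effective_region_limit(qemu_features, backend_features, backend_max_slots):
    """Return how many memory regions can be used after negotiation.

    The new method applies only if both sides advertise the feature bit;
    otherwise the legacy limit of 8 still applies.
    """
    negotiated = qemu_features & backend_features
    if negotiated & (1 << CONFIGURE_MEM_SLOTS_BIT):
        return backend_max_slots
    return LEGACY_MAX_REGIONS
```

In this sketch an updated qemu talking to an older dpdk still ends up with 8 regions, which is the reason a qemu-only backport would not help.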

So I believe we should reject this bug as wontfix unless there is a clear and immediate need for this backporting work, along with an explanation of why the dpdk instance(s) can't simply be adjusted to reduce their number of discontiguous memory regions, and only after investigation into other patches that might work around the limit (as mentioned in #2).

Note that since the commits are included in hirsute, I'll mark this fixreleased in h, and wontfix for sru releases.

Changed in qemu (Ubuntu Hirsute):
status: New → Fix Released
Changed in qemu (Ubuntu Groovy):
status: New → Won't Fix
Changed in qemu (Ubuntu Bionic):
status: New → Won't Fix
Changed in qemu (Ubuntu Focal):
status: New → Won't Fix